date:20110714

[R] recursive function - finding connections

2011-07-14 Thread Benton, Paul

Dear all,

I'm having some problems getting my recursive function to work. At first I 
though that maybe my data was too big and I increase option(expressions=5). 
Then I thought that I would try it on some smaller data. Still not working. :( 
I would have thought there should be a function for this already, so any 
suggestions are welcomed for other methods. I did try igraph but couldn't get 
cliques() to give anything useful. Also a quick play with hclust and cut, again 
nothing too useful.

Basically the function is trying to find uniquely connected subgraphs. So the 
sub-network is only connected by itself and not to other nodes. If everything 
is connected then the list (connectedList) should be length of 1 and have every 
index in the 1st slot.

cheers,

Paul


findconnection-function(mat, cutoff){
toList-function(mat, connectList, cutoff, i, idx){
idx-which(mat[,idx]  cutoff)
if(length(idx) = 1){
connectList[[i]]-idx
for(z in 1:length(idx)){
connectList-toList(mat, connectList, cutoff, 
i, idx[z])
}
}else{
return(connectList)
}
}

connectList-list()
for(i in 1:ncol(mat)){
connectList-toList(mat, connectList, cutoff, i, i)
}
return(unique(connectList)) 
}

foomat-matrix(sample(c(1,0.5,0), 100, replace=T), nrow=10) ## example data
 foomat
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]  0.0  0.5  0.0  0.5  0.5  0.0  0.5  1.0  0.5   0.0
 [2,]  0.0  1.0  1.0  0.0  0.0  1.0  0.0  1.0  0.5   1.0
 [3,]  1.0  1.0  1.0  1.0  0.5  0.0  0.5  0.5  0.5   0.5
 [4,]  0.0  0.5  0.0  0.0  0.5  0.5  0.5  0.0  1.0   0.0
 [5,]  0.5  0.5  1.0  1.0  0.5  1.0  1.0  0.5  0.5   0.5
 [6,]  0.0  0.5  0.0  0.5  0.5  0.5  0.5  0.5  1.0   1.0
 [7,]  1.0  1.0  0.0  1.0  0.0  0.5  1.0  1.0  0.5   0.5
 [8,]  0.5  1.0  0.0  0.5  1.0  0.0  1.0  0.0  0.0   0.0
 [9,]  0.0  0.5  0.0  0.0  0.5  0.0  0.5  0.0  0.5   0.5
[10,]  1.0  1.0  0.5  1.0  0.0  1.0  0.0  0.0  0.0   0.5
 pb-findconnection(foomat, 0.01)
Error: C stack usage is too close to the limit
Error during wrapup: 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding vertical space before and after Sweave chunk

2011-07-14 Thread Mark Heckmann

Thanks Duncan,
the problem now is that, the space between R code and R output is also 
increased.
I would like to avoid this, i.e.

vertical space
R code
NO SPACE
R results
vertical space

TIA,
Mark

Am 14.07.2011 um 02:13 schrieb Duncan Murdoch:

 On 13/07/2011 7:14 PM, Mark Heckmann wrote:
 I would like to have a bigger default space in front of and behind every 
 Sweave chunk.
 I have seen that space between input and output is removed as follows:
 
 \fvset{listparameters={\setlength{\topsep}{0pt}}}
 \renewenvironment{Schunk}{\vspace{\topsep}}{\vspace{\topsep}}
 
 Still, I can't figure out how to add vertical space before and after the 
 Sweave chunk.
 Can someone help?
 
 
 0pt standard for zero points.  Use a bigger number if you want a bigger 
 space, e.g. 20pt.
 
 Duncan Murdoch


Mark Heckmann
Blog: www.markheckmann.de
R-Blog: http://ryouready.wordpress.com





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to translate string to variable inside a command in an easy way in R

2011-07-14 Thread Ivan Calandra


Hi!

What about as.forumla()?

Like this:
form - as.formula(paste(num, y, ~MemberID, sep=))
agg-aggregate(form, right.a, sum)

Would it work as you expect to?

HTH,
Ivan

Le 7/13/2011 19:30, Daniel Nordlund a écrit :

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of UriB
Sent: Wednesday, July 13, 2011 9:28 AM
To: r-help@r-project.org
Subject: Re: [R] How to translate string to variable inside a command in
an easy way in R

Thanks
Here is another question

I want to have a function that get a string for example
y=AMI and make commands like the following

agg-aggregate(numAMI ~MemberID,right.a,sum)

Note that I know that numAMI is part of right.a because another function
with the string AMI already generated it.

Is there a way to do it without paste parse and eval?

I can do it by the following commands for y=AMI

numy-paste(num,y,sep=)
texta-paste(agg-aggregate(,numy,sep=)
text2-~MemberID, right.a, sum)
text1-paste(texta,text2,sep=)
eval(parse(text=text1))

If you can have a shorter code for it then it can be productive.


Well, you could do the above in one line (sorry, my email client is breaking 
the line)


agg- eval(parse(text=paste(aggregate(num,y,~MemberID,right.a,sum),sep='')))

You can put that in a function and call it like


your_function- function(y) 
eval(parse(text=paste(aggregate(num,y,~MemberID,right.a,sum),sep='')))
agg- your_function(y='AMI')

However, you seem to be trying to wrestle R to the ground to get it to do 
things your way.  Maybe if you gave R-help some context about your overall task 
(including why you need to construct these commands), someone could provide 
suggestions on solving your programming tasks in a more R-ish way.  Just a 
thought.

Dan

Daniel Nordlund
Bothell, WA USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Dept. Mammalogy
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Re: Sum weights of independent variables across models (AIC)

2011-07-14 Thread Kamil Bartoń

?model.avg

Look at relative importance.

 Message: 102
 Date: Wed, 13 Jul 2011 18:01:14 -0500
 From: Michael Just mgj...@gmail.com
 To: r-help r-help@r-project.org
 Subject: [R] Sum weights of independent variables across models (AIC)
 Message-ID:
   CAHdFeLNoQBAHYL=CJ3dB=jbculdpycgk03dncypf2obvkca...@mail.gmail.com
 Content-Type: text/plain; charset=ISO-8859-1

 Hello,
 I'd like to sum the weights of each independent variable across linear
 models that have been evaluated using AIC.

 For example:

  library(MuMIn)
  data(Cement)
  lm1  -  lm(y  ~  .,  data  =  Cement)
  dd  -  dredge(lm1,  beta  = TRUE,  eval  =  TRUE,  rank  =  AICc)
  get.models(dd, subset = delta 4)
 There are 5 models with a Delta AIC Score of less than 4.  I would
 like to sum the weights for each of the independent variables across
 the five models.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R in Batch mode

2011-07-14 Thread Bos, Roger

Duncan's suggestion is probably the way to go, but I will just point out
that R does have a facility to perform a task when an error occurs.  I
have my code set up to send me an email when my batch code fails.
(email() is a function I wrote that executes sql command to send email
via dbmail.)

.Err - function() {email(roger@rothschild.com,!Notify: Job
FAILED on  %+% Sys.info()[4],,TEXT,HIGH)}
options (error = .Err)

Thanks,

Roger


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Duncan Murdoch
Sent: Wednesday, July 13, 2011 5:04 PM
To: cornejo...@gmail.com
Cc: r-help@r-project.org
Subject: Re: [R] R in Batch mode

On 11-07-13 4:15 PM, Jorge Cornejo wrote:
 Hi, I'm running long code in Batch mode (as a process in ubuntu using 
 $nohup R CMD BATCH MyCode.R out.out), but every one in a while it 
 fail and the process stops.

 When the code finish with no problems, it sends an e-mail, but if 
 fails, I'm not aware unless I check the working process, and I cannot 
 be checking the process all day!.

 Is there any way to send me a warning when the process stops/fails?

This is really an Ubuntu question, isn't it?

I'd suggest writing a script which calls your R batch file.  The last
thing the batch file should do is write some sort of status value.  The
last thing the script should do is check and act on the status.

Of course, if whatever kills your batch file kills the script too,
you're out of luck.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
***

This message is for the named person's use only. It may\...{{dropped:20}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Gui editor / viewer for large data

2011-07-14 Thread Andreas Borg


Dear all,

I am searching for a possibility to view large data sets (e.g. stored in 
ffdf objects) in a GUI window in a memory-efficient way. So far I looked 
at gtkDfEdit (package RGtk2Extras) and gdf (package gWidgets). Both 
operate (as far as I can see) on data frames stored in memory. gtkDfEdit 
accepts an ff data frame as input, but there is a long delay before it 
shows up, so I presume the data is converted to an in-memory data.frame 
beforehand.


Ideal would be a solution similar to JTable in Java, where one can 
implement a 'table model' class with a method (getValueAt or sth. like 
that) that returns the value of the underlying data for a given table 
cell (by x and y coordinates). When the table is drawn, getValueAt() is 
called for each cell in the viewable area. Cells outside the viewable 
range are not queried, so there is no need to keep the whole data in 
memory - it can be fetched from disk or computed on demand.


Before I seriously consider coding a suitable function or class myself, 
I would like to ask you, the R community, if anyone knows of an existing 
solution. To summarize, I need
   - a scrollable spreadsheet-like GUI element to view and potentially 
edit data
   - which only loads as much of the data as it displays at one point 
in time,
   - thereby able to handle datasets that are too large to fit into 
main memory


Thanks for any suggestions,

Andreas

--
Andreas Borg
Medizinische Informatik

UNIVERSITÄTSMEDIZIN
der Johannes Gutenberg-Universität
Institut für Medizinische Biometrie, Epidemiologie und Informatik
Obere Zahlbacher Straße 69, 55131 Mainz
www.imbei.uni-mainz.de

Telefon +49 (0) 6131 175062
E-Mail: b...@imbei.uni-mainz.de

Diese E-Mail enthält vertrauliche und/oder rechtlich geschützte Informationen. 
Wenn Sie nicht der
richtige Adressat sind oder diese E-Mail irrtümlich erhalten haben, informieren 
Sie bitte sofort den
Absender und löschen Sie diese Mail. Das unerlaubte Kopieren sowie die 
unbefugte Weitergabe
dieser Mail und der darin enthaltenen Informationen ist nicht gestattet.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding vertical space before and after Sweave chunk

2011-07-14 Thread Duncan Murdoch


On 11-07-14 3:23 AM, Mark Heckmann wrote:

Thanks Duncan,
the problem now is that, the space between R code and R output is also
increased.
I would like to avoid this, i.e.

vertical space
R code
NO SPACE
R results
vertical space


Don't modify \topsep then, just put the spacing directly into the Schunk 
redefinition.  For example,


\renewenvironment{Schunk}{\vspace{20pt}}{\vspace{30pt}}

The way environments work in LaTeX is that the first arg (\vspace{20pt}) 
is put at the start, and the second one (\vspace{30pt}) at the end. 
Nothing happens in the middle, but the definitions for \Sinput and 
\Soutput implicitly make use of the \topsep size, so the definition 
below affected everything.


Duncan Murdoch



TIA,
Mark

Am 14.07.2011 um 02:13 schrieb Duncan Murdoch:


On 13/07/2011 7:14 PM, Mark Heckmann wrote:

I would like to have a bigger default space in front of and behind
every Sweave chunk.
I have seen that space between input and output is removed as follows:

\fvset{listparameters={\setlength{\topsep}{0pt}}}
\renewenvironment{Schunk}{\vspace{\topsep}}{\vspace{\topsep}}

Still, I can't figure out how to add vertical space before and after
the Sweave chunk.
Can someone help?



0pt standard for zero points. Use a bigger number if you want a
bigger space, e.g. 20pt.

Duncan Murdoch


–––
Mark Heckmann
Blog: www. markheckmann .de http://www.markheckmann.de/
R-Blog: http:// ryouready.wordpress .com http://ryouready.wordpress.com/






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R-help Digest, Vol 101, Issue 14

2011-07-14 Thread mihalicza . peter

Július 7-től 14-ig irodán kívül vagyok, és az emailjeimet nem érem el.

Sürgős esetben kérem forduljon Kárpáti Edithez (karpati.e...@gyemszi.hu).

Üdvözlettel,
Mihalicza Péter


I will be out of the office from 7 July till 14 July with no access to my 
emails.

In urgent cases please contact Ms. Edit Kárpáti (karpati.e...@gyemszi.hu).

With regards,
Peter Mihalicza

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Scaling in SVM

2011-07-14 Thread Meffy

Thanks! For testing purposes this rescaling works! But unfortunately due to
timing constraints I'm not able to do the rescaling of the data, so as I
mentioned I have to work on with unscaled data. So I have to calculate
$f(\vec x) = sum_{i \in sv} coefs_i \langle \vec x_i \cdot \vec x \rangle -
\rho$, where the sum runs over the support vectors (sorry, no LaTex
available here ). The problem is how to get unscaled coefs and rho to
calculate the result of the formula in order to save calculation time for
the evaluation of the formula.
Greetings 
Matthias

--
View this message in context: 
http://r.789695.n4.nabble.com/Scaling-in-SVM-tp3664588p3666945.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] t-test on a data-frame.

2011-07-14 Thread Economics Student

Dear R-helpers,

In a data frame I have 100 securities,monthly closing value,from 1995 to
present,which I have to

1. Sampling with replacement,make 50 samples of 10 securities each,each
sample hence will be a data frame with 10 columns.
2. With uniform probabilty,mark a month from 2000 onwards as a special
month,t=0.
3. I have to subtract the market index from each column of each sample and
then compute the residues.
4. For each data frame of residues I have to compute the statistic (  Eps i0
- mean(Eps it )  ) / var( Eps it ). Here i and t vary over one particular
data frame. i0 corresponds to ith security residue on the special month.
Basically a t-test involving a frame instead of a vector.
5. Print out a table where the statistic is significant at the 1,5,10%
level.

Could someone give the broad ideas on doing this ?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Var.calc in Matching

2011-07-14 Thread lglew

Hi,

I already have matched samples, which were matched using different software.
I need to calculate Abadie-Imbens standard errors, together with the average
treatment effect. I know that the Matching package enables me to calculate
these after a Matching procedure, but is there any way to do it on already
matched samples, which were not produce by the Match function?

Many thanks,

Louise.

--
View this message in context: 
http://r.789695.n4.nabble.com/Var-calc-in-Matching-tp3666950p3666950.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Problem with x labels of barplot

2011-07-14 Thread Don

Hello everyone,
i am currently creating a barplot.
This barplot takes a vector of ~200 datapoints. 

Each datapoint represents one bar.
http://img96.imageshack.us/i/human1w.png/
(Ok as you see, it is not only one barplot, but a series of barplots).

Now, these barplots represent a human chromosome. This means they are
ordered. For instance bar number 50, means position 50 in the human
chromosome. I would like to have x-axis labels showing this. 
0...50..100...150200...250
Yet i do not know how to accomplish this.
If you use the normal plot function, these numbers on the xaxis are
autogenerated, in case of a barplot, they are not, and i do not know how to
create these labels.

I would be happy about a solution.

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-x-labels-of-barplot-tp3667337p3667337.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Add a density line to a cumulative histogram - second try

2011-07-14 Thread Jochen1980

Hi list,

this is my second try for first post on this list (I tried to post via email
and nothing appeared in my email-inbox, so now I try to use the
nabble-web-interface) - I hope that you will only have to read one post in
your inbox! Okay, my question ...

I was able to plot a histogram and add the density()-line to this plot.
I was able to plot a cumulative form of this histogram.
Yet, I was not able to add the density line to this cumulative histogram.

You can watch a picture of the histograms here:
http://www.jochen-bauer.net/downloads/histo-cumulative-density.png

Source:

# Histogramm
histo - hist( hgd$V1, freq=FALSE )

# Dichte-Schätzer-Funktion reinzeichnen
denspoints - density( hgd$V1 )
lines( denspoints, col=GREEN, lwd=2 )

# Kumulative Verteilung relativer Häufigkeiten
relcum - cumsum( histo$counts ) / sum(histo$counts)
barplot( relcum, names.arg=round( histo$mids, 2), col=green,
ylab=Wahrscheinlichkeit)

# Kumulative Dichtefunktion ?

Thanks in advance - Jochen

--
View this message in context: 
http://r.789695.n4.nabble.com/Add-a-density-line-to-a-cumulative-histogram-second-try-tp3666969p3666969.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem with x labels of barplot

2011-07-14 Thread Duncan Murdoch


On 11-07-14 7:51 AM, Don wrote:

Hello everyone,
i am currently creating a barplot.
This barplot takes a vector of ~200 datapoints.

Each datapoint represents one bar.
http://img96.imageshack.us/i/human1w.png/
(Ok as you see, it is not only one barplot, but a series of barplots).

Now, these barplots represent a human chromosome. This means they are
ordered. For instance bar number 50, means position 50 in the human
chromosome. I would like to have x-axis labels showing this.
0...50..100...150200...250
Yet i do not know how to accomplish this.
If you use the normal plot function, these numbers on the xaxis are
autogenerated, in case of a barplot, they are not, and i do not know how to
create these labels.

I would be happy about a solution.



The axis() function can draw whatever you ask for.  The tricky thing is 
to get the ticks in the right place, because a barplot doesn't have an 
obvious x-axis scale.  The secret is to save the result of the barplot 
call:  it contains the centres of the bars.


For example:

x - rpois(200, 20)
centres - barplot(x)
axis(1, at=centres[c(50, 100, 150)], labels=c(50, 100, 150))
box()

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] t-test on a data-frame.

2011-07-14 Thread Sarah Goslee

Hi Student (since you have no other name),

On Thu, Jul 14, 2011 at 6:42 AM, Economics Student
economicsstudent2...@gmail.com wrote:
 Dear R-helpers,

 In a data frame I have 100 securities,monthly closing value,from 1995 to
 present,which I have to

 1. Sampling with replacement,make 50 samples of 10 securities each,each
 sample hence will be a data frame with 10 columns.
 2. With uniform probabilty,mark a month from 2000 onwards as a special
 month,t=0.
 3. I have to subtract the market index from each column of each sample and
 then compute the residues.
 4. For each data frame of residues I have to compute the statistic (  Eps i0
 - mean(Eps it )  ) / var( Eps it ). Here i and t vary over one particular
 data frame. i0 corresponds to ith security residue on the special month.
 Basically a t-test involving a frame instead of a vector.
 5. Print out a table where the statistic is significant at the 1,5,10%
 level.

 Could someone give the broad ideas on doing this ?

Why yes. You should talk to your professor or TA. The list doesn't do
homework problems, though if you ask nicely we may help with the finer
points of R. And by nicely, I don't mean saying please, though it
doesn't hurt. I mean providing reproducible examples, clear statements
of your problem, and signing your email professionally, with an actual
name.

Sarah
-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to read 20 columns from the file

2011-07-14 Thread Jim Lemon


On 07/13/2011 10:08 PM, Kishorenalluri wrote:

Hi Jim,
 Saving is not a problem. I wanted to load/read the columns from the
file followed by plotting the area plot using ggplot2.

I am a basic user. I am trying to reproduce the plot similar to the example
given here.
http://processtrends.com/images/RClimate_NINO_34_latest.png


Hi Kishorenalluri,
How about this?

# download the data file from NOAA
# http://www.cpc.ncep.noaa.gov/data/indices/wksst.for
# make the columns match the header row with a global
# replace of - with  - (add a space)
# then this seems to reproduce the plot you indicated
plot(as.Date(sst$Week,%d%b%Y),sst$SSTA1,type=h,
 col=2+2*(sst$SSTA10),
 main=Sea surface anomalies from NOAA,
 xlab=Year,ylab=Temperature anomaly)

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to read 20 columns from the file

2011-07-14 Thread Jim Lemon


On 07/14/2011 10:35 PM, Jim Lemon wrote:

On 07/13/2011 10:08 PM, Kishorenalluri wrote:

Hi Jim,
Saving is not a problem. I wanted to load/read the columns from the
file followed by plotting the area plot using ggplot2.

I am a basic user. I am trying to reproduce the plot similar to the
example
given here.
http://processtrends.com/images/RClimate_NINO_34_latest.png


Hi Kishorenalluri,
How about this?

# download the data file from NOAA
# http://www.cpc.ncep.noaa.gov/data/indices/wksst.for
# make the columns match the header row with a global
# replace of - with  - (add a space)
# then this seems to reproduce the plot you indicated
plot(as.Date(sst$Week,%d%b%Y),sst$SSTA1,type=h,
col=2+2*(sst$SSTA10),
main=Sea surface anomalies from NOAA,
xlab=Year,ylab=Temperature anomaly)


Oops, forgot to include the line reading in the data file:

sst-read.table(sst_1990_2011.dat,header=TRUE)

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] SQldf with sqlite and H2

2011-07-14 Thread Mandans

SQldf with sqlite and H2

I have a large csv file (about 2GB) and wanted to import the file into R and do 
some filtering and analysis. Came across sqldf ( a great idea and product) and 
was trying to play around to see what would be the best method of doing this. 
csv file is comma delimited with some columns having comma inside the quoation 
like this John, Doe. 

I tried this first 

###
library(sqldf)
sqldf(attach testdb as new)
In.File - C:/JP/Temp/2008.csv
read.csv.sql(In.File, sql = create table table1 as select * from file, 
  dbname = testdb)

It errored out with message

NULL
Warning message:
closing unused connection 3 (C:/JP/Temp/2008.csv)   

When this failed, I converted this file from comma delimited to tab delimited 
and used this command  

#  
read.csv.sql(In.File, sql = create table table1 as select * from file, 
  dbname = testdb, sep = \t)

and this worked, it created testdb sqlite file with the size of 3GB

now my question is in 3 parts.

1. Is it possible to create a dataframe with appropriate column classes and use 
that column classes when I use the read.csv.sql command to create the table. 
Something like may be create the table from that DF and then update with 
read.csv.sql.?

Any example code will be really helpful.

2. If we use the H2 database instead of default sqlite and use the readcsv 
option, will that be faster and is there a way we can specify the above thought 
of applying a DF class to table column properties and update with CSVREAD

library(RH2)
something like SELECT * FROM CSVREAD('C:/JP/Temp/2008.csv')

Any example code will be really helpful.

3. How do we specify where the H2 file is saved. Saw something like this, when 
I ran this example from RH2 package, couldn't find the file in the working 
directory.

con - dbConnect(H2(), jdbc:h2:~/test, sa, )

Sorry for the long mail. Appreciate all for building a great community and for 
the wonderful software in R.
Thanks for Gabor Grothendieck for bring sqldf to this great community.

Any help or direction you can provide in this is highly appreciated.

Thanks all.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sum weights of independent variables across models (AIC)

2011-07-14 Thread Michael Just

This is what I was looking for. When I initially read about model.avg
I didn't recognize it also provided variable scores.

Thank you kindly,
Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Problem with x labels of barplot

2011-07-14 Thread Don

Thanks a lot! Great help!

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-x-labels-of-barplot-tp3667337p3667498.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] plotting x y z data from an irregular grid

2011-07-14 Thread erika

Hi,
I'm trying to plot some data (z) that is linked to latlong coordinates
(xy). These coordinates are not however on a regular grid. I also have some
shapefiles on which I would like to overlay the data. I can plot the
shapefiles (country/city outlines) and overplot the data, but only using
quilt.plot because I otherwise always get the error message that 
'Error in image.default(..., add = add, col = col) : 
  increasing 'x' and 'y' values expected' 
which has something to do with the organization of my data but I cannot
figure out how to change it correctly. This is the code that I have that
works:


data-read.csv('with coord_observational log data trends all years all
data.8.11.10.csv', header=TRUE)

## this is what the 'data' looks like:
head(data)
  X SiteCode Latitude   Longitude   p perc_per_year
perc_per_year_lower
1 1  A30 51.37357 -0.29172504 0.369164267-0.4781589  
-1.390382
2 2  BB1 51.68299 -0.03254972 0.005546354-3.1810064  
-5.665312
3 3  BG1 51.56375  0.17789100 0.000405606-3.2260763  
-5.344999
4 4  BG2 51.52939  0.13285700 0.434756172-5.1558318 
-22.123800
5 5  BH1 50.82334 -0.13724510 0.183375348-0.8735160  
-2.240289
6 6  BH2 50.82785 -0.17051300 0.002702969-2.1443157  
-3.543378
  perc_per_year_upper sig
1   0.4786508 -999.00
2  -0.8588293   -3.181006
3  -1.5251377   -3.226076
4  11.0957982 -999.00
5   0.3431518 -999.00
6  -0.7966338   -2.144316

#read in the shapefile
england-readShapePoly('D:/arcGIS/england boundary/england.shp')
class(england)
#define the projection
proj4string(england)-CRS('+proj=tmerc +lat_0=49 +lon_0=-2 +k=0.999601271625
+x_0=40 +y_0=-10 +ellps=airy +units=m +no_defs')
# transform the map into the WGS84 projection (epsg:4326):
england.wgs-spTransform(england, CRS('+init=epsg:4326'))

plot(england.wgs)
#plot data over the map:
quilt.plot(data$Longitude, data$Latitude, data$perc_per_year, add=TRUE)

My problem is that I would like to be able to change how this data looks
(not just 'grid squares', but circles, etc) which I can't seem to do wtih
quilt.plot. If I could do this I could plot one layer of gridded data
(squares as in quilt.plot) and overlay a second layer of 'z' data as points.
I have tried plot.surface and image.plot and a number of others, but because
of the error message that I get above it won't work. I can use image.plot,
etc if I create a grid and interpolate my data onto the grid (see code
below), but I don't want interpolated data, I would like discreet values for
each latlong. 


x-seq(-4,2, by=0.0625)
y-seq(50,53, by=0.0625)
d1-expand.grid(x=x, y=y)
data.li-interp(data$Longitude, data$Latitude, data$perc_per_year,
duplicate='mean')

So my questions are, (1) is there a different function that I should be
using with my data as it is and still be able to overplot it onto the map
that i've plotted? or (2) is there a way to create this grid and integrate
my data into the grid, but not interpolate it?

Any help would be very much appreciated. My R skills just are not good
enough to do this yet.

Thank you!!
Erika. 



--
View this message in context: 
http://r.789695.n4.nabble.com/plotting-x-y-z-data-from-an-irregular-grid-tp3667605p3667605.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] glm() scale parameters and predicted Values

2011-07-14 Thread Ben Bolker

Peter Maclean pmaclean2011 at yahoo.com writes:

 
 In glm() you can use the summary() function to recover
 the shape parameter (the reciprocal of the
 dispersion parameter). How do you recover the scale parameter? 
 Also, in the given example, how I estimate
 and save the geometric mean of the predicted values? 
 For a simple model you can use fitted() or predicted()
 functions. I will appreciate any help. 
  
  
  
 #Call required R packages
 require(plyr)  
 require(stats) 
 require(fitdistrplus)
 require(MASS)
 #Grouped vector
 n - c(1:10)
 yr -c(1:10)
 ny - list(yr=yr,n=n)
 require(utils)
 ny - expand.grid(ny) 
 y = rgamma(100, shape=1.5, rate = 1, scale = 2)

   It's a bit odd to specify both the rate and scale parameters,
which are redundant.  I would have guessed that the
rate parameter would dominate, but it looks like the
scale parameter dominates:

 set.seed(1001); rgamma(1,shape=1,rate=2)
[1] 1.622577
 set.seed(1001); rgamma(1,shape=1,scale=2)
[1] 6.490306
 set.seed(1001); rgamma(1,shape=1,rate=2,scale=2)
[1] 6.490306
 set.seed(1001); rgamma(1,shape=1,scale=2,rate=2)
[1] 6.490306

  (I know, I could go look at the source, but it's
a .Internal() function and I'm feeling lazy ...)

 Gdata - cbind(ny,y)
 Gdata2- Gdata
 Gdata$x1 - cos((3.14*yr)/365.25) 
 Gdata$x2 - sin((3.14*yr)/365.25) 
 #Fitting Generalized Linear Models 
 Gdata - split(Gdata,Gdata$n)
 FGLM - lapply(Gdata, function(x){
   m - as.numeric(x$y)
   x1 - m - as.numeric(x$x1)
   x2 - m - as.numeric(x$x2)
   summary(glm(m~1+x1+x2, family=Gamma),dispersion=NULL) 
    })
 

  Note that you have simulated a null model (the data don't
depend on the covariates).

 #Save the results of the estimated parameters
 str(FGLM,no.list = TRUE)
 SFGLMC- ldply(FGLM, function(x) x$coefficients)
 SFGLMD- ldply(FGLM, function(x) x$dispersion)
 GLM - cbind(SFGLMC,SFGLMD)

  Which scale parameter do you mean?  In a real
model it will vary with x1 and x2.  Let's try an
experiment first:

 set.seed(1001)
 z - rgamma(1,shape=2,scale=3)
 g1 - glm(z~1,family=Gamma)
 summary(g1)

Call:
glm(formula = z ~ 1, family = Gamma)

Deviance Residuals: 
Min   1Q   Median   3Q  Max  
-2.8434  -0.6579  -0.1702   0.3081   2.6220  

Coefficients:
Estimate Std. Error t value Pr(|t|)
(Intercept) 0.167013   0.001189   140.5   2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

(Dispersion parameter for Gamma family taken to be 0.5066566)

Null deviance: 5419.8  on   degrees of freedom
Residual deviance: 5419.8  on   degrees of freedom
AIC: 53526

  Here the intercept estimate is 0.167 , which is very
nearly 1/6 = 1/(shape*scale) -- i.e. the Gamma GLM is
parameterized in terms of the mean (on the inverse
scale).  If you want to recover
the scale parameter for the intercept case, then

summary(g1)$dispersion/coef(g1)[1]

should be pretty good.

As for the geometric means -- that's pretty tricky.
*If* you used a log link, then the predicted values on
the link scale (i.e. predict(g1,type=link)) would indeed
give you the geometric means.  On the inverse scale, though,
you would have to do a bit of finagling.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] calculating distance inland from coastline

2011-07-14 Thread Simon Goodman

Hi All, 

Does anybody know of any existing functions that will calculate distance
inland from a coastline?

It's possible to test if a lon,lat location is land or sea using
map.where(), but I need to add a buffer to this of say 2km, to allow for
points that are just on the coast, and below the resolution of the
worldHires database.

I'm working with a marine mammal satellite telemetry dataset and wish to
filter out spurious locations on land.

On another issue, does anybody know of any free vector map datasets that are
more up to date and have a higher resolution than the worldHires database
from the 'maps' package.

Thanks, Simon 



--
View this message in context: 
http://r.789695.n4.nabble.com/calculating-distance-inland-from-coastline-tp3667464p3667464.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R package: pbatR

2011-07-14 Thread Lisa

Dear All,

Does anybody have experience with R package pbatR
(http://cran.r-project.org/web/packages/pbatR/index.html)? I am trying to
use it to analyze the family-based case-control data, but the package
totally doesn’t work on my computer. I contacted the authors of the package,
but I haven’t heard anything from them. 

Following the package manual, I tried the simple example as below:

library(pbatR)
library(tcltk)
pbat.set(C:/pbat)

x - data.frame(pid = c(1,1,1,2,2,2,2,3,3,3),  # three families
 id = c(1,2,3,1,2,3,4,1,2,3),
 idfath = c(0,0,1,0,0,1,1,0,0,1),
 idmoth = c(0,0,2,0,0,2,2,0,0,2),
sex = c(1,2,1,1,2,2,2,1,2,1),
AffectionStatus = c(0,0,1,0,0,1,0,0,0,1),  # 1 for case, 0 for control
   m1.1 = c(1,1,2,2,1,1,2,2,2,1),  # two SNPs with two columns
for each SNP
   m1.2 = c(1,2,1,2,1,2,1,1,2,2),
   m2.1 = c(1,1,2,2,2,1,1,1,2,1),
   m2.2 = c(2,1,2,1,2,2,2,1,1,1))
x1 - as.ped(x)

y - data.frame(pid = c(1,1,1,2,2,2,2,3,3,3),
 id = c(1,2,3,1,2,3,4,1,2,3),
age = c(55,50,22,38,37,15,11,42,41,17),
 weight = c(185,170,130,165,170,90,60,170,160,120))
y1 - as.phe(y)


1. I first consider a model with the disease as a phenotype, and two SNPs as
predictors (on covariates) as bellow:

pbat.m(AffectionStatus ~ NONE, y1, x1, fbat=gee,
distribution='categorical', offset='none')

But some error messages were returned:
Error in writeCommandStrMatch(distribution, distribution, c(default,  : 
  'distribution' can only take on the following values: 'default', 'jiang',
'murphy', 'naive', 'observed'.  You passed the invalid value 'categorical'.

Then I removed last two arguments 
pbat.m(AffectionStatus ~ NONE, y1, x1, fbat=gee)

This time, a box appeared on the console:
R for Windows GUI front-end has encountered a problem and needs to close.

2. I consider a model with the disease as a phenotype, and two covariates
(age, weigth) and two SNPs as predictors as bellow:

pbat.m(AffectionStatus ~ age + weight, y1, x1, fbat=gee)

The function had been running for a very long time and no output was
returned until I had to stop it.

Any help would be greatly appreciated. 

Thanks.

Lisa


--
View this message in context: 
http://r.789695.n4.nabble.com/R-package-pbatR-tp3667844p3667844.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R package: pbatR

2011-07-14 Thread David Winsemius

I am guessing (from other evidence of lapses in attention to  
documentation)  that you failed to pay attention when you encountered  
these sentences on the page you offered a link to:


For analysis, this package provides a frontend to the PBAT software

For analysis, users must download PBAT (developed by Christoph Lange)  
and accept it's license, available on the PBAT webpage.


--
david.

On Jul 14, 2011, at 11:05 AM, Lisa wrote:


Dear All,

Does anybody have experience with R package pbatR
(http://cran.r-project.org/web/packages/pbatR/index.html)? I am  
trying to

use it to analyze the family-based case-control data, but the package
totally doesn’t work on my computer. I contacted the authors of the  
package,

but I haven’t heard anything from them.

Following the package manual, I tried the simple example as below:

library(pbatR)
library(tcltk)
pbat.set(C:/pbat)

x - data.frame(pid = c(1,1,1,2,2,2,2,3,3,3),  # three families
id = c(1,2,3,1,2,3,4,1,2,3),
idfath = c(0,0,1,0,0,1,1,0,0,1),
idmoth = c(0,0,2,0,0,2,2,0,0,2),
   sex = c(1,2,1,1,2,2,2,1,2,1),
   AffectionStatus = c(0,0,1,0,0,1,0,0,0,1),  # 1 for case, 0 for  
control
  m1.1 = c(1,1,2,2,1,1,2,2,2,1),  # two SNPs with two  
columns

for each SNP
  m1.2 = c(1,2,1,2,1,2,1,1,2,2),
  m2.1 = c(1,1,2,2,2,1,1,1,2,1),
  m2.2 = c(2,1,2,1,2,2,2,1,1,1))
x1 - as.ped(x)

y - data.frame(pid = c(1,1,1,2,2,2,2,3,3,3),
id = c(1,2,3,1,2,3,4,1,2,3),
   age = c(55,50,22,38,37,15,11,42,41,17),
weight = c(185,170,130,165,170,90,60,170,160,120))
y1 - as.phe(y)


1. I first consider a model with the disease as a phenotype, and two  
SNPs as

predictors (on covariates) as bellow:

pbat.m(AffectionStatus ~ NONE, y1, x1, fbat=gee,
distribution='categorical', offset='none')

But some error messages were returned:
Error in writeCommandStrMatch(distribution, distribution,  
c(default,  :
 'distribution' can only take on the following values: 'default',  
'jiang',
'murphy', 'naive', 'observed'.  You passed the invalid value  
'categorical'.


Then I removed last two arguments
pbat.m(AffectionStatus ~ NONE, y1, x1, fbat=gee)

This time, a box appeared on the console:
R for Windows GUI front-end has encountered a problem and needs to  
close.


2. I consider a model with the disease as a phenotype, and two  
covariates

(age, weigth) and two SNPs as predictors as bellow:

pbat.m(AffectionStatus ~ age + weight, y1, x1, fbat=gee)

The function had been running for a very long time and no output was
returned until I had to stop it.

Any help would be greatly appreciated.

Thanks.

Lisa


--
View this message in context: 
http://r.789695.n4.nabble.com/R-package-pbatR-tp3667844p3667844.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] calculating distance inland from coastline

2011-07-14 Thread Francois Rousseu


Hi Simon
 
A combination of functions gDistance, gBuffer and gIntersects from package 
rgeos should do the job. Also, have a look at www.naturalearthdata.com. They 
have various shapefiles with coastlines and land polygons, though I don't know 
how the resolution compares with the worldHires database.
 
Cheers,
Francois Rousseu
 

 Date: Thu, 14 Jul 2011 05:43:23 -0700
 From: s.j.good...@leeds.ac.uk
 To: r-help@r-project.org
 Subject: [R] calculating distance inland from coastline
 
 Hi All, 
 
 Does anybody know of any existing functions that will calculate distance
 inland from a coastline?
 
 It's possible to test if a lon,lat location is land or sea using
 map.where(), but I need to add a buffer to this of say 2km, to allow for
 points that are just on the coast, and below the resolution of the
 worldHires database.
 
 I'm working with a marine mammal satellite telemetry dataset and wish to
 filter out spurious locations on land.
 
 On another issue, does anybody know of any free vector map datasets that are
 more up to date and have a higher resolution than the worldHires database
 from the 'maps' package.
 
 Thanks, Simon 
 
 
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/calculating-distance-inland-from-coastline-tp3667464p3667464.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Adding rows based on column value

2011-07-14 Thread Bansal, Vikas

Dear all,

I have one problem and did not find any solution.(I have also attached the 
problem in text file because sometimes column spacing is not good in mail)

I have a file(file.txt) attached with this mail.I am reading it using this code 
to make a data frame (file)-

file=read.table(file.txt,fill=T,colClasses = character,header=T)

file looks like this-

 Chr   PosCaseA CaseCCaseG  CaseT
  10 135344110  0.00 24.00  0.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344113  0.00  0.00 24.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344116  0.00  0.00  0.00 24.00
  10 135344118  0.00 24.00  0.00  0.00
  10 135344118  0.00  0.00  0.00 24.00
  10 135344122 24.00  0.00  0.00  0.00
  10 135344122  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00  0.00  0.00 24.00
  10 135344126  0.00  0.00 24.00  0.00

Now some of the values in column Pos are same.for these same positions i want 
to add the values of columns 2:6
I will explain with an example-
The output of first row should be-

 Chr   PosCaseA CaseCCaseG  CaseT
 10 135344110  0.00 24.00  48.00  0.00

so the whole output for above input should be-

 Chr   PosCaseA CaseCCaseG  CaseT
 10 1353441100.00  24.00  48.000.00
 10 1353441130.00   0.00   24.000.00
  10 135344114  48.00  0.000.00 0.00
  10 135344116   0.00   0.000.0024.00
  10 135344118   0.00  24.00   0.0024.00
  10 135344122  24.00 24.00   0.000.00
  10 135344123   0.00  48.00   0.0024.00
  10 135344126   0.00  0.0024.00   0.00

Can you please help me.



Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London Chr   Pos CaseA CaseC CaseG CaseT
  10 135344110  0.00 24.00  0.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344113  0.00  0.00 24.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344116  0.00  0.00  0.00 24.00
  10 135344118  0.00 24.00  0.00  0.00
  10 135344118  0.00  0.00  0.00 24.00
  10 135344122 24.00  0.00  0.00  0.00
  10 135344122  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00  0.00  0.00 24.00
  10 135344126  0.00  0.00 24.00  0.00
  Dear all,

I have one problem and did not find any solution.

I have a file(file.txt) attached with this mail.I am reading it using this code 
to make a data frame (file)-

file=read.table(file.txt,fill=T,colClasses = character,header=T)

file looks like this-

 Chr   PosCaseA CaseCCaseG  CaseT
  10 135344110  0.00 24.00  0.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344113  0.00  0.00 24.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344116  0.00  0.00  0.00 24.00
  10 135344118  0.00 24.00  0.00  0.00
  10 135344118  0.00  0.00  0.00 24.00
  10 135344122 24.00  0.00  0.00  0.00
  10 135344122  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00  0.00  0.00 24.00
  10 135344126  0.00  0.00 24.00  0.00

Now some of the values in column Pos are same.For these same positions i want 
to add the values of columns 3:6
I will explain with an example-
The output of first row should be-

 Chr   Pos   CaseA CaseC  CaseG  CaseT
 10 135344110  0.00 24.00  48.00  0.00

because first three rows have same value in Pos column.

so the whole output for above input should be-

 Chr   PosCaseA CaseC CaseG  CaseT
 10 1353441100.00  24.00  48.000.00
 10 1353441130.00   0.00   24.000.00
  10 135344114  48.00  0.000.00 0.00
  10 135344116   0.00   0.000.0024.00

Re: [R] Adding rows based on column value

2011-07-14 Thread Bert Gunter

?tapply (in base R)
?aggregate  ?by  (wrapper for tapply)
?ave (in base R -- based on tapply)

Also package plyr (and several others, undoubtedly).

Also google on R summarize data by groups or similar gets many relevant hits.

-- Bert




2011/7/14 Bansal, Vikas vikas.ban...@kcl.ac.uk:
 Dear all,

 I have one problem and did not find any solution.(I have also attached the 
 problem in text file because sometimes column spacing is not good in mail)

 I have a file(file.txt) attached with this mail.I am reading it using this 
 code to make a data frame (file)-

 file=read.table(file.txt,fill=T,colClasses = character,header=T)

 file looks like this-

  Chr       Pos            CaseA     CaseC            CaseG      CaseT
  10 135344110  0.00 24.00  0.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344113  0.00  0.00 24.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344116  0.00  0.00  0.00 24.00
  10 135344118  0.00 24.00  0.00  0.00
  10 135344118  0.00  0.00  0.00 24.00
  10 135344122 24.00  0.00  0.00  0.00
  10 135344122  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00  0.00  0.00 24.00
  10 135344126  0.00  0.00 24.00  0.00

 Now some of the values in column Pos are same.for these same positions i want 
 to add the values of columns 2:6
 I will explain with an example-
 The output of first row should be-

  Chr       Pos            CaseA     CaseC            CaseG      CaseT
  10 135344110  0.00 24.00  48.00  0.00

 so the whole output for above input should be-

  Chr       Pos            CaseA     CaseC            CaseG      CaseT
  10 135344110    0.00  24.00  48.00    0.00
  10 135344113    0.00   0.00   24.00    0.00
  10 135344114  48.00  0.00    0.00     0.00
  10 135344116   0.00   0.00    0.00    24.00
  10 135344118   0.00  24.00   0.00    24.00
  10 135344122  24.00 24.00   0.00    0.00
  10 135344123   0.00  48.00   0.00    24.00
  10 135344126   0.00  0.00    24.00   0.00

 Can you please help me.



 Thanking you,
 Warm Regards
 Vikas Bansal
 Msc Bioinformatics
 Kings College London
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question on formula and terms.formula()

2011-07-14 Thread William Dunlap

I think you should replace
   terms.formula(formula)
by
   terms(formula)
When terms() is given a formula object it will
execute terms.formula but for other classes of
inputs it will invoke the appropriate method.
E.g., your formula may already be a terms object,
in which case terms.formula(formula) will waste
time reanalyzing it while terms(formula) will
invoke terms.terms, which just returns its input.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

-Original Message-
From: Pang Du [mailto:pan...@vt.edu] 
Sent: Thursday, July 14, 2011 8:36 AM
To: William Dunlap; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()

Thank you so much for your suggestion, Bill.

The R program I try to modify needs match.call() for something else. But the
problem does seem to be caused by this statement as you suggested. Following
this clue, I find out that 

terms.formula(formula)

does essentially what I want for terms.formula(mf$formula).

Pang Du
Virginia Tech


-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Wednesday, July 13, 2011 12:21 PM
To: pan...@vt.edu; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()

Does your code work if you omit the match.call step?  terms()
generally doesn't need that.  Also, do not call terms.formula(formula):
call just terms(formula).  The terms method appropriate to the class
of the formula will be used.

   f1 - function(formula, ...) {
  + terms(formula, ...)
  + }
   form - as.formula(y ~ x1 + Error(x2))
   f1(form, specials=Error)
  y ~ x1 + Error(x2)
  attr(,variables)
  list(y, x1, Error(x2))
  attr(,factors)
x1 Error(x2)
  y  0 0
  x1 1 0
  Error(x2)  0 1
  attr(,term.labels)
  [1] x1Error(x2)
  attr(,specials)
  attr(,specials)$Error
  [1] 3
  attr(,order)
  [1] 1 1
  attr(,intercept)
  [1] 1
  attr(,response)
  [1] 1
  attr(,.Environment)
  environment: R_GlobalEnv

Bill Dunlap
TIBCO Spotfire

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf
of pan...@vt.edu [pan...@vt.edu]
Sent: Tuesday, July 12, 2011 8:40 PM
To: r-help@r-project.org
Subject: [R] question on formula and terms.formula()

I'm trying to create a formula object to pass on to a function that
applies the function terms.formula() to it.

f - function(formula, ...)
{
...
mf - match.call()
term - terms.formula(mf$formula)
...
}

However, my code below gives an error.

form - as.formula(y~x)
f(form, ...)

The error message was:
Error in terms.formula(mf$formula): argument is not a valid model.

Could anybody help me figure out the problem and find a solution here?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.=

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] rgl: reproduce final state of interactive plot?

2011-07-14 Thread sjaffe

After interacting with a 3d plot (eg plot3d, persp3d), is there a way to
capture the final settings of view angles, etc, so that the final plot could
be easily reproduced? The plot functions themselves just return a vector of
'ids'.

--
View this message in context: 
http://r.789695.n4.nabble.com/rgl-reproduce-final-state-of-interactive-plot-tp3667866p3667866.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question on formula and terms.formula()

2011-07-14 Thread Pang Du

Thank you so much for your suggestion, Bill.

The R program I try to modify needs match.call() for something else. But the
problem does seem to be caused by this statement as you suggested. Following
this clue, I find out that 

terms.formula(formula)

does essentially what I want for terms.formula(mf$formula).

Pang Du
Virginia Tech


-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Wednesday, July 13, 2011 12:21 PM
To: pan...@vt.edu; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()

Does your code work if you omit the match.call step?  terms()
generally doesn't need that.  Also, do not call terms.formula(formula):
call just terms(formula).  The terms method appropriate to the class
of the formula will be used.

   f1 - function(formula, ...) {
  + terms(formula, ...)
  + }
   form - as.formula(y ~ x1 + Error(x2))
   f1(form, specials=Error)
  y ~ x1 + Error(x2)
  attr(,variables)
  list(y, x1, Error(x2))
  attr(,factors)
x1 Error(x2)
  y  0 0
  x1 1 0
  Error(x2)  0 1
  attr(,term.labels)
  [1] x1Error(x2)
  attr(,specials)
  attr(,specials)$Error
  [1] 3
  attr(,order)
  [1] 1 1
  attr(,intercept)
  [1] 1
  attr(,response)
  [1] 1
  attr(,.Environment)
  environment: R_GlobalEnv

Bill Dunlap
TIBCO Spotfire

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf
of pan...@vt.edu [pan...@vt.edu]
Sent: Tuesday, July 12, 2011 8:40 PM
To: r-help@r-project.org
Subject: [R] question on formula and terms.formula()

I'm trying to create a formula object to pass on to a function that
applies the function terms.formula() to it.

f - function(formula, ...)
{
...
mf - match.call()
term - terms.formula(mf$formula)
...
}

However, my code below gives an error.

form - as.formula(y~x)
f(form, ...)

The error message was:
Error in terms.formula(mf$formula): argument is not a valid model.

Could anybody help me figure out the problem and find a solution here?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.=

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] question on formula and terms.formula()

2011-07-14 Thread Pang Du

Points taken and terms(formula) is used now. Thanks.

-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Thursday, July 14, 2011 11:56 AM
To: Pang Du; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()

I think you should replace
   terms.formula(formula)
by
   terms(formula)
When terms() is given a formula object it will
execute terms.formula but for other classes of
inputs it will invoke the appropriate method.
E.g., your formula may already be a terms object,
in which case terms.formula(formula) will waste
time reanalyzing it while terms(formula) will
invoke terms.terms, which just returns its input.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

-Original Message-
From: Pang Du [mailto:pan...@vt.edu] 
Sent: Thursday, July 14, 2011 8:36 AM
To: William Dunlap; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()

Thank you so much for your suggestion, Bill.

The R program I try to modify needs match.call() for something else. But the
problem does seem to be caused by this statement as you suggested. Following
this clue, I find out that 

terms.formula(formula)

does essentially what I want for terms.formula(mf$formula).

Pang Du
Virginia Tech

-Original Message-
From: William Dunlap [mailto:wdun...@tibco.com] 
Sent: Wednesday, July 13, 2011 12:21 PM
To: pan...@vt.edu; r-help@r-project.org
Subject: RE: [R] question on formula and terms.formula()

Does your code work if you omit the match.call step?  terms()
generally doesn't need that.  Also, do not call terms.formula(formula):
call just terms(formula).  The terms method appropriate to the class
of the formula will be used.

   f1 - function(formula, ...) {
  + terms(formula, ...)
  + }
   form - as.formula(y ~ x1 + Error(x2))
   f1(form, specials=Error)
  y ~ x1 + Error(x2)
  attr(,variables)
  list(y, x1, Error(x2))
  attr(,factors)
x1 Error(x2)
  y  0 0
  x1 1 0
  Error(x2)  0 1
  attr(,term.labels)
  [1] x1Error(x2)
  attr(,specials)
  attr(,specials)$Error
  [1] 3
  attr(,order)
  [1] 1 1
  attr(,intercept)
  [1] 1
  attr(,response)
  [1] 1
  attr(,.Environment)
  environment: R_GlobalEnv

Bill Dunlap
TIBCO Spotfire

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf
of pan...@vt.edu [pan...@vt.edu]
Sent: Tuesday, July 12, 2011 8:40 PM
To: r-help@r-project.org
Subject: [R] question on formula and terms.formula()

I'm trying to create a formula object to pass on to a function that
applies the function terms.formula() to it.

f - function(formula, ...)
{
...
mf - match.call()
term - terms.formula(mf$formula)
...
}

However, my code below gives an error.

form - as.formula(y~x)
f(form, ...)

The error message was:
Error in terms.formula(mf$formula): argument is not a valid model.

Could anybody help me figure out the problem and find a solution here?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.=

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] rgl: reproduce final state of interactive plot?

2011-07-14 Thread Duncan Murdoch


On 14/07/2011 11:12 AM, sjaffe wrote:

After interacting with a 3d plot (eg plot3d, persp3d), is there a way to
capture the final settings of view angles, etc, so that the final plot could
be easily reproduced? The plot functions themselves just return a vector of
'ids'.



Yes, saving the result of par3d() will save it.  (It saves more than 
necessary; see ?par3d for more details.)


Duncan Murdoch


--
View this message in context: 
http://r.789695.n4.nabble.com/rgl-reproduce-final-state-of-interactive-plot-tp3667866p3667866.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding rows based on column value

2011-07-14 Thread Bansal, Vikas

I have checked it but did not get any results.Is there a way I can do it?I will 
be very thankful to you.

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

From: Bert Gunter [gunter.ber...@gene.com]
Sent: Thursday, July 14, 2011 4:54 PM
To: Bansal, Vikas
Cc: r-help@r-project.org
Subject: Re: [R] Adding rows based on column value

?tapply (in base R)
?aggregate  ?by  (wrapper for tapply)
?ave (in base R -- based on tapply)

Also package plyr (and several others, undoubtedly).

Also google on R summarize data by groups or similar gets many relevant hits.

-- Bert




2011/7/14 Bansal, Vikas vikas.ban...@kcl.ac.uk:
 Dear all,

 I have one problem and did not find any solution.(I have also attached the 
 problem in text file because sometimes column spacing is not good in mail)

 I have a file(file.txt) attached with this mail.I am reading it using this 
 code to make a data frame (file)-

 file=read.table(file.txt,fill=T,colClasses = character,header=T)

 file looks like this-

  Chr   PosCaseA CaseCCaseG  CaseT
  10 135344110  0.00 24.00  0.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344113  0.00  0.00 24.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344116  0.00  0.00  0.00 24.00
  10 135344118  0.00 24.00  0.00  0.00
  10 135344118  0.00  0.00  0.00 24.00
  10 135344122 24.00  0.00  0.00  0.00
  10 135344122  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00  0.00  0.00 24.00
  10 135344126  0.00  0.00 24.00  0.00

 Now some of the values in column Pos are same.for these same positions i want 
 to add the values of columns 3:6
 I will explain with an example-
 The output of first row should be-

  Chr   PosCaseA CaseCCaseG  CaseT
  10 135344110  0.00 24.00  48.00  0.00

 so the whole output for above input should be-

  Chr   PosCaseA CaseCCaseG  CaseT
  10 1353441100.00  24.00  48.000.00
  10 1353441130.00   0.00   24.000.00
  10 135344114  48.00  0.000.00 0.00
  10 135344116   0.00   0.000.0024.00
  10 135344118   0.00  24.00   0.0024.00
  10 135344122  24.00 24.00   0.000.00
  10 135344123   0.00  48.00   0.0024.00
  10 135344126   0.00  0.0024.00   0.00

 Can you please help me.



 Thanking you,
 Warm Regards
 Vikas Bansal
 Msc Bioinformatics
 Kings College London
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





--
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R package: pbatR

2011-07-14 Thread Lisa

Thanks. I have installed PBAT on my computer.

--
View this message in context: 
http://r.789695.n4.nabble.com/R-package-pbatR-tp3667844p3667907.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding rows based on column value

2011-07-14 Thread Bansal, Vikas


Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

From: Bansal, Vikas
Sent: Thursday, July 14, 2011 6:07 PM
To: Bert Gunter
Subject: RE: [R] Adding rows based on column value

Yes sir.I am trying.

I am using this-

aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)

but I think this is not a right way.Because we cannot use sum to add.That is 
why I was asking for help.

Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

From: Bert Gunter [gunter.ber...@gene.com]
Sent: Thursday, July 14, 2011 6:01 PM
To: Bansal, Vikas
Subject: Re: [R] Adding rows based on column value

Not from me -- I don't believe you've made an honest effort. Maybe
someone else will help you. You might try posting reproducible code
that show your efforts -- as the posting guide requests.
-- Bert

On Thu, Jul 14, 2011 at 9:46 AM, Bansal, Vikas vikas.ban...@kcl.ac.uk wrote:
 I have checked it but did not get any results.Is there a way I can do it?I 
 will be very thankful to you.

 Thanking you,
 Warm Regards
 Vikas Bansal
 Msc Bioinformatics
 Kings College London
 
 From: Bert Gunter [gunter.ber...@gene.com]
 Sent: Thursday, July 14, 2011 4:54 PM
 To: Bansal, Vikas
 Cc: r-help@r-project.org
 Subject: Re: [R] Adding rows based on column value

 ?tapply (in base R)
 ?aggregate  ?by  (wrapper for tapply)
 ?ave (in base R -- based on tapply)

 Also package plyr (and several others, undoubtedly).

 Also google on R summarize data by groups or similar gets many relevant 
 hits.

 -- Bert




 2011/7/14 Bansal, Vikas vikas.ban...@kcl.ac.uk:
 Dear all,

 I have one problem and did not find any solution.(I have also attached the 
 problem in text file because sometimes column spacing is not good in mail)

 I have a file(file.txt) attached with this mail.I am reading it using this 
 code to make a data frame (file)-

 file=read.table(file.txt,fill=T,colClasses = character,header=T)

 file looks like this-

  Chr   PosCaseA CaseCCaseG  CaseT
  10 135344110  0.00 24.00  0.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344110  0.00  0.00 24.00  0.00
  10 135344113  0.00  0.00 24.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344114 24.00  0.00  0.00  0.00
  10 135344116  0.00  0.00  0.00 24.00
  10 135344118  0.00 24.00  0.00  0.00
  10 135344118  0.00  0.00  0.00 24.00
  10 135344122 24.00  0.00  0.00  0.00
  10 135344122  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00 24.00  0.00  0.00
  10 135344123  0.00  0.00  0.00 24.00
  10 135344126  0.00  0.00 24.00  0.00

 Now some of the values in column Pos are same.for these same positions i 
 want to add the values of columns 3:6
 I will explain with an example-
 The output of first row should be-

  Chr   PosCaseA CaseCCaseG  CaseT
  10 135344110  0.00 24.00  48.00  0.00

 so the whole output for above input should be-

  Chr   PosCaseA CaseCCaseG  CaseT
  10 1353441100.00  24.00  48.000.00
  10 1353441130.00   0.00   24.000.00
  10 135344114  48.00  0.000.00 0.00
  10 135344116   0.00   0.000.0024.00
  10 135344118   0.00  24.00   0.0024.00
  10 135344122  24.00 24.00   0.000.00
  10 135344123   0.00  48.00   0.0024.00
  10 135344126   0.00  0.0024.00   0.00

 Can you please help me.



 Thanking you,
 Warm Regards
 Vikas Bansal
 Msc Bioinformatics
 Kings College London
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





 --
 Men by nature long to get on to the ultimate truths, and will often
 be impatient with elementary studies or fight shy of them. If it were
 possible to reach the ultimate truths without the elementary studies
 usually prefixed to them, these would not be preparatory studies but
 superfluous diversions.

 -- Maimonides (1135-1204)

 Bert Gunter
 Genentech Nonclinical Biostatistics




--
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

--

Re: [R] Writing Complex Formulas

2011-07-14 Thread warmstron1

I resolved this issue.  It appears that ^ won't work for this case, but
** worked.  I can't find any reference to this, but where ^ seems to be
used to raise a value to a numerical function, ** is used for a y raised
to the power of x where x it a computation.   

--
View this message in context: 
http://r.789695.n4.nabble.com/Writing-Complex-Formulas-tp3638379p3668109.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] computing functions with Euler's number (e^n)

2011-07-14 Thread warmstron1

I solved this in two ways:
1. ** was necessary to raise (-dummy + 1) to the power of B.  ^ doesn't
work here, for some reason.
2. I needed to use as.complex which greatly simplified my code and
produces the correct response.  (I had to revisit math that I had not used
in many years.)

 W - as.complex(((fq/cf[j])**B[j])*(exp(-(fq/cf[j])+1)**B[j]))


--
View this message in context: 
http://r.789695.n4.nabble.com/computing-functions-with-Euler-s-number-e-n-tp3655205p3668125.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding rows based on column value

2011-07-14 Thread Ben Bolker

Bansal, Vikas vikas.bansal at kcl.ac.uk writes:

 I am using this-
 
 aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)
 

  Better, although still not reproducible (please *do* read the posting
guide -- it is listed at the bottom of every R list post and is the
*first* google hit for posting guide (!); search for
Examples).  What about removing the quotation marks around sum?

  aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)


 but I think this is not a right way.
 Because we cannot use sum to add.That is
  why I was asking for help.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding rows based on column value

2011-07-14 Thread Bansal, Vikas

I have tried that also.But it is showing this error-

 aggregate(file[,3:6], by = list(file[,2]), FUN = sum)

Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument



Thanking you,
Warm Regards
Vikas Bansal
Msc Bioinformatics
Kings College London

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Ben Bolker [bbol...@gmail.com]
Sent: Thursday, July 14, 2011 6:24 PM
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Adding rows based on column value

Bansal, Vikas vikas.bansal at kcl.ac.uk writes:

 I am using this-

 aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)


  Better, although still not reproducible (please *do* read the posting
guide -- it is listed at the bottom of every R list post and is the
*first* google hit for posting guide (!); search for
Examples).  What about removing the quotation marks around sum?

  aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)


 but I think this is not a right way.
 Because we cannot use sum to add.That is
  why I was asking for help.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] computing functions with Euler's number (e^n)

2011-07-14 Thread Berend Hasselman


warmstron1 wrote:
 
 I solved this in two ways:
 1. ** was necessary to raise (-dummy + 1) to the power of B.  ^
 doesn't work here, for some reason.
 ...
 

Using which version R on which platform?

Most strange. The help page for Arithmetic operators clearly states in a
Note that  ** is translated in the parser to ^, ...

In R-2.13.1 on Mac OS X I see no difference between results for ^ and **.

Berend

--
View this message in context: 
http://r.789695.n4.nabble.com/computing-functions-with-Euler-s-number-e-n-tp3655205p3668256.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding rows based on column value

2011-07-14 Thread Ben Bolker

On 07/14/2011 01:46 PM, Bansal, Vikas wrote:
 I have tried that also.But it is showing this error-
 
  aggregate(file[,3:6], by = list(file[,2]), FUN = sum)
 
 Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument
 
 

 Farther down in your previous e-mail you state that you read the file
in using

file=read.table(file.txt,fill=T,colClasses = character,header=T)

   the 'colClasses' argument is telling R to read in the data as type
character, which of course it is having trouble summing (as the error
message suggests: R's error messages are often cryptic, but in this case
it seems to be telling you exactly what's wrong).  (You probably
put it in there so that R wouldn't mess up your second column, but it
was overkill. It converted *all* the columns to character.)

   Try changing your read statement to:

file=read.table(file.txt,fill=TRUE,
   colClasses = rep(c(character,numeric),c(2,4)),header=TRUE)

(changing T to TRUE is safer; the different colClasses is the important
part.  fill=TRUE is probably unnecessary.)
  If you're unsure what this is doing, please do your best to read
?read.table and ?rep, and try out examples, before responding with
further queries ...

   Ben Bolker


 
 Thanking you,
 Warm Regards
 Vikas Bansal
 Msc Bioinformatics
 Kings College London
 
 From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf 
 Of Ben Bolker [bbol...@gmail.com]
 Sent: Thursday, July 14, 2011 6:24 PM
 To: r-h...@stat.math.ethz.ch
 Subject: Re: [R] Adding rows based on column value
 
 Bansal, Vikas vikas.bansal at kcl.ac.uk writes:
 
 I am using this-

 aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)

 
   Better, although still not reproducible (please *do* read the posting
 guide -- it is listed at the bottom of every R list post and is the
 *first* google hit for posting guide (!); search for
 Examples).  What about removing the quotation marks around sum?
 
   aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)
 
 
 but I think this is not a right way.
 Because we cannot use sum to add.That is
  why I was asking for help.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Adding rows based on column value

2011-07-14 Thread Bert Gunter

Yes, because from your previous posts, you appeared to have read in
the data as character:
file=read.table(file.txt,fill=T,colClasses = character,header=T)

But, of course, without a reproducible example, one cannot be sure.

-- Bert



On Thu, Jul 14, 2011 at 10:46 AM, Bansal, Vikas vikas.ban...@kcl.ac.uk wrote:
 I have tried that also.But it is showing this error-

  aggregate(file[,3:6], by = list(file[,2]), FUN = sum)

 Error in FUN(X[[1L]], ...) : invalid 'type' (character) of argument



 Thanking you,
 Warm Regards
 Vikas Bansal
 Msc Bioinformatics
 Kings College London
 
 From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf 
 Of Ben Bolker [bbol...@gmail.com]
 Sent: Thursday, July 14, 2011 6:24 PM
 To: r-h...@stat.math.ethz.ch
 Subject: Re: [R] Adding rows based on column value

 Bansal, Vikas vikas.bansal at kcl.ac.uk writes:

 I am using this-

 aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)


  Better, although still not reproducible (please *do* read the posting
 guide -- it is listed at the bottom of every R list post and is the
 *first* google hit for posting guide (!); search for
 Examples).  What about removing the quotation marks around sum?

  aggregate(x = file[,3:6], by = list(file[,2]), FUN = sum)


 but I think this is not a right way.
 Because we cannot use sum to add.That is
  why I was asking for help.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] rgl: reproduce final state of interactive plot?

2011-07-14 Thread sjaffe

Terrific! This is great to know.

I first tried saving and restoring the entire set from par3d but this
produced some changes (eg bg) and also one must call par3d with
no.readonly=TRUE. Clearly this is the way to go if one has changed a variety
of rgl properties.  But if one has only used the mouse to rotate/scale I
discovered (by looking at the documentation of view3d) that I believe that
all one needs are userMatrix, FOV, and zoom:


## create an rgl plot, interact with it, then:
snap - par3d( c(userMatrix, FOV, zoom) )

## create a new rgl plot, apply the same transformation:
par3d( snap )


This can also be used to 'snap' the current view during an interactive
session and restore it later during that same session, which could be quite
useful.

To save typing, a (trivial) pair of functions to encapsulate this:

snap.view- function() par3d( c(userMatrix, FOV, zoom) )
restore.view - function( snap) par3d( snap )





--
View this message in context: 
http://r.789695.n4.nabble.com/rgl-reproduce-final-state-of-interactive-plot-tp3667866p3668272.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SQldf with sqlite and H2

2011-07-14 Thread Gabor Grothendieck

On Thu, Jul 14, 2011 at 10:33 AM, Mandans mandan...@yahoo.com wrote:
 SQldf with sqlite and H2

 I have a large csv file (about 2GB) and wanted to import the file into R and 
 do some filtering and analysis. Came across sqldf ( a great idea and product) 
 and was trying to play around to see what would be the best method of doing 
 this. csv file is comma delimited with some columns having comma inside the 
 quoation like this John, Doe.

 I tried this first

 ###
 library(sqldf)
 sqldf(attach testdb as new)
 In.File - C:/JP/Temp/2008.csv
 read.csv.sql(In.File, sql = create table table1 as select * from file,
  dbname = testdb)

 It errored out with message

 NULL
 Warning message:
 closing unused connection 3 (C:/JP/Temp/2008.csv)

 When this failed, I converted this file from comma delimited to tab delimited 
 and used this command

 #
 read.csv.sql(In.File, sql = create table table1 as select * from file,
  dbname = testdb, sep = \t)

 and this worked, it created testdb sqlite file with the size of 3GB

 now my question is in 3 parts.

 1. Is it possible to create a dataframe with appropriate column classes and 
 use that column classes when I use the read.csv.sql command to create the 
 table. Something like may be create the table from that DF and then update 
 with read.csv.sql.?

 Any example code will be really helpful.

Here is an example of using method = name__class.  Note there are
two underscores in a row.  It appears I neglected to document that
Date2 means convert from character representation whereas Date means
convert from numeric representation.  It would also be possible to use
method = raw and then coerce the columns yourself afterwards.

# create test file
Lines - 'A__Date2|B
2000-01-01|x,y
2000-01-02|c,d
'
tf - tempfile()
cat(Lines, file = tf)


library(sqldf)
DF - read.csv.sql(tf, sep = |, method = name__class)
str(DF)


 2. If we use the H2 database instead of default sqlite and use the readcsv 
 option, will that be faster and is there a way we can specify the above 
 thought of applying a DF class to table column properties and update with 
 CSVREAD

 library(RH2)
 something like SELECT * FROM CSVREAD('C:/JP/Temp/2008.csv')

 Any example code will be really helpful.

Sorry, I haven't tested the speed of this.  postgresql and mysql, both
supported by sqldf, also have builtin methods to read files. If I had
to guess I would guess that mysql would be fastest but this would have
to be tested.


 3. How do we specify where the H2 file is saved. Saw something like this, 
 when I ran this example from RH2 package, couldn't find the file in the 
 working directory.

 con - dbConnect(H2(), jdbc:h2:~/test, sa, )

~ means your home directory so ~/test means test is in the home directory.

Try

normalizePath(~)
normalizePath(~/test)
etc.

to see what they refer to.

Regards.


 Sorry for the long mail. Appreciate all for building a great community and 
 for the wonderful software in R.
 Thanks for Gabor Grothendieck for bring sqldf to this great community.

 Any help or direction you can provide in this is highly appreciated.

 Thanks all.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Simple clustering help

2011-07-14 Thread donvolencia

Hi all,

I have just begun to use R and am hoping to receive some advice about the
problem I need to solve.  I have a file containing xy points that I need to
find all significant clusters and write each of their xy coordinates to
file(total points  ~ 75000 and sig. cluster = 2500 points.  I want to use a
euclidean distance threshold to determine if a point belongs to a cluster.

My initial thought is to take a random (seed) point and write a region
growing method to determine how many points belong to the cluster
(basically, add to the cluster all points that are within the threshold and
continue this until no points are with the threshold) .  Once there are no
neighboring points within the threshold, if the number of points added to
the region (cluster) is greater than 2500 I would write all of the point's
coordinates to a text file and remove them from the list of seed candidates
and begin again.  If the cluster size is less than 2500 I would simply
remove the points as they are not significant.  The process would continue
until there are less than 2500 points remaining.

 Is there a package that would be helpful in this task? 

Thanks
Don



--
View this message in context: 
http://r.789695.n4.nabble.com/Simple-clustering-help-tp3668274p3668274.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] LME and overall treatment effects

2011-07-14 Thread Mark Bilton


Hello fellow R users,

I am having a problem finding the estimates for some overall treatment  
effects for my mixed models using 'lme' (package nlme). I hope someone  
can help.


Firstly then, the model:
The data: Plant biomass (log transformed)
Fixed Factors: Treatment(x3 Dry, Wet, Control) Year(x8 2002-2009)
Random Factors: 5 plots per treatment, 5 quadrats per plot (N=594 (3*5*5*8)
with 6 missing values).

I am modelling this in two ways, firstly with year as a continuous variable
(interested in the difference in estimated slope over time in each treatment
'year*treatment'), and secondly with year as a categorical variable
(interested in difference between 'treatments').

When using Year as a continuous variable, the output of the lme means  
that I can compare the 3 treatments within my model...
i.e. it takes one of the Treatment*year interactions as the baseline  
and compares (contrasts) the other two to that.
I can then calculate the overall treatment*year effect using  
'anova.lme(Model).


However, the problem comes when I use Year as a categorical variable.  
Here, I am interested solely in the Treatment effect (not the  
interaction with year). However, the output for the two labelled  
'Treatment's against the third comparison, are not the overall effect  
but are a comparison within a year (2002). I can still get my overall  
effect (using anova.lme) but how do I calculate the estimates (with  
P-Values if possible) for each seperate overall treatment comparison  
(not within year). I tried 'glht' (package 'multicomp') but this only  
works if there are no interactions present, otherwise again it gives a  
comparison for one particular year.


Very grateful for any assistance,
Mark

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Amelia_Multiple_Imputation_with_observational_priors_noms

2011-07-14 Thread Eric Miner

I am fairly new at using R/programming in general so I apologize if I am
leaving crucial parts of the puzzle out, but here goes.



First and foremost this is the error I am receiving:



Error in muPriors[priors[, 1:2]] - priors[, 3] :

  NAs are not allowed in subscripted assignments



This occurs only when I am using observational priors and some number of
nominal variables, it does not occur when I change to specifying the
variables as ordinal, or if I do not use priors. Using a doctored version of
the example from the Amelia User Guide this is what I came up with:



library(Amelia)

library(MatchIt)

library(Zelig)

library(foreign)



data(freetrade)



a.out = amelia(freetrade, m=5, ts = year, cs = country)


newtrade = which(is.na(freetrade), arr.ind = TRUE)

newtrade

newtradeMean =
c(30,29,28,31,32,34,30,34,26,32,30,29,28,31,32,34,30,34,26,32,30,29,28,31,32,34,30,34,26,

32,30,29,28,31,32,34,30,34,26,32,30,29,28,31,32,34,30,34,26,32,31,31,32,33,34,35,20,21,

7,8,2.0,2.1,2.2,2.0,2.5,2.3,2.5,2.1,1.9,1.8,1.7,2.9,2.0,0,0,1,.3,.4,.5,.3,.2,.3,.2,.4,.5,.3,.2,.3,.3,.4,.5,.3,.2,.3)

newishtrade = cbind(newtrade,newtradeMean)

newishtrade

newtradeSD =
c(5,5,5,4,3,3,2,4,5,4,5,5,5,4,3,3,2,4,5,4,5,5,5,4,3,3,2,4,5,4,5,5,5,4,3,3,2,4,5,4,5,5,5,4,3,3,2,4,5,4,3,4,5,6,4,3,4,2,

4,1,1.2,.3,.4,.3,.4,.3,.2,.3,.4,.3,.3,.3,.2,.3,.2,.3,.3,.1,.1,.2,.2,.3,.3,.1,.1,.2,.2,.3,.3,.1,.1,.2,.2,.3)

newishtrade = cbind(newtrade,newtradeMean, newtradeSD)

newishtrade



a.out = amelia(freetrade, m=5, ts = year, cs = country, priors =
newishtrade, p2s=2, noms=signed)



a.out = amelia(freetrade, m=5, ts = year, cs = country, priors =
newishtrade, p2s=2, ords=signed)




I followed the trail back through the source code to try and see what was
occurring. I learned some debugging techniques and tried to discern if it
was something with the code or perhaps some incorrect steps in using priors
with nominal variables. To summarize, I cannot get observational priors to
work when I specify nominal variables. If it helps below is my sourcing
output scattered through the various functions in the source code. Thoughts?


amelia starting
before prep
amelia.prep started
beginning prep functions
Variables used:  tariff polity pop gdp.pc intresmi fiveop usheg
noms.signed.2
after prep
running bootstrap
does this work
Error in muPriors[priors[, 1:2]] - priors[, 3] :
  NAs are not allowed in subscripted assignments

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Writing Complex Formulas

2011-07-14 Thread Duncan Murdoch


On 14/07/2011 12:46 PM, warmstron1 wrote:

I resolved this issue.  It appears that ^ won't work for this case, but
** worked.  I can't find any reference to this, but where ^ seems to be
used to raise a value to a numerical function, ** is used for a y raised
to the power of x where x it a computation.


Those should be equivalent.  Can you post the code that wasn't working, 
and describe what not working meant?


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] rgl: reproduce final state of interactive plot?

2011-07-14 Thread Ben Bolker

sjaffe sjaffe at riskspan.com writes:

[snip snip]

 This can also be used to 'snap' the current view during an interactive
 session and restore it later during that same session, which could be quite
 useful.
 
 To save typing, a (trivial) pair of functions to encapsulate this:
 
 snap.view- function() par3d( c(userMatrix, FOV, zoom) )
 restore.view - function( snap) par3d( snap )
 

  This is very nice, I keep having to remember how to do things
like it.  Duncan, perhaps it could go in an example in the package ... ?

  Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Repating a loop of lm function with different columns of database

2011-07-14 Thread Jon Toledo


Hi,
First let me thank you for the incredible help and resource that this forum is.
I am trying to compare the  repeated measurement of more than 100 analytes that 
have been take in 70 subjects at 2 time points adjusted for the time difference 
of sample times(TimeDifferenceDays), therefore I wanted to do it with a 
function that allows me to do all at once. (131 is the column difference that 
separates the two different measurements of the same anlyte)
I have this one:
for(i in 1:125){return(summary(lm(Data[,i] ~ Data[,(i+131)] + 
Data$TimeDifferenceDays)))}
But it only gives me one result

I also wanted to get the p-value in a dataframe with three columns:
Fitst column: analyte name (that´s the name of the column)Second column: pvalue 
of the first measure of the analte predicting the second measureThird column: 
The effect of time

I copy a samller example:


txt - V1a  V2a  V3a  V1b  V2b  V3b   TimeDifferenceDays2.42  72.4 3.75 2.46 
55.4 4.44 6081.66 89.7 2.54 2.17 94.0 2.15 4192.45 112. 0.46 2.40 129.0 .42 
7142.58 55.6 5.05 2.44 135.0 5.39 7212.61 332.0 22.6 3.55 238.0 16.4 729
Data - read.table(textConnection(txt), header = TRUE)
Thank you
Jon Toledo, MD

Postdoctoral fellow
University of Pennsylvania School of Medicine
Center for Neurodegenerative Disease Research

  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] LME and overall treatment effects

2011-07-14 Thread Bert Gunter

Probably no hope of help until you do as the posting guide asks:

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and *** provide commented, minimal, self-contained, reproducible code.***


-- Bert

On Thu, Jul 14, 2011 at 11:02 AM, Mark Bilton
mark.bil...@uni-tuebingen.de wrote:
 Hello fellow R users,

 I am having a problem finding the estimates for some overall treatment
 effects for my mixed models using 'lme' (package nlme). I hope someone can
 help.

 Firstly then, the model:
 The data: Plant biomass (log transformed)
 Fixed Factors: Treatment(x3 Dry, Wet, Control) Year(x8 2002-2009)
 Random Factors: 5 plots per treatment, 5 quadrats per plot (N=594 (3*5*5*8)
 with 6 missing values).

 I am modelling this in two ways, firstly with year as a continuous variable
 (interested in the difference in estimated slope over time in each treatment
 'year*treatment'), and secondly with year as a categorical variable
 (interested in difference between 'treatments').

 When using Year as a continuous variable, the output of the lme means that I
 can compare the 3 treatments within my model...
 i.e. it takes one of the Treatment*year interactions as the baseline and
 compares (contrasts) the other two to that.
 I can then calculate the overall treatment*year effect using
 'anova.lme(Model).

 However, the problem comes when I use Year as a categorical variable. Here,
 I am interested solely in the Treatment effect (not the interaction with
 year). However, the output for the two labelled 'Treatment's against the
 third comparison, are not the overall effect but are a comparison within a
 year (2002). I can still get my overall effect (using anova.lme) but how do
 I calculate the estimates (with P-Values if possible) for each seperate
 overall treatment comparison (not within year). I tried 'glht' (package
 'multicomp') but this only works if there are no interactions present,
 otherwise again it gives a comparison for one particular year.

 Very grateful for any assistance,
 Mark

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] cbind in aggregate formula - based on an existing object (vector)

2011-07-14 Thread Dimitri Liakhovitski

Hello!

I am aggregating using a formula in aggregate - of the type:
aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

However, I actually have an object (vector of my variables to be aggregated):
myvars-c(var1,var2,var3)

I'd like my aggregate formula (its cbind part) to be able to use my
myvars object. Is it possible?
Thanks for your help!

Dimitri

Reproducible example:

mydate = rep(seq(as.Date(2008-12-01), length = 3, by = month),4)
value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)

example-data.frame(mydate=mydate,value1=value1,value2=value2)
example$group-c(rep(group1,3),rep(group2,3),rep(group1,3),rep(group2,3))
example$group-as.factor(example$group)
(example);str(example)

example.agg1-aggregate(cbind(value1,value2)~group+mydate,sum,data=example)
# this works
(example.agg1)

### Building my object (vector of 2 names - in reality, many more):
myvars-c(value1,value2)
example.agg1-aggregate(cbind(myvars)~group+mydate,sum,data=example)
### does not work


-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cbind in aggregate formula - based on an existing object (vector)

2011-07-14 Thread David Winsemius



On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote:


Hello!

I am aggregating using a formula in aggregate - of the type:
aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

However, I actually have an object (vector of my variables to be  
aggregated):

myvars-c(var1,var2,var3)

I'd like my aggregate formula (its cbind part) to be able to use my
myvars object. Is it possible?
Thanks for your help!



Not sure I have gotten all the way there, but this does work:

example.agg1-aggregate(as.matrix(example[myvars])~group 
+mydate,sum,data=example)


 example.agg1
   group mydate example[myvars]NA
1 group1 2008-12-01   4   4.2
2 group2 2008-12-01   6   6.2
3 group1 2009-01-01  40  40.2
4 group2 2009-01-01  60  60.2
5 group1 2009-02-01 400 400.2
6 group2 2009-02-01 600 600.2


Dimitri

Reproducible example:

mydate = rep(seq(as.Date(2008-12-01), length = 3, by = month),4)
value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)

example-data.frame(mydate=mydate,value1=value1,value2=value2)
example$group-c(rep(group1,3),rep(group2,3),rep(group1, 
3),rep(group2,3))

example$group-as.factor(example$group)
(example);str(example)

example.agg1-aggregate(cbind(value1,value2)~group 
+mydate,sum,data=example)

# this works
(example.agg1)

### Building my object (vector of 2 names - in reality, many more):
myvars-c(value1,value2)
example.agg1-aggregate(cbind(myvars)~group+mydate,sum,data=example)
### does not work


--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cbind in aggregate formula - based on an existing object (vector)

2011-07-14 Thread Dimitri Liakhovitski

Thank you, David, it does work.
Could you please explain why? What exactly does changing it to as matrix do?
Thank you!
Dimitri

On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote:

 Hello!

 I am aggregating using a formula in aggregate - of the type:
 aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

 However, I actually have an object (vector of my variables to be
 aggregated):
 myvars-c(var1,var2,var3)

 I'd like my aggregate formula (its cbind part) to be able to use my
 myvars object. Is it possible?
 Thanks for your help!


 Not sure I have gotten all the way there, but this does work:

 example.agg1-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example)

 example.agg1
   group     mydate example[myvars]    NA
 1 group1 2008-12-01               4   4.2
 2 group2 2008-12-01               6   6.2
 3 group1 2009-01-01              40  40.2
 4 group2 2009-01-01              60  60.2
 5 group1 2009-02-01             400 400.2
 6 group2 2009-02-01             600 600.2

 Dimitri

 Reproducible example:

 mydate = rep(seq(as.Date(2008-12-01), length = 3, by = month),4)
 value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
 value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)

 example-data.frame(mydate=mydate,value1=value1,value2=value2)

 example$group-c(rep(group1,3),rep(group2,3),rep(group1,3),rep(group2,3))
 example$group-as.factor(example$group)
 (example);str(example)


 example.agg1-aggregate(cbind(value1,value2)~group+mydate,sum,data=example)
 # this works
 (example.agg1)

 ### Building my object (vector of 2 names - in reality, many more):
 myvars-c(value1,value2)
 example.agg1-aggregate(cbind(myvars)~group+mydate,sum,data=example)
 ### does not work


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 West Hartford, CT





-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cbind in aggregate formula - based on an existing object (vector)

2011-07-14 Thread Bert Gunter

Dmitri:

Look at my vars from myvars-c(value1,value2)
It's just a character vector of length 2!

You can't cbind a character vector of length 2! These are not
references/pointers.

It's not at all clear to me what you ultimately want to do, but IF
it's: pass a character vector of names to be used as the LHS of the
aggregate .formula call, then something like: (untested)

MyVars - do.call(cbind, lapply(myvars,get))

and then

aggregate(MyVars ~ ...)

might do. But there are so many potential scoping problems here that I
would not be surprised if it failed. The usual advice for this sort of
thing is to use substitute() or maybe the dreaded eval(parse(...))
construction -- but as I said, I don't really understand what you're
after.

-- Bert



On Thu, Jul 14, 2011 at 12:32 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Thank you, David, it does work.
 Could you please explain why? What exactly does changing it to as matrix do?
 Thank you!
 Dimitri

 On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius dwinsem...@comcast.net 
 wrote:

 On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote:

 Hello!

 I am aggregating using a formula in aggregate - of the type:
 aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

 However, I actually have an object (vector of my variables to be
 aggregated):
 myvars-c(var1,var2,var3)

 I'd like my aggregate formula (its cbind part) to be able to use my
 myvars object. Is it possible?
 Thanks for your help!


 Not sure I have gotten all the way there, but this does work:

 example.agg1-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example)

 example.agg1
   group     mydate example[myvars]    NA
 1 group1 2008-12-01               4   4.2
 2 group2 2008-12-01               6   6.2
 3 group1 2009-01-01              40  40.2
 4 group2 2009-01-01              60  60.2
 5 group1 2009-02-01             400 400.2
 6 group2 2009-02-01             600 600.2

 Dimitri

 Reproducible example:

 mydate = rep(seq(as.Date(2008-12-01), length = 3, by = month),4)
 value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
 value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)

 example-data.frame(mydate=mydate,value1=value1,value2=value2)

 example$group-c(rep(group1,3),rep(group2,3),rep(group1,3),rep(group2,3))
 example$group-as.factor(example$group)
 (example);str(example)


 example.agg1-aggregate(cbind(value1,value2)~group+mydate,sum,data=example)
 # this works
 (example.agg1)

 ### Building my object (vector of 2 names - in reality, many more):
 myvars-c(value1,value2)
 example.agg1-aggregate(cbind(myvars)~group+mydate,sum,data=example)
 ### does not work


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 West Hartford, CT





 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cbind in aggregate formula - based on an existing object (vector)

2011-07-14 Thread David Winsemius


Dmitri:

as.matrix makes a matrix out of the dataframe that is passed to it.

As a further note I attempted and failed for reasons that are unclear  
to me to construct a formula that would (I hoped) preserve the column  
names which are being mangle in the posted effort:


form - as.formula(paste(
 cbind(,
  paste( myvars, collapse=,),
  ) ~ group+mydate,
  sep= ) )
 myvars-c(value1,value2)
 example.agg1-aggregate(formula=form,data=example, FUN=sum)
Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable
 traceback()
2: aggregate.formula(formula = form, data = example, FUN = sum)
1: aggregate(formula = form, data = example, FUN = sum)

 form
cbind(value1, value2) ~ group + mydate
 parse(text=form)
expression(~
cbind(value1, value2), group + mydate)

So it seems to be correctly dispatched to aggregate.formula but not  
passing some check or another. Also tried with formula() rather than  
as.formula with identical error message. Also tried including without  
naming the argument.


--
David


On Jul 14, 2011, at 3:32 PM, Dimitri Liakhovitski wrote:


Thank you, David, it does work.
Could you please explain why? What exactly does changing it to as  
matrix do?

Thank you!
Dimitri

On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius dwinsem...@comcast.net 
 wrote:


On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote:


Hello!

I am aggregating using a formula in aggregate - of the type:
aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

However, I actually have an object (vector of my variables to be
aggregated):
myvars-c(var1,var2,var3)

I'd like my aggregate formula (its cbind part) to be able to use  
my

myvars object. Is it possible?
Thanks for your help!



Not sure I have gotten all the way there, but this does work:

example.agg1-aggregate(as.matrix(example[myvars])~group 
+mydate,sum,data=example)



example.agg1

  group mydate example[myvars]NA
1 group1 2008-12-01   4   4.2
2 group2 2008-12-01   6   6.2
3 group1 2009-01-01  40  40.2
4 group2 2009-01-01  60  60.2
5 group1 2009-02-01 400 400.2
6 group2 2009-02-01 600 600.2


Dimitri

Reproducible example:

mydate = rep(seq(as.Date(2008-12-01), length = 3, by = month),4)
value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
value2 
=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)


example-data.frame(mydate=mydate,value1=value1,value2=value2)

example$group-c(rep(group1,3),rep(group2,3),rep(group1, 
3),rep(group2,3))

example$group-as.factor(example$group)
(example);str(example)


example.agg1-aggregate(cbind(value1,value2)~group 
+mydate,sum,data=example)

# this works
(example.agg1)

### Building my object (vector of 2 names - in reality, many more):
myvars-c(value1,value2)
example.agg1-aggregate(cbind(myvars)~group+mydate,sum,data=example)
### does not work


--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT






--
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Sum weights of independent variables across models (AIC)

2011-07-14 Thread Frank Harrell

Why go to so much trouble?  Why not fit a single full model and use it?  Even
better why not use a quadratic penalty on the full model to get optimum
cross-validation?
Frank

nofunsally wrote:
 
 Hello,
 I'd like to sum the weights of each independent variable across linear
 models that have been evaluated using AIC.
 
 For example:
 
 library(MuMIn)
 data(Cement)
 lm1  -  lm(y  ~  .,  data  =  Cement)
 dd  -  dredge(lm1,  beta  = TRUE,  eval  =  TRUE,  rank  =  AICc)
 get.models(dd, subset = delta 4)
 
 There are 5 models with a Delta AIC Score of less than 4.  I would
 like to sum the weights for each of the independent variables across
 the five models.
 
 How can I do that?
 
 Thanks,
 Mike
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/Sum-weights-of-independent-variables-across-models-AIC-tp3666306p3668513.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Repating a loop of lm function with different columns of database

2011-07-14 Thread Weidong Gu

If I understood your question

x-data.frame(matrix(rnorm(2000,10,10),ncol=50))

sapply(1:5,function(i) summary(lm(x[,i]~x[,i+10]+x[,50])))

Weidong Gu

On Thu, Jul 14, 2011 at 2:27 PM, Jon Toledo tintin...@hotmail.com wrote:

 Hi,
 First let me thank you for the incredible help and resource that this forum 
 is.
 I am trying to compare the  repeated measurement of more than 100 analytes 
 that have been take in 70 subjects at 2 time points adjusted for the time 
 difference of sample times(TimeDifferenceDays), therefore I wanted to do it 
 with a function that allows me to do all at once. (131 is the column 
 difference that separates the two different measurements of the same anlyte)
 I have this one:
 for(i in 1:125){return(summary(lm(Data[,i] ~ Data[,(i+131)] + 
 Data$TimeDifferenceDays)))}
 But it only gives me one result

 I also wanted to get the p-value in a dataframe with three columns:
 Fitst column: analyte name (that愀 the name of the column)Second column: 
 pvalue of the first measure of the analte predicting the second measureThird 
 column: The effect of time

 I copy a samller example:


 txt - V1a  V2a  V3a  V1b  V2b  V3b   TimeDifferenceDays2.42  72.4 3.75 2.46 
 55.4 4.44 6081.66 89.7 2.54 2.17 94.0 2.15 4192.45 112. 0.46 2.40 129.0 .42 
 7142.58 55.6 5.05 2.44 135.0 5.39 7212.61 332.0 22.6 3.55 238.0 16.4 729
 Data - read.table(textConnection(txt), header = TRUE)
 Thank you
 Jon Toledo, MD

 Postdoctoral fellow
 University of Pennsylvania School of Medicine
 Center for Neurodegenerative Disease Research


        [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] WLS regression, lm() with weights as a matrix

2011-07-14 Thread Victor11

Dear All,

I've been trying to run a Weighted Least Squares (WLS) regression:

Dependent variables: a 60*200 matrix (*Rit*) with 200 companies and 60 dates
for each company
Independent variables: a 60*4 matrix (*Ft*) with 4 factors and 60 dates for
each factor
Weights: a 60*200 matrix (*Wit*) with weights for 200 companies and 60 dates
for each company

The WLS regression I would like to run is:

(Wit)*Rit = a*(Wit*F1t) + b*(Wit*F2t) + c*(Wit*F3t) + d*(Wit*F4t) + eit

Ideally, I want to run WLS regressions for each company i (i.e., 200 WLS
regressions in total), in each regression using weights from column i in
matrix *Wit* ,and in the end obtain a 60*4 matrix with coefficients and a
200*60 matrix with residuals.

However, when I run:

/lm(Rit ~ Ft, weights=Wit)/

it fails because weights argument can only be vector not matrix.

I have been searching old posts but couldn't find any solutions. I'm
wondering if there is any other function or way to do this? I would really
appreciate for your help.

Thanks,
Victor

--
View this message in context: 
http://r.789695.n4.nabble.com/WLS-regression-lm-with-weights-as-a-matrix-tp3668577p3668577.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Very slow optim(): solved

2011-07-14 Thread Hamazaki, Hamachan (DFG)

After Googling and trial and errors, the major cause of optimization was not 
functions, but data setting. 
Originally, I was using data.frame for likelihood calculation. Then, I changed 
data.frame to vector and matrix for the same likelihood calculation.  Now 
convergence takes ~ 14 sec instead of 25 min. Certainly, I didn't know this 
simple change makes huge computational difference.

 

Toshihide Hamachan Hamazaki, 濱崎俊秀PhD
Alaska Department of Fish and Game: アラスカ州漁業野生動物課
Diivision of Commercial Fisheries: 商業漁業部
333 Raspberry Rd.  Anchorage, AK 99518
Phone:  (907)267-2158
Cell:  (907)440-9934

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Ben Bolker
Sent: Wednesday, July 13, 2011 12:21 PM
To: r-h...@stat.math.ethz.ch
Subject: Re: [R] Very slow optim()

Hamazaki, Hamachan (DFG toshihide.hamazaki at alaska.gov writes:

 
 Dear list, 
 
 I am using optim() function to MLE ~55 parameters, but it is very slow to
converge (~ 25 min), whereas I can do
 the same in ~1 sec. using ADMB, and ~10 sec using MS EXCEL Solver.
 
 Are there any tricks to speed up?
 
 Are there better optimization functions? 
 

  There's absolutely no way to tell without knowing more about your code.  You
might try method=CG:

Method ‘CG’ is a conjugate gradients method based on that by
 Fletcher and Reeves (1964) (but with the option of Polak-Ribiere
 or Beale-Sorenson updates).  Conjugate gradient methods will
 generally be more fragile than the BFGS method, but as they do not
 store a matrix they may be successful in much larger optimization
 problems.

  If ADMB works better, why not use it?  You can use the R2admb
package (on R forge) to wrap your ADMB calls in R code, if you
prefer that workflow.

  Ben

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cbind in aggregate formula - based on an existing object (vector)

2011-07-14 Thread Dimitri Liakhovitski

Thanks a lot!

actually, what I tried to do is very simple - just passing tons of
variable names into the formula. Maybe that get thing suggested by
Bert would work...

Dimitri


On Thu, Jul 14, 2011 at 4:01 PM, David Winsemius dwinsem...@comcast.net wrote:
 Dmitri:

 as.matrix makes a matrix out of the dataframe that is passed to it.

 As a further note I attempted and failed for reasons that are unclear to me
 to construct a formula that would (I hoped) preserve the column names which
 are being mangle in the posted effort:

 form - as.formula(paste(
                     cbind(,
                      paste( myvars, collapse=,),
                      ) ~ group+mydate,
                      sep= ) )
 myvars-c(value1,value2)
 example.agg1-aggregate(formula=form,data=example, FUN=sum)
 Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable
 traceback()
 2: aggregate.formula(formula = form, data = example, FUN = sum)
 1: aggregate(formula = form, data = example, FUN = sum)

 form
 cbind(value1, value2) ~ group + mydate
 parse(text=form)
 expression(~
 cbind(value1, value2), group + mydate)

 So it seems to be correctly dispatched to aggregate.formula but not passing
 some check or another. Also tried with formula() rather than as.formula with
 identical error message. Also tried including without naming the argument.

 --
 David


 On Jul 14, 2011, at 3:32 PM, Dimitri Liakhovitski wrote:

 Thank you, David, it does work.
 Could you please explain why? What exactly does changing it to as matrix
 do?
 Thank you!
 Dimitri

 On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius dwinsem...@comcast.net
 wrote:

 On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote:

 Hello!

 I am aggregating using a formula in aggregate - of the type:
 aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

 However, I actually have an object (vector of my variables to be
 aggregated):
 myvars-c(var1,var2,var3)

 I'd like my aggregate formula (its cbind part) to be able to use my
 myvars object. Is it possible?
 Thanks for your help!


 Not sure I have gotten all the way there, but this does work:


 example.agg1-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example)

 example.agg1

  group     mydate example[myvars]    NA
 1 group1 2008-12-01               4   4.2
 2 group2 2008-12-01               6   6.2
 3 group1 2009-01-01              40  40.2
 4 group2 2009-01-01              60  60.2
 5 group1 2009-02-01             400 400.2
 6 group2 2009-02-01             600 600.2

 Dimitri

 Reproducible example:

 mydate = rep(seq(as.Date(2008-12-01), length = 3, by = month),4)
 value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
 value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)

 example-data.frame(mydate=mydate,value1=value1,value2=value2)


 example$group-c(rep(group1,3),rep(group2,3),rep(group1,3),rep(group2,3))
 example$group-as.factor(example$group)
 (example);str(example)



 example.agg1-aggregate(cbind(value1,value2)~group+mydate,sum,data=example)
 # this works
 (example.agg1)

 ### Building my object (vector of 2 names - in reality, many more):
 myvars-c(value1,value2)
 example.agg1-aggregate(cbind(myvars)~group+mydate,sum,data=example)
 ### does not work


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 West Hartford, CT





 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 David Winsemius, MD
 West Hartford, CT





-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cbind in aggregate formula - based on an existing object (vector)

2011-07-14 Thread Dimitri Liakhovitski

David - I tried exactly the thing you did (and after that asked my
question to the forum):
 form - as.formula(paste(
 cbind(,
  paste( myvars, collapse=,),
  ) ~ group+mydate,
  sep= ) )

And it did not work - although it looks clean. At the end I ended up
writing a loop across individual variables with this code in the body:
  myformula-as.formula(paste(i,~myfactor,sep=))
  temp-aggregate(myformula,sum,data=mydata)
 ...

I then it worked. Really don't understand why pasting the cbind(...)
text does not work.
Dimitri

On Thu, Jul 14, 2011 at 4:01 PM, David Winsemius dwinsem...@comcast.net wrote:
 Dmitri:

 as.matrix makes a matrix out of the dataframe that is passed to it.

 As a further note I attempted and failed for reasons that are unclear to me
 to construct a formula that would (I hoped) preserve the column names which
 are being mangle in the posted effort:

 form - as.formula(paste(
                     cbind(,
                      paste( myvars, collapse=,),
                      ) ~ group+mydate,
                      sep= ) )
 myvars-c(value1,value2)
 example.agg1-aggregate(formula=form,data=example, FUN=sum)
 Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable
 traceback()
 2: aggregate.formula(formula = form, data = example, FUN = sum)
 1: aggregate(formula = form, data = example, FUN = sum)

 form
 cbind(value1, value2) ~ group + mydate
 parse(text=form)
 expression(~
 cbind(value1, value2), group + mydate)

 So it seems to be correctly dispatched to aggregate.formula but not passing
 some check or another. Also tried with formula() rather than as.formula with
 identical error message. Also tried including without naming the argument.

 --
 David


 On Jul 14, 2011, at 3:32 PM, Dimitri Liakhovitski wrote:

 Thank you, David, it does work.
 Could you please explain why? What exactly does changing it to as matrix
 do?
 Thank you!
 Dimitri

 On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius dwinsem...@comcast.net
 wrote:

 On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote:

 Hello!

 I am aggregating using a formula in aggregate - of the type:
 aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

 However, I actually have an object (vector of my variables to be
 aggregated):
 myvars-c(var1,var2,var3)

 I'd like my aggregate formula (its cbind part) to be able to use my
 myvars object. Is it possible?
 Thanks for your help!


 Not sure I have gotten all the way there, but this does work:


 example.agg1-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example)

 example.agg1

  group     mydate example[myvars]    NA
 1 group1 2008-12-01               4   4.2
 2 group2 2008-12-01               6   6.2
 3 group1 2009-01-01              40  40.2
 4 group2 2009-01-01              60  60.2
 5 group1 2009-02-01             400 400.2
 6 group2 2009-02-01             600 600.2

 Dimitri

 Reproducible example:

 mydate = rep(seq(as.Date(2008-12-01), length = 3, by = month),4)
 value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
 value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)

 example-data.frame(mydate=mydate,value1=value1,value2=value2)


 example$group-c(rep(group1,3),rep(group2,3),rep(group1,3),rep(group2,3))
 example$group-as.factor(example$group)
 (example);str(example)



 example.agg1-aggregate(cbind(value1,value2)~group+mydate,sum,data=example)
 # this works
 (example.agg1)

 ### Building my object (vector of 2 names - in reality, many more):
 myvars-c(value1,value2)
 example.agg1-aggregate(cbind(myvars)~group+mydate,sum,data=example)
 ### does not work


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 West Hartford, CT





 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 David Winsemius, MD
 West Hartford, CT





-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] plotting date data over couple of months

2011-07-14 Thread vamshi999

thank you for your reply.. 

As i have told you earlier... i want to plot the total no.of birds counted
for each day and plot total no.of birds for each day.. one level for each
day .. 

i wanted to normalize the data.. since i don't have the data for equal no.of
hours for all days.. for example 
on 2011-04-23 i have a data for 2 hours.. but on 2011-05-03 i have data for
8 hours.. that is why i am dividing the data by the no.of hours of data on
that particular day.. 

so... i will create a dataset with dates from the whole month using the
above command you have mentioned.. .. and plot my data for these values.. 

since my data has some dates missing.. i won't get any bars on those
particular days how can i place a * on those missing days.. 


thank you 

--
View this message in context: 
http://r.789695.n4.nabble.com/plotting-date-data-over-couple-of-months-tp3666298p3668433.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] LME and overall treatment effects

2011-07-14 Thread Mark Bilton


Ok...lets try again with some code...

---
Hello fellow R users,

I am having a problem finding the estimates for some overall treatment  
effects for my mixed models using 'lme' (package nlme). I hope someone  
can help.


Firstly then, the model:
The data: Plant biomass (log transformed)
Fixed Factors: Treatment(x3 Dry, Wet, Control) Year(x8 2002-2009)
Random Factors: 5 plots per treatment, 10 quadrats per plot (N=1200  
(3*5*10*8)).


I am modelling this in two ways, firstly with year as a continuous variable
(interested in the difference in estimated slope over time in each treatment
'year*treatment'), and secondly with year as a categorical variable
(interested in difference between 'treatments').

--
ie: (with Year as either numeric or factor)

Model-lme(Species~Year*Treatment,random=~1|Plot/Quadrat,na.action =  
na.omit,data=UDD)

-

When using Year as a continuous variable, the output of the lme means  
that I can compare the 3 treatments within my model...
i.e. it takes one of the Treatment*year interactions as the baseline  
and compares (contrasts) the other two to that.

-
ie

Fixed effects: Species ~ Year * Treatment
 Value Std.Error   DF   t-value p-value
(Intercept)  1514.3700  352.7552 1047  4.292978  0.
Year   -0.75190.1759 1047 -4.274786  0.
Treatment0   -461.9500  498.8711   12 -0.925991  0.3727
Treatment1  -1355.0450  498.8711   12 -2.716222  0.0187
Year:Treatment0 0.23050.2488 1047  0.926537  0.3544
Year:Treatment1 0.67760.2488 1047  2.724094  0.0066

so Year:Treatment0 differs from baseline Year:Treatment-1 by 0.2305
and Year:Treatment1 is significantly different (p=0.0066) from  
Year:Treatment-1

--

I can then calculate the overall treatment*year effect using  
'anova.lme(Model)'.

-

anova.lme(Model1)

   numDF denDF   F-value p-value
(Intercept)1  1047 143.15245  .0001
Year   1  1047  19.56663  .0001
Treatment  212   3.73890  0.0547
Year:Treatment 2  1047   3.83679  0.0219

so there is an overall difference in slope between treatments  
(Year:Treatment interaction) p=0.0219

--

However, the problem comes when I use Year as a categorical variable.  
Here, I am interested solely in the Treatment effect (not the  
interaction with year). However, the output for the two labelled  
'Treatment's against the third comparison, are not the overall effect  
but are a comparison within a year (2002).

--
Fixed effects: Species ~ Year * Treatment
Value Std.Error   DF   t-value p-value
(Intercept)  6.42  1.528179 1029  4.201079  0.
Year2003 4.10  1.551578 1029  2.642471  0.0084
Year2004 5.00  1.551578 1029  3.222526  0.0013
Year2005-1.52  1.551578 1029 -0.979648  0.3275
Year2006-3.08  1.551578 1029 -1.985076  0.0474
Year2007-2.40  1.551578 1029 -1.546813  0.1222
Year2008 2.24  1.551578 1029  1.443692  0.1491
Year2009-4.30  1.551578 1029 -2.771372  0.0057
Treatment0   0.46  2.161171   12  0.212848  0.8350
Treatment1   0.50  2.161171   12  0.231356  0.8209
Year2003:Treatment0 -2.46  2.194262 1029 -1.121106  0.2625
Year2004:Treatment0 -1.34  2.194262 1029 -0.610684  0.5415
Year2005:Treatment0  0.34  2.194262 1029  0.154950  0.8769
Year2006:Treatment0  1.60  2.194262 1029  0.729174  0.4661
Year2007:Treatment0  1.76  2.194262 1029  0.802092  0.4227
Year2008:Treatment0 -3.22  2.194262 1029 -1.467464  0.1426
Year2009:Treatment0  1.80  2.194262 1029  0.820321  0.4122
Year2003:Treatment1  0.22  2.194262 1029  0.100261  0.9202
Year2004:Treatment1  3.48  2.194262 1029  1.585954  0.1131
Year2005:Treatment1  5.00  2.194262 1029  2.278670  0.0229
Year2006:Treatment1  4.70  2.194262 1029  2.141950  0.0324
Year2007:Treatment1  6.08  2.194262 1029  2.770863  0.0057
Year2008:Treatment1  2.32  2.194262 1029  1.057303  0.2906
Year2009:Treatment1  5.56  2.194262 1029  2.533881  0.0114

so Treatment0 (in year 2002) is different to baseline Treatment-1 (in  
year 2002)  by 0.46 p=0.8350



I can still get my overall effect (using anova.lme) but how do I  
calculate the estimates (with P-Values if possible) for each seperate  
overall treatment comparison (not within year). I can do this in SAS  
using 'estimate' or 'lsmeans', but for various reasons I want to do it  
in R as well.
I tried 'glht' (package 'multicomp') but this only works if there are  
no interactions present, otherwise again it gives a comparison for one  
particular year.


--
ie
require(multcomp)
summary(glht(Model, linfct = mcp(Treatment = Tukey)))

(sorry, I can't get this to work at home but trust me that it's

[R] nlme gls errors

2011-07-14 Thread Roland Sookias

Hi

I keep getting an error like this:

Error in `coef-.corARMA`(`*tmp*`, value = c(18.3113452983211,
-1.56626248550284,  :
  Coefficient matrix not invertible

or like this:

Error in gls(archlogfl ~ co2, correlation = corARMA(p = 3)) : false
convergence (8)

with the gls function in nlme.

The former example was with the model
gls(archlogflfornma~nma,correlation=corARMA(p=3)) where archlogflfornma is
[1] 2.611840 2.618454 2.503317 2.305531 2.180464 2.185764 2.221760
2.211320 and nma is [1] 138 139 142 148 150 134 137 135.

You can see the model in the latter, and archlogfl is [1] 2.611840 2.618454
2.503317 2.305531 2.180464 2.185764 2.221760 2.211320 [9] 2.105556 2.176747
and co2 is [1]  597.5778  917.9308 1101.0430  679.7803  886.5347  597.0668
873.4995 [8]  816.3483 1427.0190  423.8917.

I'm quite a nube, so sorry, but help most appreciated. I have R 2.13.1.

Roland

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] random selection of elements from a matrix

2011-07-14 Thread anaraster

Hi!

How can I make a random selection of n row elements from a matrix. 
The matrix was previously created from a table with different rows and
columns. However I want to keep all the information (columns), I just want
to reduce the number of observations.

Thanks,
Ana

--
View this message in context: 
http://r.789695.n4.nabble.com/random-selection-of-elements-from-a-matrix-tp3668574p3668574.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cbind in aggregate formula - based on an existing object (vector)

2011-07-14 Thread William Dunlap

You may find it easier to use the data.frame method for aggregate
instead of the formula method when you are using vectors of column
names.   E.g.,

  responseVars - c(mpg, wt)
  byVars - c(cyl, gear)
  aggregate(mtcars[responseVars], by=mtcars[byVars], FUN=median)

gives the same result as

  aggregate(cbind(mpg, wt) ~ cyl + gear, FUN=median, data=mtcars)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Dimitri Liakhovitski
Sent: Thursday, July 14, 2011 1:45 PM
To: David Winsemius
Cc: r-help
Subject: Re: [R] cbind in aggregate formula - based on an existing object 
(vector)

Thanks a lot!

actually, what I tried to do is very simple - just passing tons of
variable names into the formula. Maybe that get thing suggested by
Bert would work...

Dimitri


On Thu, Jul 14, 2011 at 4:01 PM, David Winsemius dwinsem...@comcast.net wrote:
 Dmitri:

 as.matrix makes a matrix out of the dataframe that is passed to it.

 As a further note I attempted and failed for reasons that are unclear to me
 to construct a formula that would (I hoped) preserve the column names which
 are being mangle in the posted effort:

 form - as.formula(paste(
                     cbind(,
                      paste( myvars, collapse=,),
                      ) ~ group+mydate,
                      sep= ) )
 myvars-c(value1,value2)
 example.agg1-aggregate(formula=form,data=example, FUN=sum)
 Error in m[[2L]][[2L]] : object of type 'symbol' is not subsettable
 traceback()
 2: aggregate.formula(formula = form, data = example, FUN = sum)
 1: aggregate(formula = form, data = example, FUN = sum)

 form
 cbind(value1, value2) ~ group + mydate
 parse(text=form)
 expression(~
 cbind(value1, value2), group + mydate)

 So it seems to be correctly dispatched to aggregate.formula but not passing
 some check or another. Also tried with formula() rather than as.formula with
 identical error message. Also tried including without naming the argument.

 --
 David


 On Jul 14, 2011, at 3:32 PM, Dimitri Liakhovitski wrote:

 Thank you, David, it does work.
 Could you please explain why? What exactly does changing it to as matrix
 do?
 Thank you!
 Dimitri

 On Thu, Jul 14, 2011 at 3:25 PM, David Winsemius dwinsem...@comcast.net
 wrote:

 On Jul 14, 2011, at 3:05 PM, Dimitri Liakhovitski wrote:

 Hello!

 I am aggregating using a formula in aggregate - of the type:
 aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

 However, I actually have an object (vector of my variables to be
 aggregated):
 myvars-c(var1,var2,var3)

 I'd like my aggregate formula (its cbind part) to be able to use my
 myvars object. Is it possible?
 Thanks for your help!


 Not sure I have gotten all the way there, but this does work:


 example.agg1-aggregate(as.matrix(example[myvars])~group+mydate,sum,data=example)

 example.agg1

  group     mydate example[myvars]    NA
 1 group1 2008-12-01               4   4.2
 2 group2 2008-12-01               6   6.2
 3 group1 2009-01-01              40  40.2
 4 group2 2009-01-01              60  60.2
 5 group1 2009-02-01             400 400.2
 6 group2 2009-02-01             600 600.2

 Dimitri

 Reproducible example:

 mydate = rep(seq(as.Date(2008-12-01), length = 3, by = month),4)
 value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
 value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)

 example-data.frame(mydate=mydate,value1=value1,value2=value2)


 example$group-c(rep(group1,3),rep(group2,3),rep(group1,3),rep(group2,3))
 example$group-as.factor(example$group)
 (example);str(example)



 example.agg1-aggregate(cbind(value1,value2)~group+mydate,sum,data=example)
 # this works
 (example.agg1)

 ### Building my object (vector of 2 names - in reality, many more):
 myvars-c(value1,value2)
 example.agg1-aggregate(cbind(myvars)~group+mydate,sum,data=example)
 ### does not work


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 West Hartford, CT





 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 David Winsemius, MD
 West Hartford, CT





-- 
Dimitri Liakhovitski
Ninah Consulting
www.ninah.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented,

Re: [R] random selection of elements from a matrix

2011-07-14 Thread anaraster

ups...already found the solution

matrix2 - matrix1[sample(samplenumber,replace=F),]

--
View this message in context: 
http://r.789695.n4.nabble.com/random-selection-of-elements-from-a-matrix-tp3668574p3668594.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Error: non-numeric argument to binary operator

2011-07-14 Thread Anil acharya

Hi 
I am posting in the topic related to the non-numeric argument to binary 
operator  as I got similar problem while running the netcdf code. I have 
attached the file with this post. It is a climate data from NOAA site. The code 
follows as: 

library(survival) 
library(RNetCDF) 
library(ncdf) 

setwd(c:/projects/netcdfcsfiles) 
Conn42 = open.ncdf(128.111.220.111.46.15.32.42.nc); 

# read the time variable, which measures years, and 
# use the length of the vector to estimate the time span 
# 
timeObj = get.var.ncdf(Conn42,time); 
file42YrRangeDays = trunc(length(timeObj)) 

# Process each file separately: 
# get attributes, read the rhum data cube, 
# extract the time series from the cube, 
# and rescale the time series vector 
# 
scaleFact = att.get.ncdf(Conn42,rhum,scale_factor) 
offset= att.get.ncdf(Conn42,rhum,add_offset) 
rhumObj = get.var.ncdf(Conn42,rhum); 
# 
# the Relative Humidity data 'cube' has 4 dimensions: 
#  latitudes (4), longitudes (3), pressure levels (1), 
#  and time (days - several thousand) 
# 
rhCube = array(rhumObj,dim = c(4,3,file42YrRangeDays)) 
rh42Vec = (((rhCube[2,2,1:file42YrRangeDays]) * scaleFact) + offset) 

#(Here is the point where I got the error Error in (rhCube[2, 2, 
1:file24YrRangeDays]) * scaleFact : non-numeric argument to binary operator 


I appreciate if anyone can help me out. 
 Sincerely,

Anil Acharya __
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Very slow optim(): solved

2011-07-14 Thread Mike Marchywka

 Date: Thu, 14 Jul 2011 12:44:18 -0800
 From: toshihide.hamaz...@alaska.gov
 To: r-h...@stat.math.ethz.ch
 Subject: Re: [R] Very slow optim(): solved

 After Googling and trial and errors, the major cause of optimization was not 
 functions, but data setting.
 Originally, I was using data.frame for likelihood calculation. Then, I 
 changed data.frame to vector and matrix for the same likelihood calculation. 
 Now convergence takes ~ 14 sec instead of 25 min. Certainly, I didn't know 
 this simple change makes huge computational difference.

Thanks, can you pass along any additional details like google links you found 
or comment on
the resulting limitation( were you CPU limited converting data formats or did 
this cause memory
problems leading to VM thrashing?)? I've often had c++ code that turns out to 
be IO limited
when I expected I was doing real complicated computations, it never hurts to go 
beyond
the usual suspects LOL. 

 Toshihide Hamachan Hamazaki, 濱崎俊秀PhD
 Alaska Department of Fish and Game: アラスカ州漁業野生動物課
 Diivision of Commercial Fisheries: 商業漁業部
 333 Raspberry Rd. Anchorage, AK 99518
 Phone: (907)267-2158
 Cell: (907)440-9934

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Ben Bolker
 Sent: Wednesday, July 13, 2011 12:21 PM
 To: r-h...@stat.math.ethz.ch
 Subject: Re: [R] Very slow optim()

 Hamazaki, Hamachan (DFG toshihide.hamazaki at alaska.gov writes:

  Dear list,

  I am using optim() function to MLE ~55 parameters, but it is very slow to
 converge (~ 25 min), whereas I can do
  the same in ~1 sec. using ADMB, and ~10 sec using MS EXCEL Solver.

  Are there any tricks to speed up?

  Are there better optimization functions?

 There's absolutely no way to tell without knowing more about your code. You
 might try method=CG:

 Method ‘CG’ is a conjugate gradients method based on that by
 Fletcher and Reeves (1964) (but with the option of Polak-Ribiere
 or Beale-Sorenson updates). Conjugate gradient methods will
 generally be more fragile than the BFGS method, but as they do not
 store a matrix they may be successful in much larger optimization
 problems.

 If ADMB works better, why not use it? You can use the R2admb
 package (on R forge) to wrap your ADMB calls in R code, if you
 prefer that workflow.

 Ben

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Correct behavior of Hmisc::capitalize()?

2011-07-14 Thread Henrik Bengtsson

Hi,

from example(capitalize) of the Hmisc package (v 0.8.3) you get:

 capitalize(c(Hello, bob, daN))
[1] Hello Bob   daN

Is that daN correct?

If so, then this behavior that only *all lowercase strings*, which the
code indicates,  will be capitalized is not documented.

 Hmisc::capitalize
function (string)
{
capped - grep(^[^A-Z]*$, string, perl = TRUE)
substr(string[capped], 1, 1) - toupper(substr(string[capped],
1, 1))
return(string)
}
environment: namespace:Hmisc

There are also some misspelled words in help(capitalize).


 sessionInfo()
R version 2.13.1 Patched (2011-07-09 r56344)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] splines   stats graphics  grDevices utils datasets  methods
[8] base

other attached packages:
[1] Hmisc_3.8-3 survival_2.36-9

loaded via a namespace (and not attached):
[1] cluster_1.14.0  grid_2.13.1 lattice_0.19-30 tools_2.13.1


/Henrik
(Hmisc maintainer cc:ed)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] cbind in aggregate formula - based on an existing object (vector)

2011-07-14 Thread Dennis Murphy

Hi:

I think Bill's got the right idea for your problem, but for the fun of
it, here's how Bert's suggestion would play out:

# Kind of works, but only for the first variable in myvars...
 aggregate(get(myvars) ~ group + mydate, FUN = sum, data = example)
   group mydate get(myvars)
1 group1 2008-12-01   4
2 group2 2008-12-01   6
3 group1 2009-01-01  40
4 group2 2009-01-01  60
5 group1 2009-02-01 400
6 group2 2009-02-01 600

# Maybe sapply() with get as the function will work...
 aggregate(sapply(myvars, get) ~ group + mydate, FUN = sum, data = example)
   group mydate myvars   get
1 group1 2008-12-01  4   4.2
2 group2 2008-12-01  6   6.2
3 group1 2009-01-01 40  40.2
4 group2 2009-01-01 60  60.2
5 group1 2009-02-01400 400.2
6 group2 2009-02-01600 600.2

Apart from the variable names, it matches example.agg1. OTOH, Bill's
suggestion matches example.agg1 exactly and has an advantage in terms
of code clarity:

byVars - c('group', 'mydate')
 aggregate(example[myvars], by = example[byVars], FUN = sum)
   group mydate value1 value2
1 group1 2008-12-01  44.2
2 group2 2008-12-01  66.2
3 group1 2009-01-01 40   40.2
4 group2 2009-01-01 60   60.2
5 group1 2009-02-01400  400.2
6 group2 2009-02-01600  600.2

FWIW,
Dennis

On Thu, Jul 14, 2011 at 12:05 PM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Hello!

 I am aggregating using a formula in aggregate - of the type:
 aggregate(cbind(var1,var2,var3)~factor1+factor2,sum,data=mydata)

 However, I actually have an object (vector of my variables to be aggregated):
 myvars-c(var1,var2,var3)

 I'd like my aggregate formula (its cbind part) to be able to use my
 myvars object. Is it possible?
 Thanks for your help!

 Dimitri

 Reproducible example:

 mydate = rep(seq(as.Date(2008-12-01), length = 3, by = month),4)
 value1=c(1,10,100,2,20,200,3,30,300,4,40,400)
 value2=c(1.1,10.1,100.1,2.1,20.1,200.1,3.1,30.1,300.1,4.1,40.1,400.1)

 example-data.frame(mydate=mydate,value1=value1,value2=value2)
 example$group-c(rep(group1,3),rep(group2,3),rep(group1,3),rep(group2,3))
 example$group-as.factor(example$group)
 (example);str(example)

 example.agg1-aggregate(cbind(value1,value2)~group+mydate,sum,data=example)
 # this works
 (example.agg1)

 ### Building my object (vector of 2 names - in reality, many more):
 myvars-c(value1,value2)
 example.agg1-aggregate(cbind(myvars)~group+mydate,sum,data=example)
 ### does not work


 --
 Dimitri Liakhovitski
 Ninah Consulting
 www.ninah.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Cost-sensitive classification

2011-07-14 Thread snassimr

Hi , everybody !!!

I want to perform a cost-sensitive classification using the rpart as a base
classifier .

Is it possible ?

Nissim

--
View this message in context: 
http://r.789695.n4.nabble.com/Cost-sensitive-classification-tp3668749p3668749.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] (no subject)

2011-07-14 Thread Tyler Rinker



Good Afternoon R Community,
 
I often work with very large data bases and want to search for select cases by 
a particular word or numeric value.  I created the following simple function to 
do just that.  It searchs a particular column for the phrase and returns a data 
frame with the rows that contain that phrase (for a particular column).  
 
Search-function(term, dataframe, column.name, variation=.02,...){
te-substitute(term)
   te-as.character(te)
   cn-substitute(column.name)
  cn-as.character(cn)
  HUNT-agrep(te,dataframe[,cn],ignore.case 
=TRUE,max.distance=variation,...)
   dataframe[c(HUNT),]
}
 
I would like to modify this to search all columns for the phrase keep only the 
unique rows and return a data frame for any columns (minus repeated rows) that 
contain the phrase.
 
I assumed this would be an easy task for me using sapply() and unique() or 
union().  Because this argument takes more than one argument (vector{column} is 
not the only argument) I dont know how to set it up.  Could someone tell me 
how to apply this function to multiple columns and return one data frame with 
all the agrep matches (Ill figure out how to deal with duplicates after that; 
thats the easy part).
 
Thank you in advance for your help,
Tyler Rinker
 
PS if your idea is a for loop please explain it well or provide the code 
because I do not have a programming background and for loops are very difficult 
to wrap my head around.
 
Running windows 7
R version 2.14.0 (beta)   
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Splitting one column value into multiple rows

2011-07-14 Thread Madana_Babu

Hi i have the data in the following format:

rent,100,1,common,674
pipe,200,0,usual,864
car,300,1,uncommon,392:jump,700,0,common,664
car,200,1,uncommon,864:snap,900,1,usual,746
stint,600,1,uncommon,257
pull,800,0,usual,594

where as i want the above 6 lines data into 8 lines as below (Spliting row 3
 4 at : and sending to a new row):

rent,100,1,common,674
pipe,200,0,usual,864
car,300,1,uncommon,392
jump,700,0,common,664
car,200,1,uncommon,864
snap,900,1,usual,746
stint,600,1,uncommon,257
pull,800,0,usual,594

Request any one who can help me getting this done.

Regards,
Madana

--
View this message in context: 
http://r.789695.n4.nabble.com/Splitting-one-column-value-into-multiple-rows-tp3668835p3668835.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] recursive function - finding connections

2011-07-14 Thread Benton, Paul

Sorry bad example. My data is undirected. It's a correlation matrix so probably 
better to look at something like:

foomat-cor(matrix(rnorm(100), ncol=10))
foomat

mine are pvalues from the correlation but same idea.


On 14 Jul 2011, at 11:23, Erich Neuwirth wrote:

 cliques only works for undirected graphs.
 Your matrix is not symmetric, therefore
 the graph is directed.
 
 
 On 7/14/2011 8:53 AM, Benton, Paul wrote:
 Dear all,
 
 I'm having some problems getting my recursive function to work. At first I 
 though that maybe my data was too big and I increase 
 option(expressions=5). Then I thought that I would try it on some 
 smaller data. Still not working. :( 
 I would have thought there should be a function for this already, so any 
 suggestions are welcomed for other methods. I did try igraph but couldn't 
 get cliques() to give anything useful. Also a quick play with hclust and 
 cut, again nothing too useful.
 
 Basically the function is trying to find uniquely connected subgraphs. So 
 the sub-network is only connected by itself and not to other nodes. If 
 everything is connected then the list (connectedList) should be length of 1 
 and have every index in the 1st slot.
 
 cheers,
 
 Paul
 
 
 findconnection-function(mat, cutoff){
  toList-function(mat, connectList, cutoff, i, idx){
  idx-which(mat[,idx]  cutoff)
  if(length(idx) = 1){
  connectList[[i]]-idx
  for(z in 1:length(idx)){
  connectList-toList(mat, connectList, cutoff, 
 i, idx[z])
  }
  }else{
  return(connectList)
  }
  }
  
  connectList-list()
  for(i in 1:ncol(mat)){
  connectList-toList(mat, connectList, cutoff, i, i)
  }
  return(unique(connectList)) 
 }
 
 foomat-matrix(sample(c(1,0.5,0), 100, replace=T), nrow=10) ## example data
 foomat
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]  0.0  0.5  0.0  0.5  0.5  0.0  0.5  1.0  0.5   0.0
 [2,]  0.0  1.0  1.0  0.0  0.0  1.0  0.0  1.0  0.5   1.0
 [3,]  1.0  1.0  1.0  1.0  0.5  0.0  0.5  0.5  0.5   0.5
 [4,]  0.0  0.5  0.0  0.0  0.5  0.5  0.5  0.0  1.0   0.0
 [5,]  0.5  0.5  1.0  1.0  0.5  1.0  1.0  0.5  0.5   0.5
 [6,]  0.0  0.5  0.0  0.5  0.5  0.5  0.5  0.5  1.0   1.0
 [7,]  1.0  1.0  0.0  1.0  0.0  0.5  1.0  1.0  0.5   0.5
 [8,]  0.5  1.0  0.0  0.5  1.0  0.0  1.0  0.0  0.0   0.0
 [9,]  0.0  0.5  0.0  0.0  0.5  0.0  0.5  0.0  0.5   0.5
 [10,]  1.0  1.0  0.5  1.0  0.0  1.0  0.0  0.0  0.0   0.5
 pb-findconnection(foomat, 0.01)
 Error: C stack usage is too close to the limit
 Error during wrapup: 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SQldf with sqlite and H2

2011-07-14 Thread Mandans

Thanks a lot Gabor. It helped a lot. Appreciate your time and effort.

Thanks

--- On Thu, 7/14/11, Gabor Grothendieck ggrothendi...@gmail.com wrote:

 From: Gabor Grothendieck ggrothendi...@gmail.com
 Subject: Re: [R] SQldf with sqlite and H2
 To: Mandans mandan...@yahoo.com
 Cc: r-help@r-project.org
 Date: Thursday, July 14, 2011, 2:22 PM
 On Thu, Jul 14, 2011 at 10:33 AM,
 Mandans mandan...@yahoo.com
 wrote:
  SQldf with sqlite and H2

  I have a large csv file (about 2GB) and wanted to
 import the file into R and do some filtering and analysis.
 Came across sqldf ( a great idea and product) and was trying
 to play around to see what would be the best method of doing
 this. csv file is comma delimited with some columns having
 comma inside the quoation like this John, Doe.

  I tried this first

  ###
  library(sqldf)
  sqldf(attach testdb as new)
  In.File - C:/JP/Temp/2008.csv
  read.csv.sql(In.File, sql = create table table1 as
 select * from file,
   dbname = testdb)

  It errored out with message

  NULL
  Warning message:
  closing unused connection 3 (C:/JP/Temp/2008.csv)

  When this failed, I converted this file from comma
 delimited to tab delimited and used this command

  #
  read.csv.sql(In.File, sql = create table table1 as
 select * from file,
   dbname = testdb, sep = \t)

  and this worked, it created testdb sqlite file with
 the size of 3GB

  now my question is in 3 parts.

  1. Is it possible to create a dataframe with
 appropriate column classes and use that column classes when
 I use the read.csv.sql command to create the table.
 Something like may be create the table from that DF and then
 update with read.csv.sql.?

  Any example code will be really helpful.

 Here is an example of using method = name__class. 
 Note there are
 two underscores in a row.  It appears I neglected to
 document that
 Date2 means convert from character representation whereas
 Date means
 convert from numeric representation.  It would also be
 possible to use
 method = raw and then coerce the columns yourself
 afterwards.

 # create test file
 Lines - 'A__Date2|B
 2000-01-01|x,y
 2000-01-02|c,d
 '
 tf - tempfile()
 cat(Lines, file = tf)

 library(sqldf)
 DF - read.csv.sql(tf, sep = |, method =
 name__class)
 str(DF)

  2. If we use the H2 database instead of default sqlite
 and use the readcsv option, will that be faster and is there
 a way we can specify the above thought of applying a DF
 class to table column properties and update with CSVREAD

  library(RH2)
  something like SELECT * FROM
 CSVREAD('C:/JP/Temp/2008.csv')

  Any example code will be really helpful.

 Sorry, I haven't tested the speed of this.  postgresql
 and mysql, both
 supported by sqldf, also have builtin methods to read
 files. If I had
 to guess I would guess that mysql would be fastest but this
 would have
 to be tested.

  3. How do we specify where the H2 file is saved. Saw
 something like this, when I ran this example from RH2
 package, couldn't find the file in the working directory.

  con - dbConnect(H2(), jdbc:h2:~/test, sa, )

 ~ means your home directory so ~/test means test is in the
 home directory.

 Try

 normalizePath(~)
 normalizePath(~/test)
 etc.

 to see what they refer to.

 Regards.

  Sorry for the long mail. Appreciate all for building a
 great community and for the wonderful software in R.
  Thanks for Gabor Grothendieck for bring sqldf to this
 great community.

  Any help or direction you can provide in this is
 highly appreciated.

  Thanks all.

  __
  R-help@r-project.org
 mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.

 -- 
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] recursive function - finding connections

2011-07-14 Thread Peter Langfelder

Hi Paul,

I assume you are using the argument cutoff to specify the p-value
below which nodes are considered connected and above which they are
not connected.

I would use single linkage hierarchical clustering. If you have two
groups of nodes and any two nodes between the groups are connected
(i.e. have adjacency =1 or dissimilarity 0), then the groups have
dissimilarity 0. If no two nodes between the two groups are connected,
you will get dissimilarity 1. Thus you can use any tree cut height
between 0 and 1 to get the clusters that correspond to connected. For
large data you will need a large computer to hold your distance
matrix, but you must have observed that already.

subgraphs = function(mat, cut)
{
  disconnected = matcut # Change the inequality if necessary
  tree = hclust(as.dist(disconnected), method = single)
  clusters = cutree(tree, h = 0.5)
  # Clusters is already the answer, but you want it in a different
format, so we reformat it.
  nClusters = max(clusters)
  connectedList = list();
  for (c in 1:nClusters)
connectedList[[c]] = which(clusters==c)
  connectedList
}

Try it and see if this does what you want.

HTH

Peter

On Thu, Jul 14, 2011 at 4:12 PM, Benton, Paul
hpaul.bento...@imperial.ac.uk wrote:
 Sorry bad example. My data is undirected. It's a correlation matrix so 
 probably better to look at something like:

 foomat-cor(matrix(rnorm(100), ncol=10))
 foomat

 mine are pvalues from the correlation but same idea.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] recursive function - finding connections

2011-07-14 Thread Peter Langfelder

One more thing - for large data sets, the packages flashClust and
fastcluster provide much faster hierarchical clustering that (at least
for flashClust which I'm the maintainer of) give the exact same
results. Simply insert a

library(flashClust)

before you call the function and your code will run much faster.

Peter

On Thu, Jul 14, 2011 at 4:58 PM, Peter Langfelder
peter.langfel...@gmail.com wrote:
 Hi Paul,

 I assume you are using the argument cutoff to specify the p-value
 below which nodes are considered connected and above which they are
 not connected.

 I would use single linkage hierarchical clustering. If you have two
 groups of nodes and any two nodes between the groups are connected
 (i.e. have adjacency =1 or dissimilarity 0), then the groups have
 dissimilarity 0. If no two nodes between the two groups are connected,
 you will get dissimilarity 1. Thus you can use any tree cut height
 between 0 and 1 to get the clusters that correspond to connected. For
 large data you will need a large computer to hold your distance
 matrix, but you must have observed that already.

 subgraphs = function(mat, cut)
 {
  disconnected = matcut # Change the inequality if necessary
  tree = hclust(as.dist(disconnected), method = single)
  clusters = cutree(tree, h = 0.5)
  # Clusters is already the answer, but you want it in a different
 format, so we reformat it.
  nClusters = max(clusters)
  connectedList = list();
  for (c in 1:nClusters)
    connectedList[[c]] = which(clusters==c)
  connectedList
 }

 Try it and see if this does what you want.

 HTH

 Peter

 On Thu, Jul 14, 2011 at 4:12 PM, Benton, Paul
 hpaul.bento...@imperial.ac.uk wrote:
 Sorry bad example. My data is undirected. It's a correlation matrix so 
 probably better to look at something like:

 foomat-cor(matrix(rnorm(100), ncol=10))
 foomat

 mine are pvalues from the correlation but same idea.




-- 
Sent from my Linux computer. Way better than iPad :)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Font problem

2011-07-14 Thread Noah Silverman

Hi,

I'm running Redhat Linux (I believe it is Fedora 13 or 14)  With the latest 
version of R

Everything works nicely, except for the fonts on some plots.  I see small empty 
boxes instead of the number.  It seems as if this is only the case when the 
fonts are small.

I've installed all of the font packages recommended in the R documentation.

Any steps I can take to diagnose the missing/problem fonts and/or how to 
correct this?

Thanks!

--
Noah Silverman
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] (no subject)

2011-07-14 Thread David Winsemius



On Jul 14, 2011, at 6:15 PM, Tyler Rinker wrote:




Good Afternoon R Community,

I often work with very large data bases and want to search for  
select cases by a particular word or numeric value.  I created the  
following simple function to do just that.  It searchs a particular  
column for the phrase and returns a data frame with the rows that  
contain that phrase (for a particular column).


Search-function(term, dataframe, column.name, variation=.02,...){
   te-substitute(term)
  te-as.character(te)
  cn-substitute(column.name)
 cn-as.character(cn)
 HUNT-agrep(te,dataframe[,cn],ignore.case  
=TRUE,max.distance=variation,...)

   ### dataframe[c(HUNT),]


   HUNTL - (1:NROW(dataframe) %in% HUNT)


}



You would make life simpler by keeping your results as logical vectors  
the same length as your dataframe.


Then:

 logHunt -  sapply(dfrmname, Search, term=term, )
 indexL - rowSums(logHunt) =1
dfrmname[indexL, ]

Untested in absence of test data.

--
David.


I would like to modify this to search all columns for the phrase  
keep only the unique rows and return a data frame for any columns  
(minus repeated rows) that contain the phrase.


I assumed this would be an easy task for me using sapply() and  
unique() or union().  Because this argument takes more than one  
argument (vector{column} is not the only argument) I dont know how  
to set it up.  Could someone tell me how to apply this function to  
multiple columns and return one data frame with all the agrep  
matches (Ill figure out how to deal with duplicates after that;  
thats the easy part).


Thank you in advance for your help,
Tyler Rinker

PS if your idea is a for loop please explain it well or provide the  
code because I do not have a programming background and for loops  
are very difficult to wrap my head around.


Running windows 7
R version 2.14.0 (beta) 
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Splitting one column value into multiple rows

2011-07-14 Thread David Winsemius



On Jul 14, 2011, at 6:47 PM, Madana_Babu wrote:


Hi i have the data in the following format:

rent,100,1,common,674
pipe,200,0,usual,864
car,300,1,uncommon,392:jump,700,0,common,664
car,200,1,uncommon,864:snap,900,1,usual,746
stint,600,1,uncommon,257
pull,800,0,usual,594

where as i want the above 6 lines data into 8 lines as below  
(Spliting row 3

 4 at : and sending to a new row):


 Lines - readLines(textConnection(rent,100,1,common,674
+ pipe,200,0,usual,864
+ car,300,1,uncommon,392:jump,700,0,common,664
+ car,200,1,uncommon,864:snap,900,1,usual,746
+ stint,600,1,uncommon,257
+ pull,800,0,usual,594))
 closeAllConnections()
 newlines - strsplit(Lines, :)
 newlines2 - unlist(newlines)
 newlines2
[1] rent,100,1,common,674pipe,200,0,usual,864 car, 
300,1,uncommon,392   jump,700,0,common,664
[5] car,200,1,uncommon,864   snap,900,1,usual,746 stint, 
600,1,uncommon,257 pull,800,0,usual,594

 read.table(textConnection(newlines2), sep=,)
 V1  V2 V3   V4  V5
1  rent 100  1   common 674
2  pipe 200  0usual 864
3   car 300  1 uncommon 392
4  jump 700  0   common 664
5   car 200  1 uncommon 864
6  snap 900  1usual 746
7 stint 600  1 uncommon 257
8  pull 800  0usual 594



rent,100,1,common,674
pipe,200,0,usual,864
car,300,1,uncommon,392
jump,700,0,common,664
car,200,1,uncommon,864
snap,900,1,usual,746
stint,600,1,uncommon,257
pull,800,0,usual,594

Request any one who can help me getting this done.

Regards,
Madana

--
View this message in context: 
http://r.789695.n4.nabble.com/Splitting-one-column-value-into-multiple-rows-tp3668835p3668835.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Add coordinates at specific points...

2011-07-14 Thread Greg Snow

To add to what David and Duncan wrote: If you want to plot something at a point 
where the x coordinate is in user coordinates, but the y-coordinate is 
something like the middle of the plot, or 1/5th of the way from the top then 
you can use the grconvertY function along with the text function.

If you are going to have several labels and want to plot them so that they do 
not overlap then there are functions in the plotrix and TeachingDemos packages 
that can be used to spread labels.

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of David Winsemius
Sent: Wednesday, July 13, 2011 6:11 PM
To: JIA Pei
Cc: r-help@r-project.org
Subject: Re: [R] Add coordinates at specific points...


On Jul 13, 2011, at 7:59 PM, JIA Pei wrote:


 Thanks David, for your prompt reply.
 Ok... just avoid this mess.
 I'd love to emphasize the key point M(2.1, sin(2.1)), and just write  
 a point coordinate by the side of M .How to do that easily?

Have you worked through the examples on the help(text) page? This does  
not seem to be that difficult. Three essential arguments x,y and  
text. You are supposed to put for effort and show what code you have  
tried.

text(2.1, sin(2.1), labels=M=(2.1, sin(2.1) ))

You can play around with the positioning arguments, which IIRC are  
adj= and something else.

 By the way, if there is another point N(2.2, sin(2.2)), will M and N  
 conflict (overlap) each other?

I don't see how that could be avoided. (Last color wins.) We don't  
have 3d glasses for our screens yet. Transparenct colors are  
available, but you don't seem to be ready for that yet.


 Cheers
 Pei



 On Wed, Jul 13, 2011 at 4:49 PM, David Winsemius dwinsem...@comcast.net 
  wrote:

 On Jul 13, 2011, at 7:42 PM, JIA Pei wrote:




 Hi, Thanks David Winsemius:


 mtext works !! However, in R plot, mtext will automatically  
 overlap/overwrite the existing coordinates, which makes the  
 coordinates a messy.
 Refer to http://www.visionopen.com/Rplot.png , which is produced by  
 only 3 lines.

 dev.new(width = 640, height = 480)
 plot(sin, -pi, 2*pi)
 mtext(text=2.1000, side = 1, line = 1,  at = 2.1)

 In order to avoid the coordinates overlap, can I change the position  
 of 2.1000 from outside the box to inside the box? If possible, how  
 to?
 I tried outer = FALSE, nothing special happened ever !!

 You should use text() rather than mtext() if you are going to plot  
 within the user area.




 cheers
 Pei



 On Wed, Jul 13, 2011 at 1:41 PM, David Winsemius dwinsem...@comcast.net 
  wrote:

 On Jul 13, 2011, at 1:22 PM, JIA Pei wrote:

 Hi, all:

 I used two lines of very simple code to draw a sin curve.

 dev.new(width = 640, height = 480)
 plot(sin, -pi, 2*pi)

 First look at:

 ?mtext   # then try
 mtext(text=2.5000, side = 1, line = 1,  at = 2.5)



 Now, I added a specific line (red line in the picture at
 http://www.visionopen.com/Rplot.png) by using abline.
 However, I still love to add the X-coordinate 2.5 outside the
 rectangle box. How to do it in R?
 Right now, I'm using Windows Paint to add the characters 2.5  
 up to the
 plot drawn by R.


-- 

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] validate survival with val.surv

2011-07-14 Thread zhu yao

Dear R users:

I want to externally validate a model with val.surv.

Can I use only calculated survival (at 1 year) and actual survival?
Or I needed the survival function and actual survival.

Thanks

*Yao Zhu*
*Department of Urology
Fudan University Shanghai Cancer Center
Shanghai, China*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] validate survival with val.surv

2011-07-14 Thread Frank Harrell

The documentation for val.surv tried to be clear about that.

Note that val.surv is only for the case where the predictions were developed
on a separate dataset, i.e., that the validation is truly 'external'.
Frank

yz wrote:
 
 Dear R users:
 
 I want to externally validate a model with val.surv.
 
 Can I use only calculated survival (at 1 year) and actual survival?
 Or I needed the survival function and actual survival.
 
 Thanks
 
 *Yao Zhu*
 *Department of Urology
 Fudan University Shanghai Cancer Center
 Shanghai, China*
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University
--
View this message in context: 
http://r.789695.n4.nabble.com/validate-survival-with-val-surv-tp3669022p3669051.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] computing functions with Euler's number (e^n)

2011-07-14 Thread warmstron1

R 2.11.1 on Mac OS X.  I didn't see the Note.

--
View this message in context: 
http://r.789695.n4.nabble.com/computing-functions-with-Euler-s-number-e-n-tp3655205p3668849.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Non-negative least squares for sparse matrix

2011-07-14 Thread Erik Wright

Hello,

I am attempting to solve the least squares problem Ax = b in R, where A and b 
are known and x is unknown.  It is simple to solve for x using one of a variety 
of methods outlined here:
http://cran.r-project.org/web/packages/Matrix/vignettes/Comparisons.pdf

As far as I can tell, none of these methods will solve for x when A, x, and b 
are constrained to be non-negative (x  0).  Other packages, such as nnls, can 
solve the non-negative least squares problem, but do not work with very large 
sparse matrices.

The matrix A that I am using is 750,000 by 46,000 elements with 99% zeros, and 
matrix b is a dense 750,000 by 1 matrix.  Does an R function exist for solving 
the non-negative least squares problem with a sparse matrix?

Thanks!,
Erik


 sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] nnls_1.3   Matrix_0.999375-50 MASS_7.3-12   
[4] lattice_0.19-23   

loaded via a namespace (and not attached):
[1] grid_2.13.0  tools_2.13.0

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] how to order each element according to alphabet

2011-07-14 Thread onthetopo

Hi there,

  I have a large amino acid csv file like this:

input.txt:
P,LV,Q,Z
P,VL,Q,Z
P,ML,QL,Z

There is a problem with this file, since LV and VL are in fact the same
thing.   
How do I order each element according to alphabetical order so that the
desired output would look like:

output.txt:
P,LV,Q,Z
P,LV,Q,Z
P,LM,LQ,Z




--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-order-each-element-according-to-alphabet-tp3668997p3668997.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Export Unicode characters from R

2011-07-14 Thread Sverre Stausland

Dear helpers,

I am not able to export Unicode characters from R. Below is an example
where the Unicode character is correctly rendered as long as I am stay
within R. When I export it, the character appears only with its basic
code, and the same happens when I import it back into R . I'm using R
2.13.1 in Windows XP.

 funny.g - \u1E21

 funny.g
[1] ḡ

 data.frame (funny.g) - funny.g

 funny.g$funny.g
[1] ḡ
Levels: U+1E21

 write.table (funny.g, file = C:/~funny.g.txt, col.names = FALSE, row.names 
 = FALSE, quote = FALSE, fileEncoding = UTF-8)

 read.table (C:/~funny.g.txt, header = FALSE, encoding = UTF-8) - 
 input.funny.g

 input.funny.g$V1
[1] U+1E21
Levels: U+1E21


Best
Sverre

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Calculate Az (A sub z) with R?

2011-07-14 Thread dd2es

I am looking for (or interested in writing) a function that calculates Az, an
alternative measure of discriminability from SDT (alternative to d', Az). I
have written my own functions for d', A', Bd, and am aware of the 'sdtalt'
package, but I have yet to find a way to calculate Az, since it require the
phi operator.

For a relevant paper/discussion (and formula), please see Verde and
McMillian, 2006 (Measures of sensitivity based on a single hit rate and
false alarm rate: The accuracy, precision, and robustness of d', Az, and Az)

Any help on this would be greatly appreciated!

David Dobolyi
Graduate Student
Cognitive Psychology
University of Virginia

--
View this message in context: 
http://r.789695.n4.nabble.com/Calculate-Az-A-sub-z-with-R-tp3668893p3668893.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to order each element according to alphabet

2011-07-14 Thread David Winsemius



On Jul 14, 2011, at 9:18 PM, onthetopo wrote:


Hi there,

 I have a large amino acid csv file like this:

input.txt:
P,LV,Q,Z
P,VL,Q,Z
P,ML,QL,Z



Are you also asking how to read a comma separated file?

? read.csv   # and read more introductory material


There is a problem with this file, since LV and VL are in fact the  
same

thing.
How do I order each element according to alphabetical order so that  
the

desired output would look like:

output.txt:
P,LV,Q,Z
P,LV,Q,Z
P,LM,LQ,Z



That is not a reproducible example without input code:

Perhaps:

as.data.frame(lapply(input_dfrm, gsub, patt=LV, repl=VL))

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to order each element according to alphabet

2011-07-14 Thread onthetopo

Hi, 

  There are many more patterns than VL to LV.  In fact, too many to be
listed manually. 

For example ML should be ordered as LM, QL should be ordered as LQ.
The order is according to the alphabet. 


--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-order-each-element-according-to-alphabet-tp3668997p3669130.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to order each element according to alphabet

2011-07-14 Thread David Winsemius



On Jul 14, 2011, at 11:19 PM, onthetopo wrote:


Hi,

 There are many more patterns than VL to LV.  In fact, too many to be
listed manually.




For example ML should be ordered as LM, QL should be ordered as LQ.
The order is according to the alphabet.


A more complete (reproducible) answer would have been appreciated and  
note that local custom dictates that context is offered for ongoing  
threads. Nabble provides a mechanism for doing so.


lets2 - paste(LETTERS[sample(20, replace=TRUE)],
LETTERS[sample(20, replace=TRUE)],
sep=)
 lets2
 [1] IA EP TE IT PS DO RO EJ DR DD LM OF RJ  
OA JD QB AS TG MK IM


 sapply( lapply(
  strsplit(lets2, split=), sort),
  paste, collapse=)

 [1] AI EP ET IT PS DO OR EJ DR DD LM FO JR  
AO DJ BQ AS GT KM IM



--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-order-each-element-according-to-alphabet-tp3668997p3669130.html
Sent from the R help mailing list archive at Nabble.com.


Nabble is NOT rhelp. So PLEASE, PLEASE, PLEASE:


 do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Bar chart in ascending order for each level of X

2011-07-14 Thread Muhuri, Pradip (SAMHSA/CBHSQ)


Hello List,

The question is how to plot a bar chart in which bars are sorted  in ascending 
order for each level of X.  I would appreciate receiving your advice and help.


Thanks,

Pradip Muhuri

**

The following codes work when producing the chart in which bars are NOT sorted. 
 Please see the output.

* Data File
5.1 8.7 1.6
3.7 7.4 2.8
10.412.03.5
4.4 8.8 1.7
2.0 3.5 0.7
6.7 11.03.1
5.3 6.7 1.8

###
#source(C:/Documents and Settings/pradip.muhuri/My 
Documents/disorders_chart1.R) - Please ignore this line

#R Scripts for bar chart begin here

# Read drug data from tab-delimited data set
drug_data - read.table(C:/Documents and Settings/pradip.muhuri/My 
Documents/xdrug.dat, header=FALSE,
col.names=c(Age_1217, Age_1825, Age_26Plus),
row.names = c(White,Black,Native American/Alaska 
Native,Hawaiian/OPI,Asian, More than One Race, Hispanic),
sep=\t)

# Graph drug use disorder data with adjacent bars using rainbow colors
barplot(as.matrix(drug_data), main=Past-Year Illicit Drug Use Disorders by 
Race/Ethnicity, ylab= Past-Year Use Disorder Rate (%), beside=TRUE, 
col=rainbow(7))
legend(topright, c(White,Black,Native American/Alaska 
Native,Hawaiian/OPI,Asian, More than One Race, Hispanic), cex=0.6, 
bty=n, fill=rainbow(7));

Bar_Graph.pdf
Description: Bar_Graph.pdf
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to order each element according to alphabet

2011-07-14 Thread onthetopo

Thank you very much for your reply doctor.

I tried to apply your command to my table but couldn't
Would you please enlighten me on how to do this when 'lets2' is a 4X4 matrix
for example.


--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-order-each-element-according-to-alphabet-tp3668997p3669162.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to order each element according to alphabet

2011-07-14 Thread David Winsemius



On Jul 14, 2011, at 11:56 PM, onthetopo wrote:


Thank you very much for your reply doctor.

I tried to apply your command to my table but couldn't
Would you please enlighten me on how to do this when 'lets2' is a  
4X4 matrix

for example.

The message doesn't seem to be getting through. Let's see if higher  
volume works:


**
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html 
 **

AND
*
and provide commented, minimal, self-contained, reproducible code.
and provide commented, minimal, self-contained, reproducible code.
*

We have yet to see this 4X4 matrix that you are suggesting as an  
example.




--
View this message in context: 
http://r.789695.n4.nabble.com/how-to-order-each-element-according-to-alphabet-tp3668997p3669162.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

97 matches

Mail list logo