[R] browser() can not stop the execution

2009-11-15 Thread chao83

I need to use browser() to stop a while loop to input some value for the
loop. But the browser() just will not stop until the last line of the code.
Does anyone know the possible reason? I use ggobi in the loop, and open a
few ggobi windows before the browser(), will that be the reason?

Thanks A LOT!
-- 
View this message in context: 
http://old.nabble.com/browser%28%29-can-not-stop-the-execution-tp26356069p26356069.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] remove row if the column "date_abandoned" has a date in it

2009-11-15 Thread frenchcr

sorry David,

I'm really new to R (my first week) and appreciate your help. Also, I don't
always know what info to give people on the forum (although I'm starting to
catch the drift).

Here's what I get...

> summary(new_data4$date_abandoned)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   1601    1998    2001    1993    2004    2009  315732

> ls()
[1] "data"      "new_data"  "new_data2" "new_data3" "new_data4"
> small <- head(new_data4, 20)
> dump("small", 20)
Error in dump("small", 20) : cannot write to this connection
 

frenchcr





David Winsemius wrote:
 
 
 On Nov 14, 2009, at 5:24 PM, frenchcr wrote:
 


 I tried the following but it does the opposite of what i want:

 new_data5 <- subset(new_data4, date_abandoned > 0101)

 I want to remove the rows with dates and leave just the rows without  
 a date.

 This removes all the rows that dont have a date in the  
 date_abandoned column

 ...on a positive note, as i did this next...

 dim(new_data5)
 [1] 263  80

 i now know that i have 263 dates in that column :)

 I want to remove the 263 rows with dates and leave just the rows  
 without a
 date.
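A minimal sketch of the filtering asked for here, assuming (as the NA count in the summary() output suggests) that rows without a date hold NA in date_abandoned. The toy data frame and its values are made up for illustration; only the column name comes from the thread.

```r
# Hypothetical stand-in for new_data4: dates stored as YYYYMMDD numbers,
# NA where no abandonment date was recorded.
new_data4 <- data.frame(
  id = 1:5,
  date_abandoned = c(20091114, NA, 19980301, NA, NA)
)

# Keep only the rows WITHOUT a date.
no_date <- new_data4[is.na(new_data4$date_abandoned), ]

# subset() drops rows where the condition evaluates to NA, which is why a
# numeric comparison on this column keeps only the dated rows.
with_date <- subset(new_data4, date_abandoned > 0)

print(no_date)
print(nrow(with_date))
```

Testing for NA directly avoids having to pick a numeric cut-off at all.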
 
 Come on, frenchcr. Stop making us guess. Give us enough information
 to work with. You asked for something which I construed as saying you
 wanted dates greater than the first day of the year 101. You did
 not address this question.
 
 What do you get with str(new_data4) and  
 summary(new_data4$date_abandoned) ? In order to know what sort of  
 comparison to use we need to know what the data looks like.
 
 Even better if you offered the output from:
 
 small <- head(new_data4, 20)
 dump("small", 20)
 
 -- 
 David
 







 David Winsemius wrote:


 On Nov 14, 2009, at 1:21 PM, frenchcr wrote:



 I want to go through a column in data called

 Bad name for a data.frame. Fortunes, dog and all that.

 "date_abandoned", i.e. data["date_abandoned"], and remove all the rows
 that have numbers greater than 1,010,000.

 Are you doing archeology? Given what you say next I wondered what
 range you were really asking for.


 The dates are in the format 20091114 so I'm just going to treat them
 as numbers for clean up purposes.


 I know that i use subset but not sure how to proceed from there.

 subdata <- subset(data, date_abandoned > 0101)


 The problem with "> 101" is that your specified minimum point had
 an insufficient number of places to be in YYYYMMDD format.

 --

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT




 -- 
 View this message in context:
 http://old.nabble.com/remove-row-if-the-column-%22date_abandoned%22-has-a-date-in-it-tp26352457p26354446.html
 Sent from the R help mailing list archive at Nabble.com.

 
 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT
 
 
 

-- 
View this message in context: 
http://old.nabble.com/remove-row-if-the-column-%22date_abandoned%22-has-a-date-in-it-tp26352457p26355689.html
Sent from the R help mailing list archive at Nabble.com.



[R] R crashing

2009-11-15 Thread Dimitri Szerman
Hello,

This is what I am trying to do: I wrote a little function that takes
addresses (coordinates) as input, and returns the road distance between
every two points using Google Maps. Catch is, there are 2000 addresses, so I
have to get around 2x10^6 distances. On my first go, this is what I did:

#

getRoadDist = function(X,complete=F){
# X must be a matrix or data frame of coordinates; lat and lon
require(RCurl)
Y = apply( X, 1, function(x){ paste(x[1], ",", x[2], sep="") } )
grid = expand.grid(Y,Y,KEEP.OUT.ATTRS=F)
grid = apply(grid,1,function(x){paste(x[1],"&daddr=",x[2],sep="")})
grid = matrix(grid,ncol=length(Y),dimnames=list(names(Y),names(Y)))
grid[upper.tri(grid,T)] = NA

Distances = function(x){
if (is.na(x)) {
NA
}
else {
URL = getURL(paste("http://maps.google.com/maps?saddr=",x,sep=""))
# the two split markers are reconstructed; the archive stripped the HTML
y = strsplit(URL, "<div><b>")
y = strsplit(y[[1]][2], "&#160;mi</b>")[[1]][1]
as.numeric(y)
}
}

dists = sapply(grid,Distances)
dists = matrix(dists,ncol=ncol(grid),dimnames=dimnames(grid))
if (complete) {
diag(dists)=0
# mirror the lower triangle; t() keeps the (i,j) <-> (j,i) pairing
dists[upper.tri(dists)]=t(dists)[upper.tri(dists)]
dists
}
else {
dists
}
}

#

But R was crashing after 1 hour or so -- it either said "Reached total
allocation of 1535Mb" or became unresponsive. Then I tried to modify the
procedure to avoid big matrices. What I did was get the distances
and, one by one, append them to a file in the hope that this would use
less memory:

##

# X is the matrix of addresses, as before

require(RCurl)
Y = apply( X, 1, function(x){ paste(x[1], ",", x[2], sep="") } )
grid = expand.grid(Y,Y,KEEP.OUT.ATTRS=F)
grid = apply(grid,1,function(x){paste(x[1],"&daddr=",x[2],sep="")})
grid = matrix(grid,ncol=length(Y),dimnames=list(names(Y),names(Y)))
grid[upper.tri(grid,T)] = NA

Distances = function(x){
if (is.na(x)) {
NA
}
else {
URL = getURL(paste("http://maps.google.com/maps?saddr=",x,sep=""))
# the two split markers are reconstructed; the archive stripped the HTML
y = strsplit(URL, "<div><b>")
y = strsplit(y[[1]][2], "&#160;mi</b>")[[1]][1]
as.numeric(y)
}
}


grid2=grid[!is.na(grid)]
n = length(grid2)
for (i in 1:n) {
temp = Distances(grid2[i])
write.table(temp,"distances.csv",col.names=F,row.names=F,append=T)
}

##

But R still crashes after 2 hours (all I got was around 20.000 distances).
It doesn't really matter how long this will take me (I can always use more
than one machine), but I'd really like to get this done. Any thoughts?

Many many thanks,

Dimitri



Re: [R] R crashing

2009-11-15 Thread Barry Rowlingson
On Sun, Nov 15, 2009 at 11:10 AM, Dimitri Szerman dimitri...@gmail.com wrote:
 Hello,

 This is what I am trying to do: I wrote a little function that takes
 addresses (coordinates) as input, and returns the road distance between
 every two points using Google Maps. Catch is, there are 2000 addresses, so I
 have to get around 2x10^6 addresses. On my first go, this is what I did:

 I hope on your first go you didn't run it with 2000 addresses. You
did test it with 13 addresses first didn't you?

 Another idea is to replace your Distance function with a function
that returns runif(1). This will either make your code fail much much
quicker or identify that the problem is in the Distance function (some
memory leak there).
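The stub idea above can be sketched as follows. This is only an illustration under stated assumptions: the stand-in Distances() never touches the network, grid2 is a made-up vector of 13 query strings, and the output goes to a temporary file rather than distances.csv.

```r
# Stand-in for the real Distances(): no network access, just a random number.
Distances <- function(x) {
  if (is.na(x)) NA else runif(1)
}

# Hypothetical small input: 13 query strings for a quick dry run.
grid2 <- paste("query", 1:13)

out_file <- tempfile(fileext = ".csv")
for (i in seq_along(grid2)) {
  temp <- Distances(grid2[i])
  write.table(temp, out_file, col.names = FALSE, row.names = FALSE,
              append = TRUE)
}

# Read the file back to confirm how many results were stored.
result <- read.table(out_file)
print(nrow(result))
```

If the loop still exhausts memory with the stub in place, the problem is in the surrounding code; if it does not, the real lookup is the likelier suspect.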

 Also, you should check the return value from your google query - I've
seen google get a bit upset about repeated automated queries and
return a message saying "This looks like an automated query" and a
CAPTCHA test.


 grid2=grid[!is.na(grid)]
 n = length(grid2)
 for (i in 1:n) {
 temp = Distances(grid2[i])
 write.table(temp,"distances.csv",col.names=F,row.names=F,append=T)
 }

This won't work - you're overwriting distances.csv with the new value
of 'temp' every time. Another good reason to test with 13 values
before waiting and failing after six hours, and then having to hammer
google's map server again.

I'd write this as a simple loop, and dump all the apply stuff. And
rewrite Distance to be a function of two lat-longs:

Distance=function(lat1,lon1,lat2,lon2){

return(distance)
}

Then (untested):

Dmat = matrix(NA,nrow(X),nrow(X))

for(i in 2:nrow(X)){
 for(j in 1:i){
  d = Distance(X[i,1],X[i,2],X[j,1],X[j,2])
  Dmat[i,j]=d
}
}

 I'm not sure apply wins much here.

Barry



Re: [R] R crashing

2009-11-15 Thread Dimitri Szerman
2009/11/15 Barry Rowlingson b.rowling...@lancaster.ac.uk

 On Sun, Nov 15, 2009 at 11:10 AM, Dimitri Szerman dimitri...@gmail.com
 wrote:
  Hello,
 
  This is what I am trying to do: I wrote a little function that takes
  addresses (coordinates) as input, and returns the road distance between
  every two points using Google Maps. Catch is, there are 2000 addresses,
 so I
  have to get around 2x10^6 addresses. On my first go, this is what I did:

  I hope on your first go you didn't run it with 2000 addresses. You
 did test it with 13 addresses first didn't you?


I did, and it worked well.


  Another idea is to replace your Distance function with a function
 that returns runif(1). This will either make your code fail much much
 quicker or identify that the problem is in the Distance function (some
 memory leak there).

  Also, you should check the return value from your google query - I've
 seen google get a bit upset about repeated automated queries and
 return a message saying This looks like an automated query and a
 CAPTCHA test.


Mmmm, I wasn't aware of that.


  grid2=grid[!is.na(grid)]
  n = length(grid2)
  for (i in 1:n) {
  temp = Distances(grid2[i])
  write.table(temp,"distances.csv",col.names=F,row.names=F,append=T)
  }

 This won't work - you're overwriting distances.csv with the new value
 of 'temp' every time.


No, I am not, because append=TRUE. I did this, and I managed to get 20.000
distances or so.


 Another good reason to test with 13 values
 before waiting and failing after six hours, and then having to hammer
 google's map server again.

 I'd write this as a simple loop, and dump all the apply stuff. And
 rewrite Distance to be a function of two lat-longs:

 Distance=function(lat1,lon1,lat2,lon2){
 
 return(distance)
 }

 Then (untested):

 Dmat = matrix(NA,nrow(X),nrow(X))

 for(i in 2:nrow(X)){
  for(j in 1:i){
  d = Distance(X[i,1],X[i,2],X[j,1],X[j,2])
  Dmat[i,j]=d
 }
 }

  I'm not sure apply wins much here.


Thanks. The reason I didn't want to do something like that is because, in
the event of a crash, I'll lose everything that was done. That's why I
thought of appending the results often.


 Barry




Re: [R] browser() can not stop the execution

2009-11-15 Thread jim holtman
How are you executing your script?  Are you doing a cut/paste into the
command window?  On Windows using Tinn-R, this procedure will invoke
'browser' and it will then continue to read the rest of the script
since it is coming from the standard input.  The way around it is to
put the script in a file and then 'source' it.  Try this and report
back.
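A small sketch of the "put it in a file and source it" route. The script content, the temporary file and the variable names are made up for illustration; browser() is guarded with interactive() so the example also runs unattended.

```r
# Write a small script to a temporary file (hypothetical content).
script <- tempfile(fileext = ".R")
writeLines(c(
  "total <- 0",
  "i <- 0",
  "while (i < 3) {",
  "  i <- i + 1",
  "  # Pause for inspection only when a user is at the console.",
  "  if (interactive()) browser()",
  "  total <- total + i",
  "}"
), script)

# source() reads the code from the file, so whatever is typed at the
# browser prompt is not mixed up with the remaining lines of the script.
source(script)
print(total)
```

When the same lines are pasted into the console instead, the lines following browser() are consumed as input to the browser prompt, which is the behaviour described above.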

On Sun, Nov 15, 2009 at 1:22 AM, chao83 chaohan1...@yahoo.com wrote:

 I need to use browser() to stop a while loop to input some value for the
 loop. But the browser() just will not stop until the last line of the code.
 Does anyone know the possible reason? I use ggobi in the loop, and open a
 few ggobi windows before the browser(), will that be the reason?

 Thanks A LOT!
 --
 View this message in context: 
 http://old.nabble.com/browser%28%29-can-not-stop-the-execution-tp26356069p26356069.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] naive collinear weighted linear regression

2009-11-15 Thread Mauricio O Calvao
Peter Dalgaard p.dalgaard at biostat.ku.dk writes:
 
 The point is that R (as well as almost all other mainstream statistical 
 software) assumes that a weight means that the variance of the 
 corresponding observation is the general variance divided by the weight 
 factor. The general variance is still determined from the residuals, and 
 if they are zero to machine precision, well, there you go. I suspect you 
   get closer to the mark with glm, which allows you to assume that the 
 dispersion is known:
 
   summary(glm(y~x,family=gaussian),dispersion=0.3^2)
 
 or
 
   summary(glm(y~x,family=gaussian,weights=1/error^2),dispersion=1)
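A hedged sketch of the difference described above. With lm(), the weights only fix the relative variances, and the overall scale is estimated from the residuals, which are zero here. With dispersion=1 the scale is taken as known, so each observation has variance 1/weight = error^2. The predictor x = 1:4 is an assumption (the thread only shows y and error); the closed-form check at the end is added for comparison and is not from the thread.

```r
# Points exactly on y = 2x, each with a stated error of 0.3.
x <- 1:4                     # assumed predictor values
y <- c(2, 4, 6, 8)
error <- rep(0.3, 4)

# lm(): residual variance is estimated from the (zero) residuals, so the
# standard errors collapse to roughly zero. R may warn about a perfect fit.
se_lm <- suppressWarnings(
  coef(summary(lm(y ~ x, weights = 1 / error^2)))[, "Std. Error"]
)

# glm() with known dispersion: the stated errors set the scale directly.
fit_glm <- glm(y ~ x, family = gaussian, weights = 1 / error^2)
se_glm <- coef(summary(fit_glm, dispersion = 1))[, "Std. Error"]

# Closed form for comparison: sqrt(diag((X' W X)^-1)), W = diag(1/error^2).
X <- cbind(1, x)
se_formula <- sqrt(diag(solve(t(X) %*% diag(1 / error^2) %*% X)))

print(se_lm)
print(se_glm)
print(se_formula)
```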
 

Excellent; either of these commands provides Std. Errors which now coincide with my
naive expectation: though the data fall perfectly on a straight line, since they
have some associated uncertainties (only) in the response variables
(homoskedasticity), the estimated coefficients should have some kind of
nonvanishing uncertainties as well, should they not?

Now, forgive me, but I did not get the explanation for the distinct meanings of
Std. Error when calling simply summary(lm(y~x,weights=1/error^2)), which I had
done before, and your suggested calls; could you rephrase and dwell a little bit
more upon this point? What exactly does the option dispersion mean?

Also, could you suggest some specific reference for me to read about this? I
have your excellent book Introductory Statistics with R, 1st edition, but was
not able (perhaps I have missed some point) to find this kind of distinction
there... Is this topic specifically what statisticians call
generalized linear models (glm) as opposed to (ordinary) linear models? If so,
which good references could you please suggest? I thought of the following
books and would feel much obliged should you give me your impressions about
them, if any, or about any other relevant references at all:

1) Faraway, Linear models with R
2) Faraway, Extending the linear model with R: generalized linear...
3) Fox, An R and S-Plus companion..
4) Uusipaikka, Confidence intervals in generalized linear regression models

Thank you very much!!



Re: [R] R crashing

2009-11-15 Thread Barry Rowlingson
On Sun, Nov 15, 2009 at 11:57 AM, Dimitri Szerman dimitri...@gmail.com wrote:


 Thanks. The reason I didn't want to do something like that is because, in
 the event of a crash, I'll loose everything that was done. That's why I
 though of appending the results often.

 Oops yes, I missed the 'append=TRUE' flag. That's a good idea.

 Last time I did something similar to this I used a relational
database for saving. I created a table of all the i,j pairs with
columns i,j,distance and 'ok'. 'ok' was set to False initially. Then
I'd query the db for a row with 'ok=False', and go about getting the
distance. If I got a good distance back I set 'ok=True' and never
bothered getting that again.

  This was in Python with SQLite as the database engine, but you can
do something similar in R. With a distributed database you could
easily split the queries between as many servers as you can get your
hands on.
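The bookkeeping described above can be sketched in base R, with a CSV file standing in for the database table. Everything here is an assumption for illustration: get_distance() is a fake lookup that sometimes fails, n is tiny, and the checkpoint interval of 5 is arbitrary.

```r
# Hypothetical stand-in for the real lookup; returns NA to mimic a failure.
get_distance <- function(i, j) if ((i + j) %% 5 == 0) NA else abs(i - j)

status_file <- tempfile(fileext = ".csv")
n <- 6

# Build the table of pairs once; on a restart, reload it instead.
if (file.exists(status_file)) {
  pairs <- read.csv(status_file)
} else {
  pairs <- subset(expand.grid(i = 1:n, j = 1:n), i > j)
  pairs$distance <- NA_real_
  pairs$ok <- FALSE
}

todo <- which(!pairs$ok)
for (k in seq_along(todo)) {
  r <- todo[k]
  d <- get_distance(pairs$i[r], pairs$j[r])
  if (!is.na(d)) {
    pairs$distance[r] <- d
    pairs$ok[r] <- TRUE
  }
  # Checkpoint every 5 pairs and at the very end.
  if (k %% 5 == 0 || k == length(todo)) {
    write.csv(pairs, status_file, row.names = FALSE)
  }
}

print(table(pairs$ok))
```

Rerunning the script only retries the rows still marked ok = FALSE, so a crash costs at most the work done since the last checkpoint.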

 Barry



[R] Presentation of data in Graphical format

2009-11-15 Thread Sunita22

Hello

My data contains following columns:

1st column: Posts (GM, Secretary, AM, Office Boy)   
2nd Column: Dept (Finance, HR, ...)
3rd column: Tasks (Open the door, Fix an appointment, Fill the register,
etc.) depending on the post
4th column: Average Time required to do the task

So the sample data would look like
Posts        Dept      Task                 Average time
Office Boy   HR        Open the door        00:00:09
Secretary    Finance   Fix an appointment   00.00.30
...

I am trying to represent this data in Graphical format, I tried graphs like
Mosaic plot, etc. But it does not represent the data correctly. My aim is to
check the amount of time and its variability for groups of tasks
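One possible starting point, sketched in base R under stated assumptions: the data frame below is invented in the layout described above (column names are hypothetical), times are stored as text, and both ":" and "." occur as separators. It converts the times to seconds, then summarises and plots their spread per task.

```r
# Hypothetical sample in the described layout; times as HH:MM:SS text.
tasks <- data.frame(
  Post = c("Office Boy", "Office Boy", "Secretary", "Secretary", "Secretary"),
  Dept = c("HR", "HR", "Finance", "Finance", "HR"),
  Task = c("Open the door", "Fill the register", "Fix an appointment",
           "Fill the register", "Fix an appointment"),
  AvgTime = c("00:00:09", "00:01:10", "00.00.30", "00:00:55", "00:00:40"),
  stringsAsFactors = FALSE
)

# Normalise the separator, then convert hours/minutes/seconds to seconds.
clean <- gsub(".", ":", tasks$AvgTime, fixed = TRUE)
parts <- do.call(rbind, lapply(strsplit(clean, ":", fixed = TRUE), as.numeric))
tasks$Seconds <- parts[, 1] * 3600 + parts[, 2] * 60 + parts[, 3]

# Spread of time per task: a numeric summary and a boxplot.
spread <- aggregate(Seconds ~ Task, data = tasks,
                    FUN = function(s) c(mean = mean(s), sd = sd(s)))
print(spread)
boxplot(Seconds ~ Task, data = tasks, ylab = "Average time (seconds)")
```

Grouping by Post or Dept instead only requires changing the right-hand side of the two formulas.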

Thank you in advance
Regards
Sunita

-- 
View this message in context: 
http://old.nabble.com/Presentation-of-data-in-Graphical-format-tp26358857p26358857.html
Sent from the R help mailing list archive at Nabble.com.



[R] lme model specification

2009-11-15 Thread Green, Gerwyn (greeng6)
Dear all

this is a question of model specification in lme for which I'd
greatly appreciate some guidance.

Suppose I have data in long format

gene  treatment  rep  Y
1     1          1    4.32
1     1          2    4.67
1     1          3    5.09
...
1     4          1    3.67
1     4          2    4.64
1     4          3    4.87
...
2000  1          1    5.12
2000  1          2    2.87
2000  1          3    7.23
...
2000  4          1    2.48
2000  4          2    3.93
2000  4          3    5.17


that is, I have data Y_{gtr} for g (gene) = 1,...,2000, t (treatment) =
1,...,4 and r (replicate) = 1,...,3

I would like to fit the following linear mixed model using lme

Y_{gtr} = \mu_{g} +  W_{gt} + Z_{gtr}

where the \mu_{g}'s are fixed gene effects, W_{gt} ~ N(0, \sigma^{2}) are
gene-treatment interactions, and Z_{gtr} ~ N(0,\tau^{2}) are residual errors. (Yes,
I know I'm specifying an interaction between gene and treatment without
specifying a treatment main effect! There is good reason for this.)


I know that specifying

model.1 <- lme(Y ~ -1 + factor(gene), data=data, random= ~1|gene/treatment)

fits Y_{gtr} = \mu_{g} +  U_{g} + W_{gt} + Z_{gtr}

with \mu_{g}, W_{gt}  and Z_{gtr} as previous and U_{g} ~ N(0,\gamma^{2}), but 
I do NOT want to specify a random gene effect. I have scoured Bates and 
Pinheiro without coming across a parallel example. 
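One way this might be approached, offered as a sketch rather than a definitive answer: use a single grouping factor with one level per gene-treatment combination, so that the only random effect is W_{gt}. The factor name gt, the simulated data, and the reduced size (20 genes instead of 2000) are assumptions made for illustration; the nlme package is assumed to be installed.

```r
library(nlme)

set.seed(42)
# Hypothetical small data set: 20 genes, 4 treatments, 3 replicates.
d <- expand.grid(rep = 1:3, treatment = 1:4, gene = 1:20)
d$gene <- factor(d$gene)
# One grouping level per gene-by-treatment combination.
d$gt <- interaction(d$gene, d$treatment, drop = TRUE)

mu <- rnorm(20, mean = 5)            # fixed gene effects
W  <- rnorm(nlevels(d$gt), sd = 1)   # gene-treatment effects
d$Y <- mu[as.integer(d$gene)] + W[as.integer(d$gt)] +
  rnorm(nrow(d), sd = 0.5)

# Fixed gene effects, random intercept per gene-treatment cell,
# and no separate random gene effect.
model.2 <- lme(Y ~ -1 + gene, data = d, random = ~ 1 | gt)
print(VarCorr(model.2))
```

With 2000 genes the fixed-effects part has 2000 columns, so fitting time and memory use would need checking on the real data.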


Any help would be greatly appreciated

Best


Gerwyn Green
School of Health and Medicine
Lancaster University



Re: [R] naive collinear weighted linear regression

2009-11-15 Thread Mauricio O Calvao
David Winsemius dwinsemius at comcast.net writes:


 
 It's really not that difficult to get the variance covariance matrix.  
 What is not so clear is why you think differential weighting of a set  
 that has a perfect fit should give meaningfully different results than  
 a fit that has no weights.

Again, David, what I have in mind is: since there are errors or uncertainties in
the response variables (despite the perfect collinearity of the data), which I
assume are Gaussian, if I make a large enough number of simulations of four
response values, there will undoubtedly be a dispersion in the best fit
intercept and slope obtained from a usual unweighted least squares procedure,
right? Then, if I calculate the arithmetic mean of these simulated intercept and
slope, I would certainly check that they would be 0 and 2, respectively.
However, and THAT IS THE POINT, there will also be a standard deviation
associated with each one of these two coefficients, right??, and that is what I
would assign as the measure of uncertainty in the estimation of the
coefficients. This is not, as Dalgaard has called attention to, what the simple
command summary(lm(y~x,weights=1/err^2)) provides in its Std. Error. However, as
Dalgaard also recalled, the command
summary(glm(y~x,family=gaussian,weights=1/err^2),dispersion=1) does provide Std.
Errors in the coefficients which look plausible (at least to me) and, at any
rate, which do coincide with results from other packages (Numerical Recipes,
ROOT and possibly GSL...)
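The simulation argument above can be sketched directly. The assumptions are labelled in the code: x = 1:4, a true line y = 2x, Gaussian errors with standard deviation 0.3, and 2000 replications with a fixed seed.

```r
set.seed(1)
x <- 1:4        # assumed predictor values
err <- 0.3      # stated uncertainty of each response
n_sim <- 2000   # number of simulated data sets

# Each replication perturbs the responses and refits an unweighted line.
coefs <- t(replicate(n_sim, {
  y_sim <- 2 * x + rnorm(length(x), sd = err)
  coef(lm(y_sim ~ x))
}))

sim_mean <- colMeans(coefs)       # should be near 0 and 2
sim_sd <- apply(coefs, 2, sd)     # spread of intercept and slope
print(sim_mean)
print(sim_sd)
```

The spread of the simulated coefficients can then be compared with the Std. Errors reported by the glm call with dispersion=1.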

 
 ?lm
 ?vcov
 
 > y <- c(2,4,6,8) # response vect
 > fit_mod <- lm(y~x,weights=1/error^2)
 Error in eval(expr, envir, enclos) : object 'error' not found
 > error <- c(0.3,0.3,0.3,0.3)
 > fit_mod <- lm(y~x,weights=1/error^2)
 > vcov(fit_mod)
               (Intercept)             x
 (Intercept)  2.396165e-30 -7.987217e-31
 x           -7.987217e-31  3.194887e-31

 Numerically those are effectively zero.

 > fit_mod <- lm(y~x)
 > vcov(fit_mod)
             (Intercept) x
 (Intercept)           0 0
 x                     0 0




Re: [R] remove row if the column "date_abandoned" has a date in it

2009-11-15 Thread David Winsemius


On Nov 14, 2009, at 8:43 PM, frenchcr wrote:



sorry David,

I'm really new to R (my first week) and appreciate your help. Also, I don't
always know what info to give people on the forum (although I'm starting to
catch the drift).

Here's what I get...

> summary(new_data4$date_abandoned)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
   1601    1998    2001    1993    2004    2009  315732


So new_data4$date_abandoned is not of type Date and is instead a
character vector.


If you are resisting turning it into a date and want to work with  
characters, you can, you just need to deal somehow with the items that  
are not 8 characters wide. What does 315732 represent? How were we  
supposed to interpret the starting date you gave of 0101?


> nchar(1010000)
[1] 7

What does table(nchar(new_data4$date_abandoned)) give you?



> ls()
[1] "data"      "new_data"  "new_data2" "new_data3" "new_data4"
> small <- head(new_data4, 20)
> dump("small", 20)
Error in dump("small", 20) : cannot write to this connection



Well, sorry, I meant to type dump("small", stdout())   ... As per the
Posting Guide.


--
David.
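For reference, a minimal sketch of the corrected call. The data frame is a made-up stand-in for new_data4; note that dump() takes the object name as a character string.

```r
# Hypothetical stand-in for new_data4.
new_data4 <- data.frame(id = 1:30,
                        date_abandoned = c(20091114, rep(NA, 29)))

small <- head(new_data4, 20)

# stdout() prints the R code that recreates 'small' to the console
# instead of writing it to a file, so it can be pasted into a message.
dump("small", file = stdout())
```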




David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] update.lm question

2009-11-15 Thread Karsten Weinert
Hello,
at the Rgui command line I can easily remove a term from a fitted lm
object, like

fit <- lm(y~x1+x2+x3, data=myData)
update(fit, .~.-x1)

However, I would like to do this in a function with term given as string, like

removeTerm <- function(linModel, termName) { ??? }
removeTerm(fit, "x1")

but I can not fill the ???. I already tried

removeTerm <- function(linModel, termName) { update(linModel, .~. - termName) },
removeTerm <- function(linModel, termName) { update(linModel, .~. -
as.name(termName)) },
removeTerm <- function(linModel, termName) { update(linModel, .~. -
eval(termName)) },
removeTerm <- function(linModel, termName) { update(linModel, .~. -
eval.parent(termName)) },
removeTerm <- function(linModel, termName) { update(linModel, .~. -
get(termName)) },

but these attempts produce error messages.

Can you advise me here?

Kind regards,
Karsten



Re: [R] Error running lda example from Help File (MASS library )

2009-11-15 Thread David Winsemius


On Nov 14, 2009, at 7:02 PM, Greg Riddick wrote:


Hello all,

I'm trying to run lda() from the MASS library but the Help example generates
the following error:

#Code from example in lda Help file
no code included
# Resulting Error

Error in if (targetlist[i] == stringname) { : argument is of length zero


Cannot reproduce on setup possibly similar to yours:

> sessionInfo()
R version 2.10.0 Patched (2009-10-29 r50258)
x86_64-apple-darwin9.8.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] MASS_7.3-3      plyr_0.1.9      survey_3.18     Design_2.3-0    Hmisc_3.7-0
[6] survival_2.35-7 lattice_0.17-26

loaded via a namespace (and not attached):
[1] cluster_1.12.1 grid_2.10.0    tools_2.10.0




My Current R Installation:
MacOSX: 10.5.8
R: 2.10.0


What is your sessionInfo(), as requested in the Posting Guide?



--
Gregory Riddick, PhD.
CRTA Research Fellow

National Institutes of Health
National Cancer Institute, Neuro-Oncology Branch
http://home.ccr.cancer.gov/nob/

37 Convent Drive
Building 37, Room 1142
Bethesda, MD 20892-8202

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] update.lm question

2009-11-15 Thread David Winsemius


You need to review:

?update
?update.formula
?as.formula

The way should be clear at that point. If not, then include a  
reproducible data example to work with.
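A minimal sketch of where those help pages lead, assuming the term is passed as a character string: build the update formula as text and convert it with as.formula(). The data frame is simulated for illustration only.

```r
removeTerm <- function(linModel, termName) {
  # Build the update formula ". ~ . - <term>" from the string.
  update(linModel, as.formula(paste(". ~ . -", termName)))
}

# Hypothetical data for illustration.
set.seed(1)
myData <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20))
myData$y <- with(myData, 1 + x1 + 2 * x2 + rnorm(20))

fit <- lm(y ~ x1 + x2 + x3, data = myData)
fit2 <- removeTerm(fit, "x1")
print(attr(terms(fit2), "term.labels"))
```

The attempts with as.name(), eval() or get() fail because update() treats whatever appears inside the formula as a literal term rather than evaluating it first.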


--
David

--
David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] where is a value in my list

2009-11-15 Thread Grzes

I have got a list:

lista=list() 
a=c(2,4,5,5,6) 
b=c(3,5,4,2) 
c=c(1,1,1,8) 
lista[[1]]=a 
lista[[2]]=b 
lista[[3]]=c

  lista 
[[1]] 
[1] 2 4 5 5 6 

[[2]] 
[1] 3 5 4 2 

[[3]] 
[1] 1 1 1 8 

I would like to know where the number 5 is (which line)?

For example I have got a loop:

  k= vector(mode = "integer", length = 3)

 for(i in 1:3)
{
 for (j in 1:length(lista[[i]])){
if ((lista[[i]][j])==5) k[i]= i
}
}

This loop is wrong but I would like to get in my vector k something like this:

k = lista[[1]][1], lista[[2]][1] ...or sth similar
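A short sketch of one way to get this without explicit loops. It reuses the list from the message; the names k and pos for the results are chosen here for illustration.

```r
lista <- list(c(2, 4, 5, 5, 6), c(3, 5, 4, 2), c(1, 1, 1, 8))

# Which list elements contain a 5?
k <- which(sapply(lista, function(v) 5 %in% v))

# Positions of 5 inside each element (integer(0) where it is absent).
pos <- lapply(lista, function(v) which(v == 5))

print(k)
print(pos)
```

Here k holds the indices of the list elements that contain the value, and pos holds where the value sits inside each element.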
-- 
View this message in context: 
http://old.nabble.com/where-is-a-value-in-my-list-tp26359843p26359843.html
Sent from the R help mailing list archive at Nabble.com.



[R] Relase positive with log and zero of negative with 0

2009-11-15 Thread rkevinburton
This is a very simple question but I couldn't form a site search query that 
would return a reasonable result set.

Say I have a vector:

x <- c(0,2,3,4,5,-1,-2)

I want to replace all of the values in 'x' with the log of x. Naturally this 
runs into problems since some of the values are negative or zero. So how can I 
replace all of the positive elements of x with the log(x) and the rest with 
zero?

Thank you.

Kevin



Re: [R] Presentation of data in Graphical format

2009-11-15 Thread milton ruser
Google R graph gallery
Google R ggplot2
Google R lattice

and good luck

milton
On Sun, Nov 15, 2009 at 7:48 AM, Sunita22 sunita...@gmail.com wrote:


 Hello

 My data contains following columns:

 1st column: Posts (GM, Secretary, AM, Office Boy)
 2nd Column: Dept (Finance, HR, ...)
 3rd column: Tasks (Open the door, Fix an appointment, Fill the register,
 etc.) depending on the post
 4th column: Average Time required to do the task

 So the sample data would look like
 Posts       Dept     Task                Average time
 Office Boy  HR       Open the door       00:00:09
 Secretary   Finance  Fix an appointment  00.00.30
 ...

 I am trying to represent this data in graphical format. I tried graphs like
 mosaic plots, etc., but they do not represent the data correctly. My aim is to
 check the amount of time and its variability for groups of tasks

 Thank you in advance
 Regards
 Sunita

 --
 View this message in context:
 http://old.nabble.com/Presentation-of-data-in-Graphical-format-tp26358857p26358857.html
 Sent from the R help mailing list archive at Nabble.com.





Re: [R] where is a value in my list

2009-11-15 Thread David Winsemius


On Nov 15, 2009, at 10:01 AM, Grzes wrote:



I have got a list:

lista=list()
a=c(2,4,5,5,6)
b=c(3,5,4,2)
c=c(1,1,1,8)
lista[[1]]=a
lista[[2]]=b
lista[[3]]=c


lista

[[1]]
[1] 2 4 5 5 6

[[2]]
[1] 3 5 4 2

[[3]]
[1] 1 1 1 8

I would like to know where is number 5 (which line)?

For example I have got a loop:

 k = vector(mode = "integer", length = 3)

for(i in 1:3)
{
for (j in 1:length(lista[[i]])){
if ((lista[[i]][j])==5   k[i]= [i])
}
}

This loop is wrong but I would like to get in my vector k sth like  
this:


k = lista[[1]][1], lista[[2]][1] ...or sth similar


I am a bit confused, since clearly lista[[1]][1] does _not_ == 5. It's  
also unclear what type of output you expect ... character, list,  
numeric?


See if these take you any further to your vaguely expressed goal:

> lapply(lista, "%in%", 5)
[[1]]
[1] FALSE FALSE  TRUE  TRUE FALSE

[[2]]
[1] FALSE  TRUE FALSE FALSE

[[3]]
[1] FALSE FALSE FALSE FALSE

> lapply(lista, function(x) which(x == 5) )
[[1]]
[1] 3 4

[[2]]
[1] 2

[[3]]
integer(0)


--



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Relase positive with log and zero of negative with 0

2009-11-15 Thread Berwin A Turlach
G'day Kevin,

On Sun, 15 Nov 2009 7:18:18 -0800
rkevinbur...@charter.net wrote:

 This is a very simple question but I couldn't form a site search
 query that would return a reasonable result set.
 
 Say I have a vector:
 
 x <- c(0,2,3,4,5,-1,-2)
 
 I want to replace all of the values in 'x' with the log of x.
 Naturally this runs into problems since some of the values are
 negative or zero. So how can I replace all of the positive elements
 of x with the log(x) and the rest with zero?

If you do not mind a warning message:

R> x <- c(0,2,3,4,5,-1,-2)
R> x <- ifelse(x <= 0, 0, log(x))
Warning message:
In log(x) : NaNs produced
R> x
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 0.0000000 0.0000000

If you do mind, then:

R> x <- c(0,2,3,4,5,-1,-2)
R> ind <- x > 0
R> x[!ind] <- 0
R> x[ind] <- log(x[ind])
R> x
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 0.0000000 0.0000000

HTH.

Cheers,

Berwin

== Full address ==
Berwin A Turlach                      Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019)            +61 (8) 6488 3383 (self)
The University of Western Australia   FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009                       e-mail: ber...@maths.uwa.edu.au
Australia                             http://www.maths.uwa.edu.au/~berwin



Re: [R] Relase positive with log and zero of negative with 0

2009-11-15 Thread David Winsemius


On Nov 15, 2009, at 10:18 AM, rkevinbur...@charter.net wrote:

This is a very simple question but I couldn't form a site search  
query that would return a reasonable result set.


Say I have a vector:

x <- c(0,2,3,4,5,-1,-2)

I want to replace all of the values in 'x' with the log of x.  
Naturally this runs into problems since some of the values are  
negative or zero. So how can I replace all of the positive elements  
of x with the log(x) and the rest with zero?


> x <- c(0,2,3,4,5,-1,-2)
> x <- ifelse(x > 0, log(x), 0)
Warning message:
In log(x) : NaNs produced
> x
[1] 0.0000000 0.6931472 1.0986123 1.3862944 1.6094379 0.0000000 0.0000000


The warning is harmless as you can see, but if you wanted to avoid it,  
then:


> x[x <= 0] <- 0; x[x > 0] <- log(x[x > 0])

In the second command, the same logical test has to appear on both sides  
of the assignment so that the positions being replaced and the  
replacement values stay in step.
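
The warning-free variant can also be wrapped in a small helper. This is only a sketch: the name log_or_zero is made up for illustration, and mapping NA to zero is an assumption rather than anything requested in the thread.

```r
# Sketch: log of the positive entries, zero everywhere else.
# log_or_zero is a hypothetical helper name, not a base R function.
log_or_zero <- function(x) {
  out <- numeric(length(x))      # starts as all zeros
  pos <- !is.na(x) & x > 0       # positions where log() is defined (NA counts as "not positive")
  out[pos] <- log(x[pos])        # log() only ever sees positive values, so no warning
  out
}

x <- c(0, 2, 3, 4, 5, -1, -2)
y <- log_or_zero(x)
y
```

Because log() is applied only to the positive subset, no NaN is ever produced, unlike the ifelse() form, which evaluates log(x) on the whole vector first.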



--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] package tm fails to remove the with remove stopwords

2009-11-15 Thread Ingo Feinerer
On Thu, Nov 12, 2009 at 11:29:50AM -0500, Mark Kimpel wrote:
 I am using code that previously worked to remove stopwords using package tm.

Thanks for reporting. This is a bug in the removeWords() function in
tm version 0.5-1 available from CRAN:

 require(tm)
 myDocument <- c("the rain in Spain", "falls mainly on the plain",
                 "jack and jill ran up the hill", "to fetch a pail of water")
 text.corp <- Corpus(VectorSource(myDocument))
 #
 text.corp <- tm_map(text.corp, stripWhitespace)
 text.corp <- tm_map(text.corp, removeNumbers)
 text.corp <- tm_map(text.corp, removePunctuation)
 ## text.corp <- tm_map(text.corp, stemDocument)
 text.corp <- tm_map(text.corp, removeWords, c("the", stopwords("english")))
 dtm <- DocumentTermMatrix(text.corp)
 dtm
 dtm.mat <- as.matrix(dtm)
 dtm.mat
 
  dtm.mat
      Terms
 Docs falls fetch hill jack jill mainly pail plain rain ran spain the water
    1     0     0    0    0    0      0    0     0    1   0     1   1     0
    2     1     0    0    0    0      1    0     1    0   0     0   0     0
    3     0     0    1    1    1      0    0     0    0   1     0   0     0
    4     0     1    0    0    0      0    1     0    0   0     0   0     1

The function removeWords() fails to remove patterns at the beginning or at
the end of a line.

This bug is fixed in the latest development version on R-Forge, and
the fix will be included in the next CRAN release.
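
For readers who want to see why line starts and ends are the tricky case, here is a minimal base R sketch of word removal. It is not the tm implementation and makes no claim about how the fix was done; remove_words is a hypothetical helper, and it assumes the words contain no regular-expression metacharacters.

```r
# Sketch only: this is NOT the tm implementation of removeWords().
# remove_words is a hypothetical helper built on base R regular expressions.
remove_words <- function(x, words) {
  # \\b matches at word boundaries, which include the start and end of the string,
  # so a word is removed whether or not it is surrounded by spaces
  pattern <- sprintf("\\b(%s)\\b", paste(words, collapse = "|"))
  stripped <- gsub(pattern, "", x, perl = TRUE)
  # collapse the gaps left behind and trim both ends
  trimws(gsub("\\s+", " ", stripped))
}

docs <- c("the rain in Spain", "falls mainly on the plain")
cleaned <- remove_words(docs, c("the", "in", "on"))
cleaned
```

A pattern that requires a space on both sides of the word would miss the leading "the" in the first document, which is the kind of symptom visible in the term matrix above.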

Please see
https://r-forge.r-project.org/plugins/scmsvn/viewcvs.php/pkg/inst/NEWS?root=tm&view=markup
for a list of all bug fixes and changes between each tm version.

Best regards, Ingo Feinerer

-- 
Ingo Feinerer
Vienna University of Technology
http://www.dbai.tuwien.ac.at/staff/feinerer



Re: [R] sum(row1==y) if row2=x

2009-11-15 Thread Knut Krueger

Thanks to all
R is fantastic but ... not easy to know all possible terms ;-)

Knut



Re: [R] update.lm question

2009-11-15 Thread Karsten Weinert
Hello David,

of course I read those help pages and searched the r-help archive.

I agree that the way should be clear, but I am stuck and looking for
help. Here is reproducible code

removeTerm1 <- function(linModel, termName) { update(linModel, .~. - termName) }
removeTerm2 <- function(linModel, termName) { update(linModel, .~. - as.name(termName)) }
removeTerm3 <- function(linModel, termName) { update(linModel, .~. - eval(termName)) }
removeTerm4 <- function(linModel, termName) { update(linModel, .~. - eval.parent(termName)) }
removeTerm5 <- function(linModel, termName) { update(linModel, .~. - get(termName)) }

myData <- data.frame(x1=rnorm(10), x2=rnorm(10), x3=rnorm(10), y=rnorm(10))
fit <- lm(y~x1+x2+x3, data=myData)

# all this does not work, as I am expecting the function to return a
# lm object with formula y~x2+x3
removeTerm1(fit, "x1")
removeTerm2(fit, "x1")
removeTerm3(fit, "x1")
removeTerm4(fit, "x1")
removeTerm5(fit, "x1")


Any help appreciated,
kind regards,
Karsten Weinert



2009/11/15 David Winsemius dwinsem...@comcast.net:

 You need to review:

 ?update
 ?update.formula
 ?as.formula

 The way should be clear at that point. If not, then include a reproducible
 data example to work with.

 --
 David



Re: [R] update.lm question

2009-11-15 Thread Duncan Murdoch

On 15/11/2009 9:23 AM, Karsten Weinert wrote:

Hello,
at the Rgui command line I can easily remove a term from a fitted lm
object, like

fit <- lm(y~x1+x2+x3, data=myData)
update(fit, .~.-x1)

However, I would like to do this in a function with term given as string, like

removeTerm <- function(linModel, termName) { ??? }
removeTerm(fit, "x1")

but I can not fill the ???. I already tried

removeTerm <- function(linModel, termName) { update(linModel, .~. - termName) },
removeTerm <- function(linModel, termName) { update(linModel, .~. - as.name(termName)) },
removeTerm <- function(linModel, termName) { update(linModel, .~. - eval(termName)) },
removeTerm <- function(linModel, termName) { update(linModel, .~. - eval.parent(termName)) },
removeTerm <- function(linModel, termName) { update(linModel, .~. - get(termName)) },

but these attempts produce error messages.

Can you advise me here?


There are two problems:

1. .~. is different from . ~ ..

2.  You need to construct the formula . ~ . - x1, and none of your 
expressions do that.  You need to use substitute() or bquote() to edit a 
formula. For example, I think both of these should work:


removeTerm <- function(linModel, termName)
   update(linModel, bquote(. ~ . - .(as.name(termName))))


removeTerm <- function(linModel, termName)
   update(linModel, substitute(. ~ . - x, list(x=as.name(termName))))
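
To see what the two constructions actually build, the following sketch applies them to toy data. The seed and the data are arbitrary assumptions made only so that the example runs on its own; the thread's real myData is not available.

```r
# Sketch: build ". ~ . - <term>" from a string, then refit with update().
set.seed(1)  # arbitrary seed, only to make the toy data reproducible
myData <- data.frame(x1 = rnorm(10), x2 = rnorm(10), x3 = rnorm(10), y = rnorm(10))
fit <- lm(y ~ x1 + x2 + x3, data = myData)

termName <- "x1"
# Both calls produce the unevaluated expression `. ~ . - x1`
f_bquote <- bquote(. ~ . - .(as.name(termName)))
f_subst  <- substitute(. ~ . - x, list(x = as.name(termName)))

removeTerm <- function(linModel, termName)
  update(linModel, bquote(. ~ . - .(as.name(termName))))

fit2 <- removeTerm(fit, "x1")
remaining <- attr(terms(fit2), "term.labels")  # terms left in the refitted model
remaining
```

The key point is that as.name() turns the string into a symbol before it is spliced into the formula; the failed attempts left `termName` (or a call on it) inside the formula literally.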

Duncan Murdoch



Re: [R] R crashing

2009-11-15 Thread Dimitri Szerman
So here's the funny thing: I've now run my function 5 times, and in each one
of them R crashes after I get around 20.000 distances. It could be Google,
but then as soon as I quit and launch R again, I manage to get another
20.000 distances. So maybe it does have something to do with the memory
usage. Do you think that adding

 gc(reset=T)

at the end of each loop would do good?

Thanks again,
Dimitri



2009/11/15 Barry Rowlingson b.rowling...@lancaster.ac.uk

 On Sun, Nov 15, 2009 at 11:57 AM, Dimitri Szerman dimitri...@gmail.com
 wrote:

 
  Thanks. The reason I didn't want to do something like that is because, in
  the event of a crash, I'll lose everything that was done. That's why I
  thought of appending the results often.

  Oops yes, I missed the 'append=TRUE' flag. That's a good idea.

  Last time I did something similar to this I used a relational
 database for saving. I created a table of all the i,j pairs with
 columns i,j,distance and 'ok'. 'ok' was set to False initially. Then
 I'd query the db for a row with 'ok=False', and go about getting the
 distance. If I got a good distance back I set 'ok=True' and never
 bothered getting that again.

  This was in Python with SQLite as the database engine, but you can
 do something similar in R. With a distributed database you could
 easily split the queries between as many servers as you can get your
 hands on.

  Barry
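
The bookkeeping idea in the quoted message (a table of i,j pairs with an 'ok' flag, so finished work is never repeated after a crash) can be sketched in base R. Note the swaps: this uses an RDS file on disk instead of a relational database, and get_distance is a hypothetical stand-in for the real lookup, which is not shown in the thread.

```r
# Sketch of resumable work with an 'ok' flag, using a file instead of a database.
# get_distance is a hypothetical placeholder for the real (slow, failure-prone) lookup.
get_distance <- function(i, j) abs(i - j)

jobs_file <- tempfile(fileext = ".rds")
jobs <- expand.grid(i = 1:3, j = 1:3)
jobs$distance <- NA_real_
jobs$ok <- FALSE
saveRDS(jobs, jobs_file)

run_pending <- function(path, checkpoint_every = 2) {
  jobs <- readRDS(path)
  pending <- which(!jobs$ok)              # only rows not yet done
  for (k in seq_along(pending)) {
    row <- pending[k]
    d <- tryCatch(get_distance(jobs$i[row], jobs$j[row]),
                  error = function(e) NA_real_)
    if (!is.na(d)) {                      # record only good results
      jobs$distance[row] <- d
      jobs$ok[row] <- TRUE
    }
    if (k %% checkpoint_every == 0) saveRDS(jobs, path)  # periodic checkpoint
  }
  saveRDS(jobs, path)                     # final save
  invisible(jobs)
}

done <- run_pending(jobs_file)
```

After a crash, calling run_pending() again on the same file picks up at the first row whose 'ok' is still FALSE, losing at most the work done since the last checkpoint.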




Re: [R] where is a value in my list

2009-11-15 Thread Grzes

But it's not what I want.

I need to get the number of the line in my list.
5 is in: list[[1]][1] and list[[2]][1] so
I would like to get a vector k = 1,2



-- 
View this message in context: 
http://old.nabble.com/where-is-a-value-in-my-list-tp26359843p26360251.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] where is a value in my list

2009-11-15 Thread David Winsemius


On Nov 15, 2009, at 10:47 AM, Grzes wrote:



But it's not what I want.

I need to get the number of the line in my list.
5 is in: list[[1]][1] and list[[2]][1] so
I would like to get a vector k = 1,2



I am sorry, I do not understand what you want. The second solution I  
offered gave you numbers (and they were the positions of the 5's,  
rather than positions that do not hold a 5, as in your proposed answer).


If you just want to know which list elements contain a 5, but not the  
position within the element (which was not what you appeared to be  
asking):


> which(sapply(lista, function(x) any(x == 5)))
[1] 1 2
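
The same lookup can be written as a reusable helper. This is a sketch under assumptions: which_contain is a hypothetical name, and vapply() is swapped in for sapply() only to guarantee a logical result even for an empty list.

```r
# Sketch: indices of the list elements that contain a given value.
# which_contain is a hypothetical helper name.
lista <- list(c(2, 4, 5, 5, 6), c(3, 5, 4, 2), c(1, 1, 1, 8))

which_contain <- function(lst, value) {
  which(vapply(lst, function(x) any(x == value), logical(1)))
}

k <- which_contain(lista, 5)

# Position of the first match inside each element (NA where there is none)
first_pos <- vapply(lista, function(x) match(5, x), integer(1))
```

Here k holds the element numbers and first_pos the position of the first 5 within each element, which covers both readings of the original question.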






David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] update.lm question

2009-11-15 Thread David Winsemius


On Nov 15, 2009, at 11:15 AM, Karsten Weinert wrote:


Hello David,

of course I read those help pages and searched the r-help archive.


But did you look at the as.formula page?



I agree that the way should be clear, but I am stuck and looking for
help. Here is reproducible code


And here is what seemed like the obvious extension of your example but  
using as.formula:


myData <- data.frame(x1=rnorm(10), x2=rnorm(10), x3=rnorm(10),
                     y=rnorm(10))

fit <- lm(y~x1+x2+x3, data=myData)
remvterm <- function(ft, termname) update(ft, as.formula(paste(".~.-",
                                                 termname, sep="")))

remvterm(fit, "x3")

Call:
lm(formula = y ~ x1 + x2, data = myData)

Coefficients:
(Intercept)           x1           x2
    -0.2598      -0.0290      -0.2645

--
David




David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] JGR GUI for R-2.10.0 Help Print

2009-11-15 Thread Bob Meglen
I have updated R 2.9.1 to 2.10.0. and JGR GUI 1.7.  I am running Windows XP. 
I can't seem to get the JGR Print or Help functions to work. The system 
locks and requires me to stop the process.


In the past I have preferred the operation and feel of JGR GUI. I realize 
that this help forum is for R; but I am hoping that some other R user is 
a JGR GUI user and might have a hint about this.


At one point I received the following:

Loading required package: rJava
Loading required package: JavaGD
Loading required package: iplots

Attaching package: 'utils'


The following object(s) are masked from package:rJava :

 head,
 str,
 tail

starting httpd help server ...Error in tools:::startDynamicHelp() : could 
not find function runif

Loading required package: stats
Loading required package: graphics
Loading Tcl/Tk interface ... done
During startup - Warning message:
package JGR in options(defaultPackages) was not found
Loading required package: JGR



starting httpd help server ... done
q()



[R] resampling problem counting number of means above a specific value

2009-11-15 Thread Graham Smith
I am trying to modify some code from Good 2005.

I am trying  to resample the mean of 8 values and then count how many
times the resampled mean is greater than 10. But my count of means
above 10 is coming out as zero, which I know isn't correct.

I would appreciate it if someone could look at the code below and tell
me what I am doing wrong.

Many thanks,

Graham

> LL <- c(12.5,17,12,11.5,9.5,15.5,16,14)
> N <- 1000
> n <- length(LL)
> threshold <- 10
> cnt <- 0
> for(i in 1:N){
+ LLb <- sample (LL, n, replace=TRUE)
+ if (mean(LLb) <= threshold) cnt <- cnt+1
+ }
> cnt
[1] 0



Re: [R] update.lm question

2009-11-15 Thread Duncan Murdoch

On 15/11/2009 11:28 AM, Duncan Murdoch wrote:


There are two problems:

1. .~. is different from . ~ ..


Oops, wrong.  Those are the same.  Sorry...

Duncan Murdoch







Re: [R] resampling problem counting number of means above a specific value

2009-11-15 Thread David Winsemius


On Nov 15, 2009, at 12:12 PM, Graham Smith wrote:


I am trying to modify some code from Good 2005.

I am trying  to resample the mean of 8 values and then count how many
times the resampled mean is greater than 10. But my count of means
above 10 is coming out as zero, which I know isn't correct.


If that is your goal, then why are you using "<=" and not ">" in your  
test?


> for(i in 1:N){
+  LLb <- sample (LL, n, replace=TRUE)
+  if (mean(LLb) > threshold) cnt <- cnt+1
+  }
> cnt
[1] 1000






David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] update.lm question

2009-11-15 Thread Karsten Weinert
Thanks Duncan and David

for opening my eyes :-). It took quite a while but I think I learned a
lot about lm today. I used your advice to produce added variable
plots as mentioned here [1], [2]. I would bet someone did it in R
already (maybe leverage.plot in car) but it was worth doing it myself.

Kind regards,
Karsten.

[1] http://www.minitab.com/support/documentation/answers/AVPlots.pdf
[2] 
http://www.mathworks.com/access/helpdesk/help/toolbox/stats/addedvarplot.html

plotAddedVar.lm <- function(
linModel,
termName,
main="",
xlab=paste(termName, " | others"),
ylab=paste(colnames(linModel$model)[1], " | others"),
cex=0.7, ...) {

oldpar <- par(no.readonly = TRUE); on.exit(par(oldpar))
par(mar=c(3,4,0.4,0)+0.1, las=1, cex=cex)

yData = residuals(update(linModel, substitute(. ~ . - x,
list(x=as.name(termName)))))
xData = residuals(update(linModel, substitute(x ~ . - x,
list(x=as.name(termName)))))

plot(xData, yData, main=main, xlab="", ylab="")
mtext(side=2, text=ylab, line=3, las=0, cex=cex)
mtext(side=1, text=xlab, line=2, las=0, cex=cex)
abline(h=0)
abline(a=0, b=coefficients(linModel)[termName], col="blue")
}

plotAddedVar <- function(linModel,...) UseMethod("plotAddedVar")



Re: [R] resampling problem counting number of means above a specific value

2009-11-15 Thread Dimitris Rizopoulos

try the following:

LL <- c(12.5,17,12,11.5,9.5,15.5,16,14)
n <- length(LL)
N <- 1000
threshold <- 10

smpls <- sample(LL, N*n, replace = TRUE)
dim(smpls) <- c(n, N)
cnt <- sum(colMeans(smpls) > threshold)
cnt


I hope it helps.

Best,
Dimitris



--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014



Re: [R] resampling problem counting number of means above a specific value

2009-11-15 Thread Graham Smith
David,

Thanks, it's me getting mixed up; I actually meant less than or equal to
10. That apart, I guess the code is OK. I just expected, especially
as I increased N, that I might have got some means less than 10, but
having gone back to it, I see I need a million iterations before
getting two means less than 10.

It seems I misjudged the probabilities.
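
How rare such a resample is can be computed exactly rather than simulated. The sketch below is an addition, not something from the thread: it counts, by repeated convolution over the possible sums, how many of the 8^8 equally likely ordered resamples have a mean of at most 10. Doubling the values to work with whole numbers is an implementation choice of this sketch.

```r
# Sketch: exact probability that a bootstrap mean of LL is <= 10.
LL <- c(12.5, 17, 12, 11.5, 9.5, 15.5, 16, 14)
threshold <- 10
n <- length(LL)

vals <- as.integer(round(2 * LL))    # doubled so every value is a whole number
max_sum <- n * max(vals)

# counts[s + 1] = number of ordered draws so far whose (doubled) sum is s
counts <- numeric(max_sum + 1)
counts[1] <- 1                       # zero draws: one way to have sum 0
for (draw in seq_len(n)) {
  new_counts <- numeric(max_sum + 1)
  for (v in vals) {
    idx <- seq_len(max_sum + 1 - v)  # sums that stay in range after adding v
    new_counts[idx + v] <- new_counts[idx + v] + counts[idx]
  }
  counts <- new_counts
}

limit <- as.integer(2 * threshold * n)       # mean <= 10  <=>  doubled sum <= 160
n_le  <- sum(counts[seq_len(limit + 1)])     # resamples with mean <= threshold
p_le  <- n_le / n^n                          # each ordered resample has probability 1/n^n
c(n_le = n_le, p_le = p_le)
```

Only 53 of the 16,777,216 ordered resamples qualify (all eight draws must be 9.5, with at most one swapped for 11.5, 12 or 12.5, or two swapped for 11.5), so roughly three such means per million iterations are expected, which fits the observation above.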

Thanks again.

Graham






Re: [R] lme model specification

2009-11-15 Thread Douglas Bates
On Sun, Nov 15, 2009 at 7:19 AM, Green, Gerwyn (greeng6)
g.gre...@lancaster.ac.uk wrote:
 Dear all

 this is a question of model specification in lme for which I'd 
 greatly appreciate some guidance.

 Suppose I have data in long format

 gene treatment rep Y
 1        1      1  4.32
 1        1      2  4.67
 1        1      3  5.09
 .        .      .    .
 .        .      .    .
 .        .      .    .
 1        4      1   3.67
 1        4      2   4.64
 1        4      3   4.87
 .        .      .    .
 .        .      .    .
 .        .      .    .
 2000     1      1  5.12
 2000     1      2  2.87
 2000     1      3  7.23
 .        .      .    .
 .        .      .    .
 .        .      .    .
 2000     4      1   2.48
 2000     4      2   3.93
 2000     4      3   5.17


 that is, I have data Y_{gtr} for g (gene) =1,...,2000    t (treatment) = 
 1,...,4 and    r (replicate) = 1,...,3

 I would like to fit the following linear mixed model using lme

 Y_{gtr} = \mu_{g} +  W_{gt} + Z_{gtr}

 where the \mu_{g}'s are fixed gene effects, W_{gt} ~ N(0, \sigma^{2}) 
 gene-treatment interactions, and residual errors Z_{gtr} ~ N(0,\tau^{2}). 
 (Yes, I know I'm specifying an interaction between gene and treatment without 
 specifying a treatment main effect ! - there is good reason for this)

You are going to end up estimating 2000 fixed-effects parameters for
gene, which will take up a lot of memory (one copy of the model matrix
for the fixed-effects will be 24000 by 2000 double precision numbers
or about 400 MB).  You might be able to fit that in lme as

lme(Y ~ -1 + factor(gene), data = data, random = ~ 1|gene:treatment)

but it will probably take a long time or run out of memory.  There is
an alternative which is to use the development branch of the lme4
package that allows for a sparse model matrix for the fixed-effects
parameters.  Or ask yourself if you really need to model the genes as
fixed effects instead of random effects.  We have seen situations
where users do not want the shrinkage involved with random effects but
it is rare.

If you want to follow up on the development branch (for which binary
packages are not currently available, i.e. you need to compile it
yourself) then we can correspond off-list.
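
As a quick check of the memory estimate quoted above, the size of a dense 24000 by 2000 matrix of doubles can be worked out by hand. This is only back-of-the-envelope arithmetic: the variable names are illustrative, and the 24000 rows are taken to be 2000 genes x 4 treatments x 3 replicates.

```r
# Dense fixed-effects model matrix: one row per observation, one column per gene.
n_genes      <- 2000
n_treatments <- 4
n_reps       <- 3

n_obs   <- n_genes * n_treatments * n_reps   # 24000 observations
n_cells <- n_obs * n_genes                   # 48 million entries
bytes   <- n_cells * 8                       # 8 bytes per double

bytes / 1e6     # size in megabytes (10^6 bytes)
bytes / 2^20    # size in mebibytes (2^20 bytes)
```

Either unit gives a figure a little under 400 MB for a single copy, and model fitting typically needs more than one copy.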


 I know that specifying

 model.1 <- lme(Y ~ -1 + factor(gene), data=data, random= ~1|gene/treatment)

 fits Y_{gtr} = \mu_{g} +  U_{g} + W_{gt} + Z_{gtr}

 with \mu_{g}, W_{gt}  and Z_{gtr} as previous and U_{g} ~ N(0,\gamma^{2}), 
 but I do NOT want to specify a random gene effect. I have scoured Bates and 
 Pinheiro without coming across a parallel example.


 Any help would be greatly appreciated

 Best


 Gerwyn Green
 School of Health and Medicine
 Lancaster University





Re: [R] resampling problem counting number of means above a specific value

2009-11-15 Thread Graham Smith
Dimitris,

Thanks, I shall give this a try as an alternative.

Graham

2009/11/15 Dimitris Rizopoulos d.rizopou...@erasmusmc.nl:
 try the following:

 LL <- c(12.5,17,12,11.5,9.5,15.5,16,14)
 n <- length(LL)
 N <- 1000
 threshold <- 10

 smpls <- sample(LL, N*n, replace = TRUE)
 dim(smpls) <- c(n, N)
 cnt <- sum(colMeans(smpls) > threshold)
 cnt
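
For comparison, here is a minimal sketch that runs the loop and the vectorized approach side by side on the data from this thread. The seed and the names `cnt_loop` and `cnt_vec` are my own additions for illustration, not part of either poster's code.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

LL <- c(12.5, 17, 12, 11.5, 9.5, 15.5, 16, 14)
n <- length(LL)
N <- 1000
threshold <- 10

# Loop version: one resample per iteration.
cnt_loop <- 0
for (i in seq_len(N)) {
  LLb <- sample(LL, n, replace = TRUE)
  if (mean(LLb) > threshold) cnt_loop <- cnt_loop + 1
}

# Vectorized version: draw everything at once, one resample per column.
smpls <- matrix(sample(LL, N * n, replace = TRUE), nrow = n, ncol = N)
cnt_vec <- sum(colMeans(smpls) > threshold)

c(loop = cnt_loop, vectorized = cnt_vec)
```

Because only one of the eight values is below 10, nearly every resampled mean exceeds the threshold, so both counts should be at or very close to N.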


 I hope it helps.

 Best,
 Dimitris


 Graham Smith wrote:

 I am trying to modify some code from Good 2005.

 I am trying  to resample the mean of 8 values and then count how many
 times the resampled mean is greater than 10. But my count of means
 above 10 is coming out as zero, which I know isn't correct.

 I would appreciate it if someone could look at the code below and tell
 me what I am doing wrong.

 Many thanks,

 Graham

 LL <- c(12.5,17,12,11.5,9.5,15.5,16,14)
 N <- 1000
 n <- length(LL)
 threshold <- 10
 cnt <- 0
 for(i in 1:N){
   LLb <- sample(LL, n, replace=TRUE)
   if (mean(LLb) <= threshold) cnt <- cnt+1
 }

 cnt

 [1] 0



 --
 Dimitris Rizopoulos
 Assistant Professor
 Department of Biostatistics
 Erasmus University Medical Center

 Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
 Tel: +31/(0)10/7043478
 Fax: +31/(0)10/7043014




Re: [R] gregmisc library (Mandriva)

2009-11-15 Thread Paul Johnson
On Sat, Oct 3, 2009 at 9:18 AM, chi ball c...@hotmail.it wrote:



 Hi, I'm not able to find a rpm of gregmisc library (2.0.0)  for Linux 
 Mandriva 2008 Spring.
 Any suggestion?
 Thanks

If you don't find an up to date RPM, either you have to learn how to
build an RPM or just install the package yourself.

You can install the package yourself in a number of ways, I think R
FAQ outlines it.

To update and download a whole bunch of packages, I use a script.

It should be easy for you to see how this works. I scan the system to
update what packages there are, and then install a lot of others if
they are not installed yet.
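
The script itself is not shown in this message, so the following is only a sketch of the idea described above, using base R functions. The package list, the function names `missing_packages()` and `install_faves()`, and the repository URL are all assumptions made for illustration.

```r
# Hypothetical list of favourite packages; replace with your own.
faves <- c("gdata", "gmodels", "gtools", "gplots")

# Return the packages from `wanted` that are not yet installed.
missing_packages <- function(wanted) {
  setdiff(wanted, rownames(installed.packages()))
}

# Update what is installed, then install whatever is still missing.
install_faves <- function(wanted, repos = "https://cloud.r-project.org") {
  update.packages(ask = FALSE, repos = repos)
  todo <- missing_packages(wanted)
  if (length(todo) > 0) install.packages(todo, repos = repos)
  invisible(todo)
}

# Not run here, because it needs network access and write permission:
# install_faves(faves)
```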

as root run R CMD BATCH R_installFaves-2.R

or inside R as root you could type

source("R_installFaves-2.R")

On Ubuntu, if you do this as root it installs the packages into
/usr/local/lib/R, but on Fedora it installs them under /usr/lib/R.  I
do not know where they will go with Mandriva.

I used to run the script to get ALL packages, but when the CRAN list
accumulated more than 600 packages, my systems just spent all day
building packages. So I had to narrow my sights.

I'm looking at administering a cluster computer on which I'll need to
make RPMs for many packages, and so I'm in the same boat as you are if
you are wanting RPMs.  You could check back with me in about a month
to find out if I have packages for you.

pj

I think  gregmisc is a bundle, those are deprecated.  Instead, you
install gdata, gmodels, and so forth.


-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas


[R] Help with unstack() function

2009-11-15 Thread Tariq Perwez
Hi Everyone,

I am trying to understand the unstack() function but after struggling for
two days, I have given up. More specifically, I am trying the exercises at
the end of  Chapter 1 of Data Analysis and Graphics Using R by Maindonald
and Braun, 2nd ed. Exercise 18 (p. 41) asks to unstack the Rabbit data frame
from the MASS package to get a certain data frame that is shown in the
exercise. The authors suggest using unstack() three times, but I am so new
to R that I have absolutely no clue as to what is to be done each of those
times. Sadly for me, the help page for unstack() does not give much help
either. For example, the statement in the help page regarding the argument
form, "a two-sided formula whose left side evaluates to the vector to be
unstacked and whose right side evaluates to the indicator of the groups to
create", is very cryptic to me.  Basically, I have tried things like:

unstack(Rabbit, Dose ~ Animal)

but notice that what I get is a data frame in which other columns of the
Rabbit data frame disappear. I would appreciate if someone could help me
understand this function. On page 17 of the same book, there is an example
of unstack() function but that one uses a very simple data frame (only two
columns). I would like to know how to handle more complex data frames as in
the exercise. BTW, this is not a school assignment; I am learning R using
this book on my own. Thanks for any help. Regards,

Tariq

[[alternative HTML version deleted]]
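
One way to read the two-sided formula is "values ~ groups": the left side names the column to spread out, and the right side names the factor whose levels become the new columns. The sketch below shows this on the built-in PlantGrowth data, then applies the same idea three times to Rabbit. The Rabbit part is my own guess at what the exercise intends, and it assumes that MASS is installed and that the rows are ordered the same way within each animal.

```r
# Built-in example: 30 plant weights in 3 groups of 10.
str(PlantGrowth)

# Left side: the values to spread out; right side: the grouping factor.
# The result has one column per level of `group`.
wide <- unstack(PlantGrowth, weight ~ group)
head(wide)

# The same idea applied three times to MASS::Rabbit (assumes MASS is installed).
if (requireNamespace("MASS", quietly = TRUE)) {
  Rabbit <- MASS::Rabbit
  bp        <- unstack(Rabbit, BPchange ~ Animal)    # one column per animal
  dose      <- unstack(Rabbit, Dose ~ Animal)[, 1]   # taken from the first animal
  treatment <- unstack(Rabbit, Treatment ~ Animal)[, 1]
  rabbit_wide <- data.frame(Treatment = treatment, Dose = dose, bp)
  print(head(rabbit_wide))
}
```

Each call to unstack() only ever returns the unstacked left-hand-side variable, which is why the other columns seem to disappear and have to be unstacked separately and then recombined.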



[R] wilcox.test loop through variable names

2009-11-15 Thread Jacob Wegelin


Often I perform the same task on a series of variables in a dataframe,
by looping through a character vector that holds the names and using
paste(), eval(), and parse() inside the loop.

For instance:

rm(environmental)
thesevars <- names(environmental)
environmental$ToyReal <- rnorm(nrow(environmental))
environmental$ToyDichot <- environmental$ToyReal < 0.53


tableOfResults <- data.frame(var=thesevars)

tableOfResults$p_wilcox <- NA

tableOfResults$Beta_lm <- NA

rownames(tableOfResults) <- thesevars

for( thisvar in thesevars) {
  thiscommand <- paste("thiswilcox <- wilcox.test (", thisvar, " ~ ToyDichot , data=environmental)")
  eval(parse(text=thiscommand))
  tableOfResults[thisvar, "p_wilcox"] <- thiswilcox$p.value
  thislm <- lm( environmental[ c( "ToyReal", thisvar )])
  tableOfResults[thisvar, "Beta_lm"] <- coef(thislm)[thisvar]
}

print(tableOfResults)

Of course, the loop above is a toy example. In real life I might first figure 
out whether the variable is
continuous, dichotomous, or categorical taking on several values, then perform 
an operation depending on
its type.

The use of paste(), eval(), and parse() seems awkward.  As Gabor Grothendieck 
showed
(http://tolstoy.newcastle.edu.au/R/e8/help/09/11/4520.html), if we
are calling a regression function such as lm() we can avoid using
paste(), as shown above.

But is there a way to avoid paste() and eval() when one uses t.test()
or wilcox.test()?
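
Two possible ways to avoid paste() and eval() are sketched here. This is not taken from any reply in the thread: it uses the built-in mtcars data instead of the environmental data, with `am` standing in for ToyDichot, and the names `p_formula` and `p_indexed` are made up for the example.

```r
# Toy data from the built-in mtcars data set; `am` plays the role of ToyDichot.
thesevars <- c("mpg", "hp", "wt")

p_formula <- sapply(thesevars, function(v) {
  # reformulate() builds the formula `v ~ am` from character strings.
  f <- reformulate("am", response = v)
  wilcox.test(f, data = mtcars, exact = FALSE)$p.value
})

p_indexed <- sapply(thesevars, function(v) {
  # Alternatively, skip the formula and pass the two samples directly.
  wilcox.test(mtcars[[v]][mtcars$am == 0],
              mtcars[[v]][mtcars$am == 1],
              exact = FALSE)$p.value
})

data.frame(var = thesevars, p_formula, p_indexed)
```

The same pattern should carry over to t.test(), which also accepts either a formula with a data argument or two numeric vectors.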

Thanks

Jacob A. Wegelin
Department of Biostatistics
Virginia Commonwealth University
Richmond VA 23298-0032
U.S.A.



Re: [R] multivariate meta-analysis with the metafor package

2009-11-15 Thread Viechtbauer Wolfgang (STAT)
Dear Antonio,

Yes, I am currently working on these extensions. It will take some time before 
those functions are sufficiently tested and documented and can become part of 
the metafor package. My goal is to do so within the next year, but I cannot 
give any specific date for when this will be finished.

To answer your second question, I will have to take a closer look at the mvmeta 
command (I wasn't aware of this command until you mentioned it). I will try to 
do so when I get a chance. I guess an immediate advantage of an rma.multi (or 
whatever it will be called) command is that it will run under R =)

Best,

--
Wolfgang Viechtbauer                        http://www.wvbauer.com/
Department of Methodology and Statistics    Tel: +31 (0)43 388-2277
School for Public Health and Primary Care   Office Location:
Maastricht University, P.O. Box 616         Room B2.01 (second floor)
6200 MD Maastricht, The Netherlands         Debyeplein 1 (Randwyck)


Original Message
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of antonio.gasparr...@lshtm.ac.uk
Sent: Friday, November 13, 2009 19:47
To: r-help@r-project.org
Subject: [R] multivariate meta-analysis with the metafor package

 Dear Wolfgang Viechtbauer and R users,

 I have a few questions regarding the development of the package 'metafor'.
 As you suggested, I post to the R-help mailing list.

 I read you're planning an extension of this method to the multivariate
 case. I think it would be a useful tool. I'm currently performing some
 analyses with R on multiple outcomes, using the Stata command mvmeta to
 get meta-analytic multivariate estimates, then coming back to R to use
 these results. Obviously, it's irritating to switch software every time.

 Briefly:
 - Are you still planning this extension? And in this case, do you have a
 planned date?
 - What are likely to be the advantages and limitations of a potential
 'rma.multi' if compared to Stata's 'mvmeta'?

 Thank you for your time
 Regards,

 Antonio Gasparrini
 Public and Environmental Health Research Unit (PEHRU)
 London School of Hygiene & Tropical Medicine
 Keppel Street, London WC1E 7HT, UK
 Office: 0044 (0)20 79272406 - Mobile: 0044 (0)79 64925523
 Skype contact: a.gasparrini
 http://www.lshtm.ac.uk/people/gasparrini.antonio (
 http://www.lshtm.ac.uk/pehru/ )




Re: [R] re move row if the column date_abandoned has a date in it

2009-11-15 Thread frenchcr


this works perfectly...

new_data5 <- new_data4[nchar(new_data4$date_abandoned) != 8, ]

...and I can now think of a few different ways to manipulate my data with
what I've learned from these tricks, thanks a lot David!
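
For anyone reading along without the original data, here is a small sketch of the same filter on a made-up data frame. The `toy` data and its values are invented for illustration; only the nchar() test comes from the thread, and the grepl() variant is an alternative of my own.

```r
# Toy stand-in for new_data4: blank fields are two characters wide,
# dates are 8-character strings (values invented for illustration).
toy <- data.frame(
  id = 1:5,
  date_abandoned = c("  ", "20091114", "  ", "16010101", "  "),
  stringsAsFactors = FALSE
)

table(nchar(toy$date_abandoned))

# Keep the rows whose field is not an 8-character date.
no_date <- toy[nchar(toy$date_abandoned) != 8, ]

# Equivalent filter that states the intent directly: drop 8-digit strings.
no_date2 <- toy[!grepl("^[0-9]{8}$", toy$date_abandoned), ]
```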




David Winsemius wrote:
 
 
 On Nov 15, 2009, at 11:00 AM, frenchcr wrote:
 


 Yes they are not in date format, theyre just characters.

 the earliest date is 1601 i originally had one of  0101 00 00  
 (101 years
 BC)...this was a software problem.

 table(nchar(new_data4$date_abandoned))

      2      8
 315732    263

 The 315732 are empty fields i thought.
 
 They are actually 2 characters wide.
 
 The 263 are dates, i want to remove their rows.
 
 If you want to keep only the ones that are _not_ 8 characters long, then:
 
 new_data5 <- new_data4[nchar(new_data4$date_abandoned) != 8, ]
 
 or:
 
 new_data5 <- subset(new_data4, nchar(date_abandoned) != 8)
 
 -- 
 David.
 



 David Winsemius wrote:


 On Nov 14, 2009, at 8:43 PM, frenchcr wrote:


 sorry David,

 im really new to R (my first week) and appreciate your help. Also I
 dont
 always know what info to give people on the forum (although im
 starting to
 catch the drift).

 heres what i get...

 summary(new_data4$date_abandoned)
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
    1601    1998    2001    1993    2004    2009  315732

 So new_data4$date_abandoned is not of type Date and is instead a
 character vector.

 If you are resisting turning it into a date and want to work with
 characters, you can, you just need to deal somehow with the items  
 that
 are not 8 characters wide. What does 315732 represent? How were we
 supposed to interpret the starting date you gave of 0101?

 nchar(101)
 [1] 7

 What does table(nchar(new_data4$date_abandoned)) give you?

 ls()
 [1] "data"      "new_data"  "new_data2" "new_data3" "new_data4"
 small <- head(new_data4, 20)
 dump(small, 20)
 Error in dump(small, 20) : cannot write to this connection


 Well, sorry, I meant to type dump("small", stdout())   ... As per the
 Posting Guide.

 -- 
 David.

 David Winsemius wrote:


 On Nov 14, 2009, at 5:24 PM, frenchcr wrote:



 I tried the following but it does the opposite of what i want:

 new_data5 <- subset(new_data4, date_abandoned > 0101)

 I want to remove the rows with dates and leave just the rows  
 without
 a date.

 This removes all the rows that dont have a date in the
 date_abandoned column

 ...on a positive note, as i did this next...

 dim(new_data5)
 [1] 263  80

 i now know that i have 263 dates in that column :)

 I want to remove the 263 rows with dates and leave just the rows
 without a
 date.

 Come on, frenchcr. Stop making us guess. Give us enough
 information to work with. You asked for something which I construed as
 saying you wanted dates greater than the first day of the year 101. You
 did not address this question.

 What do you get with str(new_data4) and
 summary(new_data4$date_abandoned) ? In order to know what sort of
 comparison to use we need to know what the data looks like.

 Even better if you offered the output from:

 small <- head(new_data4, 20)
 dump(small, 20),

 -- 
 David








 David Winsemius wrote:


 On Nov 14, 2009, at 1:21 PM, frenchcr wrote:



 I want to go through a column in data called

 Bad name for a data.frame. Fortunes, dog and all that.

 "date_abandoned", i.e. data["date_abandoned"], and remove all the rows
 that have numbers greater than 1,010,000.

 Are you doing archeology? Given what you say next I wondered what
 range you were really asking for.


 The dates are in the format 20091114 so i'm just going to treat
 them
 as
 numbers for clean up purposes.


 I know that i use subset but not sure how to proceed from there.

 subdata <- subset(data, date_abandoned > 0101()


 The problem with  101 is that your specified minimum point
 had
 an insufficient number of places to be in MMDD format.

 --

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT




 -- 
 View this message in context:
 http://old.nabble.com/remove-row-if-the-column-%22date_abandoned%22-has-a-date-in-it-tp26352457p26354446.html
 Sent from the R help mailing list archive at Nabble.com.


 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT


[R] Finding Largest(or smallest) values

2009-11-15 Thread alphaace

Hi,

I am trying to find a model of best fit, with 8 parameters. As such, I have
created a table with 2^8=256 rows with every variable either in or out of
the model(denoted by 1 or 0), and for each row, I have computed the Adjusted
R^2, AIC, CP and Press. I know I can use the leaps package to find the best
model (for every number of parameters n=1...8) for the Adjusted R2 and CP,
but not for AIC and PRESS. I was wondering, if anyone has any code to say
find the minimum press when the model has 1 parameter, 2 parameters, 3
parameters, etc...

I have attached a copy of my table below for reference. Thank you for your
help!

  DF
P X1 X2 X3 X4 X5 X6 X7 X8   Adjusted R2 AIC CP PRESS
1   0  0  0  0  0  0  0  0  0  0.00  0.99464282 14.367679  95.01075
2   1  1  0  0  0  0  0  0  0  0.0246074998 -0.36362381 12.634643  93.70900
3   1  0  1  0  0  0  0  0  0  0.0382916889 -1.69172730 11.194739  92.46010
4   1  0  0  1  0  0  0  0  0 -0.0005184148  2.02713507 15.278491  96.11336
5   1  0  0  0  1  0  0  0  0 -0.002134  2.24957370 15.527913  96.38092
6   1  0  0  0  0  1  0  0  0  0.0272855281 -0.62206426 12.352850  92.64962
7   1  0  0  0  0  0  1  0  0  0.0111851503  0.92108831 14.046995  94.47172
8   1  0  0  0  0  0  0  1  0  0.0020990063  1.78090275 15.003075  94.60974
9   1  0  0  0  0  0  0  0  1 -0.0108209410  2.99012118 16.362563  97.09546
10  2  1  1  0  0  0  0  0  0  0.0810965349 -4.99889125  7.639659  89.49671
11  2  1  0  1  0  0  0  0  0  0.0266267059  0.41424802 13.308890  94.64288
12  2  0  1  1  0  0  0  0  0  0.0315413664 -0.06156976 12.797371  94.04593
13  2  1  0  0  1  0  0  0  0  0.0145736386  1.57108162 14.563375  95.37943
14  2  0  1  0  1  0  0  0  0  0.0479608482 -1.66893313 11.088428  92.39152
15  2  0  0  1  1  0  0  0  0  0.0005123888  2.90290718 16.026873  97.24004
16  2  1  0  0  0  1  0  0  0  0.0541222887 -2.27926275 10.447144  91.16842
17  2  0  1  0  0  1  0  0  0  0.0754976703 -4.42788850  8.222390  88.96712
18  2  0  0  1  0  1  0  0  0  0.0241535771  0.65277858 13.566293  93.96203
19  2  0  0  0  1  1  0  0  0  0.0275121981  0.32869589 13.216727  93.77620
20  2  1  0  0  0  0  1  0  0  0.0245589269  0.61372449 13.524104  94.06972
21  2  0  1  0  0  0  1  0  0  0.0622492592 -3.09039933  9.601287  90.89229
22  2  0  0  1  0  0  1  0  0  0.0085444331  2.14445635 15.190896  95.73424
23  2  0  0  0  1  0  1  0  0  0.0089907346  2.10213293 15.15  95.79122
24  2  0  0  0  0  1  1  0  0  0.0407860378 -0.96318114 11.835183  91.76667
25  2  1  0  0  0  0  0  1  0  0.0244877014  0.62058800 13.531518  93.36514
26  2  0  1  0  0  0  0  1  0  0.0407457114 -0.95922936 11.839381  92.16665
27  2  0  0  1  0  0  0  1  0 -0.0047937805  3.40062281 16.579140  96.5
28  2  0  0  0  1  0  0  1  0  0.0020858488  2.75480952 15.863107  95.82804
29  2  0  0  0  0  1  0  1  0  0.0248637496  0.58434515 13.492378  92.73311
30  2  0  0  0  0  0  1  1  0  0.0171446547  1.32551143 14.295783  93.74595
31  2  1  0  0  0  0  0  0  1  0.0138889013  1.63637615 14.634643  95.78017
32  2  0  1  0  0  0  0  0  1  0.0277245197  0.30817080 13.194629  94.26947
33  2  0  0  1  0  0  0  0  1 -0.0115063328  4.02650409 17.277784  98.26103
34  2  0  0  0  1  0  0  0  1 -0.0130170902  4.16679510 17.435024  98.57288
35  2  0  0  0  0  1  0  0  1  0.0193924225  1.11028937 14.061835  94.52438
36  2  0  0  0  0  0  1  0  1  0.0004693207  2.90695757 16.031355  96.66280
37  2  0  0  0  0  0  0  1  1 -0.0087026647  3.76559547 16.985978  96.73581
38  3  1  1  1  0  0  0  0  0  0.0754624308 -3.46299014  9.168628  90.98724
39  3  1  1  0  1  0  0  0  0  0.0761667614 -3.53462845  9.096127  90.47049
40  3  1  0  1  1  0  0  0  0  0.0179377080  2.21094883 15.090020  96.18373
41  3  0  1  1  1  0  0  0  0  0.0443170822 -0.34853563 12.374620  93.90532
42  3  1  1  0  0  1  0  0  0  0.1233192905 -8.45916839  4.242412  85.48181
43  3  1  0  1  0  1  0  0  0  0.0532048878 -1.22682152 11.459741  92.37654
44  3  0  1  1  0  1  0  0  0  0.0668617200 -2.59257716 10.053955  90.62446
45  3  1  0  0  1  1  0  0  0  0.0452547720 -0.44081112 12.278098  92.71992
46  3  0  1  0  1  1  0  0  0  0.0924857470 -5.20992520  7.416308  88.22170
47  3  0  0  1  1  1  0  0  0  0.0281772097  1.22570975 14.036002  94.88753
48  3  1  1  0  0  0  1  0  0  0.0890014298 -4.84971194  7.774972  89.34392
49  3  1  0  1  0  0  1  0  0  0.0245049716  1.58023923 14.414009  95.16454
50  3  0  1  1  0  0  1  0  0  0.0535554539 -1.26163298 11.423655  92.56181
51  3  1  0  0  1  0  1  0  0  0.0152798441  2.46500779 15.363611  95.72452
52  3  0  1  0  1  0  1  0  0  0.0755765040 -3.47458896  9.156886  90.73695
53  3  0  0  1  1  0  1  0  0  0.0099403328  2.9710 15.913241  96.85582
54  3  1  0  0  0  1  1  0  0  0.0555455543 -1.45949602 11.218801  91.36679
55  3  0  1  0  0  1  1  0  0  0.1041509877 -6.42603949  6.215530  86.77444
56  3  0  0  1  0  1  1  0  0  0.0357196889  0.49331421 13.259606  93.21327
57  3  0  0  0  1  1  1  0  0  

Re: [R] where is a value in my list

2009-11-15 Thread Grzes

It's excellent! 
Now, if I have a vector k=c( TRUE, TRUE, FALSE), how may I get lines from the
list?
which (list ?? k) ? 



David Winsemius wrote:
 
 
 On Nov 15, 2009, at 10:47 AM, Grzes wrote:
 

 But it's not what I want

 I need get a number of line my list
 5 is in: list[[1]][1] and list[[2]][1] so
 I would like to get a vector k = 1,2

 
 I am sorry, I do not understand what you want. The second solution
 offered gave you numbers (and they were the positions of the 5's,
 rather than the positions that were not 5's, as in your offered solution).
 
 If you just want to know which lists contain a 5, but not the position  
 within the list (which was not what you appeared to be asking..):
 
   which(sapply(lista, function(x) any(x == 5)))
 [1] 1 2
 
 
 

 David Winsemius wrote:


 On Nov 15, 2009, at 10:01 AM, Grzes wrote:


 I have got a list:

 lista=list()
 a=c(2,4,5,5,6)
 b=c(3,5,4,2)
 c=c(1,1,1,8)
 lista[[1]]=a
 lista[[2]]=b
 lista[[3]]=c

 lista
 [[1]]
 [1] 2 4 5 5 6

 [[2]]
 [1] 3 5 4 2

 [[3]]
 [1] 1 1 1 8

 I would like to know where is number 5 (which line)?

 For example I have got a loop:

 k= vector(mode = "integer", length = 3)

 for(i in 1:3)
 {
 for (j in 1:length(lista[[i]])){
 if ((lista[[i]][j])==5   k[i]= [i])
 }
 }

 This loop is wrong but I would like to get in my vector k sth like
 this:

 k = lista[[1]][1], lista[[2]][1] ...or sth similar

 I am a bit confused, since clearly lista[[1]][1] does _not_ == 5.  
 It's
 also unclear what type of output you expect ... character, list,
 numeric?

 See if these take you any further to your vaguely expressed goal:

 lapply(lista, "%in%", 5)
 [[1]]
 [1] FALSE FALSE  TRUE  TRUE FALSE

 [[2]]
 [1] FALSE  TRUE FALSE FALSE

 [[3]]
 [1] FALSE FALSE FALSE FALSE

 lapply(lista, function(x) which(x == 5) )
 [[1]]
 [1] 3 4

 [[2]]
 [1] 2

 [[3]]
 integer(0)

 --


 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT




 -- 
 View this message in context:
 http://old.nabble.com/where-is-a-value-in-my-list-tp26359843p26360251.html
 Sent from the R help mailing list archive at Nabble.com.

 
 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT
 
 
 

-- 
View this message in context: 
http://old.nabble.com/where-is-a-value-in-my-list-tp26359843p26360930.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] where is a value in my list

2009-11-15 Thread David Winsemius
 The term "line" in R refers to a sequence of text ending in an
EOL marker, which I doubt is what you meant. Please stop referring
to the items or elements of a list as "lines" if you hope to
communicate with R-users.


 lista[1:2]
[[1]]
[1] 2 4 5 5 6

[[2]]
[1] 3 5 4 2

 lista[ which(sapply(lista, function(x) any(x == 5))) ]
[[1]]
[1] 2 4 5 5 6

[[2]]
[1] 3 5 4 2

--
David.
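
To connect this to the logical vector k from the question: a list can be indexed directly with a logical vector, so no which() call is strictly needed. The sketch below rebuilds lista from the thread; the name `has5` is introduced only for illustration.

```r
lista <- list(c(2, 4, 5, 5, 6), c(3, 5, 4, 2), c(1, 1, 1, 8))

# Logical vector: does each element of the list contain a 5?
has5 <- sapply(lista, function(x) any(x == 5))

which(has5)   # positions of the list elements that contain a 5
lista[has5]   # the elements themselves, selected by the logical vector

# A hand-made logical vector works the same way.
k <- c(TRUE, TRUE, FALSE)
lista[k]
```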

On Nov 15, 2009, at 11:57 AM, Grzes wrote:



It's excellent!
Now, if I have a vector k=c( TRUE, TRUE, FALSE) how I may get lines  
from

list?
which (list ?? k) ?



David Winsemius wrote:



On Nov 15, 2009, at 10:47 AM, Grzes wrote:



But it's not what I want

I need get a number of line my list
5 is in: list[[1]][1] and list[[2]][1] so
I would like to get a vector k = 1,2



I am sorry, I do not understand what you want. The second solution
offered gave you numbers (and they were the positions of the 5's,
rather than the positions that were not 5's, as in your offered
solution).


If you just want to know which lists contain a 5, but not the  
position

within the list (which was not what you appeared to be asking..):


which(sapply(lista, function(x) any(x == 5)))

[1] 1 2





David Winsemius wrote:



On Nov 15, 2009, at 10:01 AM, Grzes wrote:



I have got a list:

lista=list()
a=c(2,4,5,5,6)
b=c(3,5,4,2)
c=c(1,1,1,8)
lista[[1]]=a
lista[[2]]=b
lista[[3]]=c


lista

[[1]]
[1] 2 4 5 5 6

[[2]]
[1] 3 5 4 2

[[3]]
[1] 1 1 1 8

I would like to know where is number 5 (which line)?

For example I have got a loop:

k= vector(mode = "integer", length = 3)

for(i in 1:3)
{
for (j in 1:length(lista[[i]])){
if ((lista[[i]][j])==5   k[i]= [i])
}
}

This loop is wrong but I would like to get in my vector k sth like
this:

k = lista[[1]][1], lista[[2]][1] ...or sth similar


I am a bit confused, since clearly lista[[1]][1] does _not_ == 5.
It's
also unclear what type of output you expect ... character, list,
numeric?

See if these take you any further to your vaguely expressed goal:


lapply(lista, "%in%", 5)

[[1]]
[1] FALSE FALSE  TRUE  TRUE FALSE

[[2]]
[1] FALSE  TRUE FALSE FALSE

[[3]]
[1] FALSE FALSE FALSE FALSE


lapply(lista, function(x) which(x == 5) )

[[1]]
[1] 3 4

[[2]]
[1] 2

[[3]]
integer(0)


--







David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] how to permute, simulate Markov chain

2009-11-15 Thread martin08

Hi all,

I am new to R. Can someone please give me some hints in how to do the
following things:

1- Get ONE permutation of a set. I have looked at the gregmisc package's
permutations() method, but I just want to get one permutation at a time. 

2- Simulate a Markov chain in R. For instance, I want to simulate the simple
random walk problem, in which a person can walk randomly around 4 places. I
know how to set up the transition matrix in R. I'm stuck at what to do next.

I'm grateful if someone can give me hint or a pointer.

Thanks.

Martin 
-- 
View this message in context: 
http://old.nabble.com/how-to-permute%2C-simulate-Markov-chain-tp26363411p26363411.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to permute, simulate Markov chain

2009-11-15 Thread David Winsemius


On Nov 15, 2009, at 4:11 PM, martin08 wrote:



Hi all,

I am new to R. Can someone please give me some hints in how to do the
following things:

1- Get ONE permutation of a set. I have looked at the gregmisc  
package's
permutations() method, but I just want to get one permutation at a  
time.


Assuming that set1 is your set object and x is <= the number of
permutations of z things from set1, then


permutations(length(set1), z, set1)[x, ] will give you the x-th permutation.

The "[" operator/function can often be appended to the end of a
function call to extract the desired subset.
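
If a single random permutation is all that is needed, base R's sample() may be enough, without enumerating every permutation first. This is a sketch of an alternative, not what the reply above proposes; `set1` here is a hypothetical set and `z` a hypothetical subset size.

```r
set1 <- c("a", "b", "c", "d")   # hypothetical set

# One random permutation of the whole set.
perm <- set1[sample.int(length(set1))]

# One random ordered selection of z elements, without replacement.
z <- 2
partial <- sample(set1, z)
```

Indexing with sample.int() rather than calling sample(set1) directly avoids the surprise that sample(x) treats a single number x as the range 1:x.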




2- Simulate a Markov chain in R. For instance, I want to simulate  
the simple
random walk problem, in which a person can walk randomly around 4  
places. I
know how to set up the transition matrix in R. I'm stuck at what to  
do next.


Learn to search for yourself:

http://search.r-project.org/cgi-bin/namazu.cgi?query=%22transition+matrix%22+markov&max=100&result=normal&sort=score&idxname=Rhelp08&idxname=Rhelp02
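
As a starting point, once the transition matrix exists, a chain can be simulated by repeatedly sampling the next state from the row of the current state. The sketch below is one minimal way to do that in base R. The matrix `P` (four places on a circle, stepping to either neighbour with probability 0.5), the function name `simulate_chain()`, and the seed are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Hypothetical transition matrix: 4 places arranged in a circle,
# moving to either neighbour with probability 0.5. Rows sum to 1.
P <- matrix(c(0.0, 0.5, 0.0, 0.5,
              0.5, 0.0, 0.5, 0.0,
              0.0, 0.5, 0.0, 0.5,
              0.5, 0.0, 0.5, 0.0),
            nrow = 4, byrow = TRUE)

# Simulate n_steps transitions, starting from state `start`.
simulate_chain <- function(P, n_steps, start = 1) {
  states <- integer(n_steps + 1)
  states[1] <- start
  for (t in seq_len(n_steps)) {
    # Row states[t] of P holds the probabilities of the next state.
    states[t + 1] <- sample.int(nrow(P), 1, prob = P[states[t], ])
  }
  states
}

path <- simulate_chain(P, 20)
path
table(path)   # how often each place was visited
```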



I'm grateful if someone can give me hint or a pointer.

Thanks.

Martin
--
View this message in context: 
http://old.nabble.com/how-to-permute%2C-simulate-Markov-chain-tp26363411p26363411.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] How to get the string '\'?

2009-11-15 Thread Peng Yu
I can not get the string '\'. Could somebody let me know how to get it?

> print('\')
+
+
> print('\\')
[1] "\\"
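
A short sketch of what is going on here, based on standard R string escaping rather than on any reply in this thread: the two backslashes in the source text denote one character, and print() merely displays it in escaped form.

```r
x <- "\\"          # two backslashes in the source: the first escapes the second

nchar(x)           # the string itself holds a single character
print(x)           # print() shows the escaped form
cat(x, "\n")       # cat() writes the raw character
writeLines(x)      # so does writeLines()
```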



Re: [R] Finding Largest(or smallest) values

2009-11-15 Thread Joe King
There are probably better ways, but can't you subset each parameter? So create
new variables for parameter 1, 2, ... and look at the summary data for those,
which will include a min and max for all variables.
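
One way to do that subsetting in a single step is to split the table by the number of parameters P and take the row with the smallest PRESS in each piece. The sketch below uses six rows copied from the table in the question; the data frame name `models` and the `model` column (the original row number) are my own labels.

```r
# A few rows copied from the table in the question (columns trimmed).
models <- data.frame(
  model = c(2, 3, 4, 10, 11, 12),
  P     = c(1, 1, 1, 2, 2, 2),
  AIC   = c(-0.36362381, -1.69172730, 2.02713507,
            -4.99889125, 0.41424802, -0.06156976),
  PRESS = c(93.70900, 92.46010, 96.11336,
            89.49671, 94.64288, 94.04593)
)

# For each number of parameters P, keep the row with the smallest PRESS.
best_press <- do.call(rbind, lapply(split(models, models$P), function(d) {
  d[which.min(d$PRESS), ]
}))

# The same pattern works for AIC (smaller is better there as well).
best_aic <- do.call(rbind, lapply(split(models, models$P), function(d) {
  d[which.min(d$AIC), ]
}))

best_press
best_aic
```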

Joe King
206-913-2912
j...@joepking.com
Never throughout history has a man who lived a life of ease left a name
worth remembering. --Theodore Roosevelt

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of alphaace
Sent: Sunday, November 15, 2009 8:54 AM
To: r-help@r-project.org
Subject: [R] Finding Largest(or smallest) values


Hi,

I am trying to find a model of best fit, with 8 parameters. As such, I have
created a table with 2^8=256 rows with every variable either in or out of
the model (denoted by 1 or 0), and for each row I have computed the Adjusted
R^2, AIC, CP and PRESS. I know I can use the leaps package to find the best
model (for every number of parameters n=1...8) for the Adjusted R2 and CP,
but not for AIC and PRESS. I was wondering if anyone has any code to, say,
find the minimum PRESS when the model has 1 parameter, 2 parameters, 3
parameters, etc...

I have attached a copy of my table below for reference. Thank you for your
help!

  DF
    P X1 X2 X3 X4 X5 X6 X7 X8   Adjusted R2         AIC        CP     PRESS
1   0  0  0  0  0  0  0  0  0  0.00  0.99464282 14.367679  95.01075
2   1  1  0  0  0  0  0  0  0  0.0246074998 -0.36362381 12.634643  93.70900
3   1  0  1  0  0  0  0  0  0  0.0382916889 -1.69172730 11.194739  92.46010
4   1  0  0  1  0  0  0  0  0 -0.0005184148  2.02713507 15.278491  96.11336
5   1  0  0  0  1  0  0  0  0 -0.002134  2.24957370 15.527913  96.38092
6   1  0  0  0  0  1  0  0  0  0.0272855281 -0.62206426 12.352850  92.64962
7   1  0  0  0  0  0  1  0  0  0.0111851503  0.92108831 14.046995  94.47172
8   1  0  0  0  0  0  0  1  0  0.0020990063  1.78090275 15.003075  94.60974
9   1  0  0  0  0  0  0  0  1 -0.0108209410  2.99012118 16.362563  97.09546
10  2  1  1  0  0  0  0  0  0  0.0810965349 -4.99889125  7.639659  89.49671
11  2  1  0  1  0  0  0  0  0  0.0266267059  0.41424802 13.308890  94.64288
12  2  0  1  1  0  0  0  0  0  0.0315413664 -0.06156976 12.797371  94.04593
13  2  1  0  0  1  0  0  0  0  0.0145736386  1.57108162 14.563375  95.37943
14  2  0  1  0  1  0  0  0  0  0.0479608482 -1.66893313 11.088428  92.39152
15  2  0  0  1  1  0  0  0  0  0.0005123888  2.90290718 16.026873  97.24004
16  2  1  0  0  0  1  0  0  0  0.0541222887 -2.27926275 10.447144  91.16842
17  2  0  1  0  0  1  0  0  0  0.0754976703 -4.42788850  8.222390  88.96712
18  2  0  0  1  0  1  0  0  0  0.0241535771  0.65277858 13.566293  93.96203
19  2  0  0  0  1  1  0  0  0  0.0275121981  0.32869589 13.216727  93.77620
20  2  1  0  0  0  0  1  0  0  0.0245589269  0.61372449 13.524104  94.06972
21  2  0  1  0  0  0  1  0  0  0.0622492592 -3.09039933  9.601287  90.89229
22  2  0  0  1  0  0  1  0  0  0.0085444331  2.14445635 15.190896  95.73424
23  2  0  0  0  1  0  1  0  0  0.0089907346  2.10213293 15.15  95.79122
24  2  0  0  0  0  1  1  0  0  0.0407860378 -0.96318114 11.835183  91.76667
25  2  1  0  0  0  0  0  1  0  0.0244877014  0.62058800 13.531518  93.36514
26  2  0  1  0  0  0  0  1  0  0.0407457114 -0.95922936 11.839381  92.16665
27  2  0  0  1  0  0  0  1  0 -0.0047937805  3.40062281 16.579140  96.5
28  2  0  0  0  1  0  0  1  0  0.0020858488  2.75480952 15.863107  95.82804
29  2  0  0  0  0  1  0  1  0  0.0248637496  0.58434515 13.492378  92.73311
30  2  0  0  0  0  0  1  1  0  0.0171446547  1.32551143 14.295783  93.74595
31  2  1  0  0  0  0  0  0  1  0.0138889013  1.63637615 14.634643  95.78017
32  2  0  1  0  0  0  0  0  1  0.0277245197  0.30817080 13.194629  94.26947
33  2  0  0  1  0  0  0  0  1 -0.0115063328  4.02650409 17.277784  98.26103
34  2  0  0  0  1  0  0  0  1 -0.0130170902  4.16679510 17.435024  98.57288
35  2  0  0  0  0  1  0  0  1  0.0193924225  1.11028937 14.061835  94.52438
36  2  0  0  0  0  0  1  0  1  0.0004693207  2.90695757 16.031355  96.66280
37  2  0  0  0  0  0  0  1  1 -0.0087026647  3.76559547 16.985978  96.73581
38  3  1  1  1  0  0  0  0  0  0.0754624308 -3.46299014  9.168628  90.98724
39  3  1  1  0  1  0  0  0  0  0.0761667614 -3.53462845  9.096127  90.47049
40  3  1  0  1  1  0  0  0  0  0.0179377080  2.21094883 15.090020  96.18373
41  3  0  1  1  1  0  0  0  0  0.0443170822 -0.34853563 12.374620  93.90532
42  3  1  1  0  0  1  0  0  0  0.1233192905 -8.45916839  4.242412  85.48181
43  3  1  0  1  0  1  0  0  0  0.0532048878 -1.22682152 11.459741  92.37654
44  3  0  1  1  0  1  0  0  0  0.0668617200 -2.59257716 10.053955  90.62446
45  3  1  0  0  1  1  0  0  0  0.0452547720 -0.44081112 12.278098  92.71992
46  3  0  1  0  1  1  0  0  0  0.0924857470 -5.20992520  7.416308  88.22170
47  3  0  0  1  1  1  0  0  0  0.0281772097  1.22570975 14.036002  94.88753
48  3  1  1  0  0  0  1  0  0  0.0890014298 -4.84971194  7.774972  89.34392
49  3  1  0  1  0  0  1  0  0  

Re: [R] How to get the string '\'?

2009-11-15 Thread David Winsemius

?cat

> cat("\\")
\

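A short sketch of the difference between what is stored and what print() shows (the variable name s is only for illustration):

```r
s <- "\\"      # typed with two backslashes, stored as ONE character
nchar(s)       # 1
print(s)       # print() shows the escaped form: [1] "\\"
cat(s, "\n")   # cat() writes the raw character: \
```

So print('\\') already produces the string '\'; only its printed representation is doubled.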



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] Basic Question about local

2009-11-15 Thread Saptarshi Guha
I have some beginner's questions regarding local, in the docs, it says that
local evaluates an expression in a local environment.

Q1: why is B different from A? In B, is a <- a+1 getting evaluated
before eval proceeds?

#A
a=0
eval(quote(a <- a+1),new.env())
a # 0

#B
a=0
eval(a <- a+1,new.env())
a # 1

Q2: Why does mlocal behave differently?

#C
local
#function (expr, envir = new.env())
#eval.parent(substitute(eval(quote(expr), envir)))
#<environment: namespace:base>

a=0
local(a <- a+1)
a #0



mlocal <- function (expr, envir = new.env())
  eval(quote(expr), envir)

a=0
mlocal(a <- a+1)
a #1
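
A sketch of what seems to be going on, based on how R evaluates function arguments. In B, the argument is an ordinary promise, so a <- a+1 runs in the calling frame before eval() sees anything but its value. In mlocal, quote(expr) is just the symbol expr; looking it up forces the promise, again in the caller's frame. The variant mlocal2 below is hypothetical (not from the post) and uses substitute() to recover the unevaluated call instead:

```r
a <- 0
# A: quote() delays evaluation, so the assignment happens in the new environment
eval(quote(a <- a + 1), new.env())
a_after_A <- a        # 0

# B: the argument is evaluated in the calling frame before eval() uses it
a <- 0
eval(a <- a + 1, new.env())
a_after_B <- a        # 1

# mlocal as posted: quote(expr) is the symbol `expr`, whose promise is
# forced in the caller's frame, so the caller's `a` is changed
mlocal <- function(expr, envir = new.env()) eval(quote(expr), envir)
a <- 0
mlocal(a <- a + 1)
a_after_mlocal <- a   # 1

# Hypothetical variant: substitute() recovers the call itself, which is
# then evaluated inside envir, leaving the caller's `a` alone
mlocal2 <- function(expr, envir = new.env()) eval(substitute(expr), envir)
a <- 0
mlocal2(a <- a + 1)
a_after_mlocal2 <- a  # 0
```

This is also why the real local() wraps everything in substitute() before calling eval.parent().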


Thank you
S



[R] pairs

2009-11-15 Thread cindy Guo
Hi, All,

I have an n by m matrix with each entry between 1 and 15000. I want to know
the frequency of each pair in 1:15000 that occur together in rows. So for
example, if the matrix is
2 5 1 6
1 7 8 2
3 7 6 2
9 8 5 7
Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return
the value 2 for this pair as well as that for all pairs. Is there a fast way
to do this avoiding loops? Loops take too long.

Thank you,

Cindy



Re: [R] How to get the string '\'?

2009-11-15 Thread Peng Yu
My question came from trying to replace a pattern with '\\'. How do I
replace '/' in a string with '\'?

string='abc/efg'
gsub('/','\\',string)



Re: [R] How to get the string '\'?

2009-11-15 Thread Linlin Yan
The regular expression replacement needs the '\' doubled again, so try this:
gsub('/','\\\\',string)



Re: [R] How to get the string '\'?

2009-11-15 Thread David Winsemius


On Nov 15, 2009, at 6:35 PM, Peng Yu wrote:


My question was from replacing a pattern by '\\'. How to replace '/'
in string by '\'?

string='abc/efg'
gsub('/','\\',string)


No, that was most definitely _not_ your posed question. If you now want
to change your question and supply a reproducible example, that's
fine, just don't claim that your mind should have been read more
properly than it was, please.


The problem with your _second_ question is that the printed
representation of "\" is a problem because of its special use as an
escape symbol. So sometimes it needs to be displayed as "\\". What
gets written to the screen may be different than the internal
representation. Look at the results of:

> string='abc/efg'
> cat(gsub('/','\\\\',string), file="test.txt")

You should see:
abc\efg

...although at the screen you would see:

> string='abc/efg'
> gsub('/','\\\\',string)
[1] "abc\\efg"

The R parser reads '\\\\' as a string of two backslashes: the first "\"
escapes the second and the third "\" escapes the fourth. gsub() then
treats "\" in the replacement as an escape character as well, so those
two backslashes produce a single real "\" in the result.
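
A small check of the two escaping levels, as a sketch. The fixed = TRUE variant is an alternative not mentioned in this thread; with fixed = TRUE the replacement is taken literally, so one level of doubling is enough.

```r
string <- "abc/efg"

# Four typed backslashes -> two in the string -> one in the result
out <- gsub("/", "\\\\", string)
nchar(out)        # 7: the same length as "abc/efg"
cat(out, "\n")    # abc\efg

# With fixed = TRUE neither pattern nor replacement is interpreted
out_fixed <- gsub("/", "\\", string, fixed = TRUE)
identical(out, out_fixed)
```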



--
David.



David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] pairs

2009-11-15 Thread cls59



cindy Guo wrote:
 
 Hi, All,
 
 I have an n by m matrix with each entry between 1 and 15000. I want to
 know
 the frequency of each pair in 1:15000 that occur together in rows. So for
 example, if the matrix is
 2 5 1 6
 1 7 8 2
 3 7 6 2
 9 8 5 7
 Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return
 the value 2 for this pair as well as that for all pairs. Is there a fast
 way
 to do this avoiding loops? Loops take too long.
 
 Thank you,
 
 Cindy
 

Use %in% to check for the presence of the numbers in a row and apply() to
efficiently execute the test for each row:

  tstMatrix <- matrix( c(2,5,1,6,
                         1,7,8,2,
                         3,7,6,2,
                         9,8,5,7), nrow=4, byrow=T )

  matches <- apply( tstMatrix, 1, function( row ){
    if( 2 %in% row && 6 %in% row ){
      return( 2 )
    } else {
      return( 0 )
    }
  })

  matches
  [1] 2 0 2 0

If you have more than one pair, it gets a little tricky.  Say you are also
looking for the pair (7,8).  Store them as a list:

  pairList <- list( c(2,6), c(7,8) )

Then use sapply() to efficiently iterate over the pair list and execute the
apply() test:

  matchMatrix <- sapply( pairList, function( pair ){
    matches <- apply( tstMatrix, 1, function( row ){
      if( pair[1] %in% row && pair[2] %in% row ){
        return( pair[1] )
      } else {
        return( 0 )
      }
    })
    return( matches )
  })

  matchMatrix

       [,1] [,2]
  [1,]    2    0
  [2,]    0    7
  [3,]    2    0
  [4,]    0    7



If you're looking to apply the above method to every possible ordered pair of
numbers from the range 1:15000... that's 15000^2 = 225,000,000 pairs (about
half that many if order is ignored). expand.grid() can generate the required
pair list, but that step alone causes a memory allocation of ~6 GB on my
machine.

If you don't have a pile of CPU cores and RAM at your disposal, you can
probably:

  1. Restrict the upper end of your range to the maximal entry present in
your matrix since all other combinations have zero occurrences.

  2. Break the list of pairs up into several sublists, run the tests, and
aggregate the results.

Either way, the analysis will take some time despite the efficiencies of the
apply family of functions, due to the sheer size of the problem.  If you have
more than one CPU, I would recommend taking a look at parallelized apply
functions, perhaps using a package like snowfall, as the testing of the
pairs is an embarrassingly parallel problem.

Hopefully I'm misunderstanding the scope of your problem.


Good luck!

-Charlie

-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://old.nabble.com/pairs-tp26364801p26365206.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] pairs

2009-11-15 Thread Linlin Yan
Hope this helps:

> m <- matrix(c(2,1,3,9,5,7,7,8,1,8,6,5,6,2,2,7),4,4)
> p <- c(2, 6)

> apply(m == p[1], 1, any) & apply(m == p[2], 1, any)
[1]  TRUE FALSE  TRUE FALSE

If you want the number of rows which contain the pair, sum() could be used:

> sum(apply(m == p[1], 1, any) & apply(m == p[2], 1, any))
[1] 2



Re: [R] How to get the string '\'?

2009-11-15 Thread Peng Yu
On Sun, Nov 15, 2009 at 6:05 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Nov 15, 2009, at 6:35 PM, Peng Yu wrote:

 My question was from replacing a pattern by '\\'. How to replace '/'
 in string by '\'?

 string='abc/efg'
 gsub('/','\\',string)

 No, that was most definitely _not_ your posed question. If you want now to
 change your question and supply a reproducible example, that's fine, just
 don't claim that your mind should have been read more properly than it was,
 please.

Sorry for the misunderstanding. I realized that the answer to the
first question could not solve my original question (but I thought it
could). So I stated my original question.



[R] Normalization of Data

2009-11-15 Thread Abhishek Pratap
Hi All

I am looking for some resources for learning data normalization. I understand
this is a very broad request; I need something like a primer to give me a jump
start. If you happen to know of any good resource, please do let me know.

Cheers,
-Abhi



Re: [R] pairs

2009-11-15 Thread cindy Guo
Hi, Charlie,

Thank you for the reply. Maybe I don't need the frequency of every pair. I
only need the top, say 50, pairs with the highest frequency. Is there any way
to avoid calculating the frequencies for all the pairs?

Thanks,

Cindy
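
One possible way to limit the work, sketched on the 4 x 4 example from the original post: count only the pairs that actually occur in some row, rather than testing every candidate pair from 1:15000, and then keep the most frequent ones. The helper name row_pairs is made up for illustration, and the sketch assumes every row holds at least two distinct values.

```r
m <- matrix(c(2, 5, 1, 6,
              1, 7, 8, 2,
              3, 7, 6, 2,
              9, 8, 5, 7), nrow = 4, byrow = TRUE)

# List each unordered pair of a row once. Sorting puts the smaller value
# first, so (2,6) and (6,2) both get the label "2.6".
# Assumes the row has at least two distinct values.
row_pairs <- function(row) {
  cmb <- combn(sort(unique(row)), 2)
  paste(cmb[1, ], cmb[2, ], sep = ".")
}

all_pairs   <- unlist(lapply(seq_len(nrow(m)), function(i) row_pairs(m[i, ])))
pair_counts <- table(all_pairs)

# Most frequent pairs first; 50 is the cut-off asked about above
top_pairs <- head(sort(pair_counts, decreasing = TRUE), 50)
top_pairs
```

For an n by 20 matrix this generates choose(20, 2) = 190 labels per row, so the cost grows with the number of rows and not with the 15000 possible values.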


Re: [R] pairs

2009-11-15 Thread David Winsemius
I could of course be wrong but have you yet specified the number of  
columns for this pairing exercise?


On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:


Hi, All,

I have an n by m matrix with each entry between 1 and 15000. I want  
to know
the frequency of each pair in 1:15000 that occur together in rows.  
So for

example, if the matrix is
2 5 1 6
1 7 8 2
3 7 6 2
9 8 5 7
Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to  
return
the value 2 for this pair as well as that for all pairs. Is there a  
fast way

to do this avoiding loops? Loops take too long.

and provide commented, minimal, self-contained, reproducible code.

   ^^

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



[R] How to name a tag in a list or a data.frame from a string?

2009-11-15 Thread Peng Yu
Suppose I have a string variable

string='some_string'

Now I want to have a list, where tag is the same as the string in
the variable string. I'm wondering if this is possible in R.

list(tag=1:3)
data.frame(tag=1:3)



Re: [R] pairs

2009-11-15 Thread David Winsemius

Assuming that the number of columns is 4, then consider this approach:

> prs <- scan()
1: 2 5 1 6
5: 1 7 8 2
9: 3 7 6 2
13: 9 8 5 7
17:
Read 16 items
> prmtx <- matrix(prs, 4,4, byrow=T)

#Now make copies of x.y and y.x

> pair.str <- sapply(1:nrow(prmtx), function(z) c(apply(combn(prmtx[z,],
2), 2,function(x) paste(x[1],x[2], sep=".")) , apply(combn(prmtx[z,],
2), 2,function(x) paste(x[2],x[1], sep="."))) )

> tpair <- table(pair.str)

# This then gives you a duplicated list
> tpair[tpair>1]
pair.str
1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
  2   2   2   2   2   2   2   2

# So only take the first half of the pairs:
> head(tpair[tpair>1], sum(tpair>1)/2)

pair.str
1.2 2.1 2.6 2.7
  2   2   2   2
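
Note that the first half of the table above still holds both 1.2 and 2.1 and leaves out 7.8, because the labels are sorted as text. A hedged alternative for that last step is to keep only the labels whose first number is the smaller one. The names dup, parts and keep are made up for this sketch; the rest repeats the code above so that it runs on its own.

```r
prmtx <- matrix(c(2, 5, 1, 6,
                  1, 7, 8, 2,
                  3, 7, 6, 2,
                  9, 8, 5, 7), 4, 4, byrow = TRUE)

# Same construction as above: every pair in both orders, x.y and y.x
pair.str <- sapply(1:nrow(prmtx), function(z) {
  cmb <- combn(prmtx[z, ], 2)
  c(paste(cmb[1, ], cmb[2, ], sep = "."),
    paste(cmb[2, ], cmb[1, ], sep = "."))
})
tpair <- table(pair.str)
dup   <- tpair[tpair > 1]

# Split each label at the dot and keep the copy with the smaller number first
parts <- do.call(rbind, strsplit(names(dup), ".", fixed = TRUE))
keep  <- as.numeric(parts[, 1]) < as.numeric(parts[, 2])
dup[keep]   # pairs 1.2, 2.6, 2.7 and 7.8, each with count 2
```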

--
David.




David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] How to name a tag in a list or a data.frame from a string?

2009-11-15 Thread Duncan Murdoch

On 15/11/2009 8:15 PM, Peng Yu wrote:

Suppose I have a string variable

string='some_string'

Now I want to have a list, where tag is the same as the string in
the variable string. I'm wondering if this is possible in R.

list(tag=1:3)
data.frame(tag=1:3)


The most straightforward way is

x <- list(1:3)
names(x) <- string

y <- data.frame(dummy=1:3)
names(y) <- string

You can also build expressions and parse and evaluate them, but the 
lines above are the easiest way.
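
Two other spellings of the same idea, as a sketch using only base R (the object names x2, y2 and z are made up for illustration): setNames() attaches the name in one call, and [[<- accepts a character variable where $ and tag= do not.

```r
string <- "some_string"

# setNames() returns its first argument with the names set
x2 <- setNames(list(1:3), string)
y2 <- setNames(data.frame(1:3), string)

# Assigning with [[ ]] uses the value of `string` as the tag
z <- list()
z[[string]] <- 1:3

names(x2)   # "some_string"
names(y2)   # "some_string"
```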


Duncan Murdoch



[R] How to make all elements all elements lower-cap ?

2009-11-15 Thread RON70

I have a vector of letters like c("a", "B", "c"). Is there any R function to
force all elements to lower case?

Thanks,
-- 
View this message in context: 
http://old.nabble.com/How-to-make-all-elements-all-elements-lower-cap---tp26365794p26365794.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to make all elements all elements lower-cap ?

2009-11-15 Thread jim holtman
try this:

> x <- c("a", "B", "c")
> ?tolower
> tolower(x)
[1] "a" "b" "c"
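
A small follow-up sketch: tolower() returns a new vector and leaves x unchanged, so the result has to be assigned; toupper() is the counterpart. The names lower and upper are only for illustration.

```r
x <- c("a", "B", "c")

lower <- tolower(x)   # "a" "b" "c"
upper <- toupper(x)   # "A" "B" "C"
x                     # still "a" "B" "c"
```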







-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] How to make all elements all elements lower-cap ?

2009-11-15 Thread Jorge Ivan Velez
Hi Ron,

Yes. Take a look at

?tolower

HTH,
Jorge




[R] How to generate dependency file that can be used by gnu make?

2009-11-15 Thread Peng Yu
gcc has options like -MM, which can generate dependency files for a C/C++
file that can be used by GNU make. I'm wondering if there is a tool that
can generate a dependency file for an R script.

For example, I have an R script test.R

#test.R
load('input.RData')
save.image('output.RData')


I want to generate a dependence file like the following. Is there a
tool to do so?

output.RData:test.R input.RData
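
No standard tool is named in this thread, so the following is only a rough sketch of how such a rule could be produced by scanning the script text. The function name make_rule is hypothetical, and the sketch only recognises load() and save.image() calls that have a literal quoted file name as their first argument. Anything computed at run time, or commented out, is not handled.

```r
# Hypothetical helper: scan an R script for load("...") and save.image("...")
# calls and build a make rule of the form  outputs: script inputs
make_rule <- function(script) {
  src <- readLines(script, warn = FALSE)
  grab <- function(fun) {
    # function name, "(", optional space, then a quoted file name
    pat  <- paste0("\\b", fun, "\\(\\s*['\"]([^'\"]+)['\"]")
    hits <- regmatches(src, regexpr(pat, src, perl = TRUE))
    sub(pat, "\\1", hits, perl = TRUE)
  }
  inputs  <- grab("load")
  outputs <- grab("save\\.image")
  paste0(paste(outputs, collapse = " "), ": ",
         paste(c(script, inputs), collapse = " "))
}

# Usage on a copy of the test.R example, written to a temporary file
tmp <- file.path(tempdir(), "test.R")
writeLines(c("#test.R",
             "load('input.RData')",
             "save.image('output.RData')"), tmp)
rule <- make_rule(tmp)
cat(rule, "\n")
```

The printed rule has the target, the script path and the input file in the same order as the example above, and could be written to a file that the Makefile includes.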



Re: [R] wilcox.test loop through variable names

2009-11-15 Thread Jacob Wegelin
On Sun, 15 Nov 2009 14:33 -0500, Jacob Wegelin
jacobwege...@fastmail.fm wrote:
 
 Often I perform the same task on a series of variables in a dataframe,
 by looping through a character vector that holds the names and using
 paste(), eval(), and parse() inside the loop.
 
 For instance:
 
 rm(environmental)
 thesevars <- names(environmental)
 environmental$ToyReal <- rnorm(nrow(environmental))
 environmental$ToyDichot <- environmental$ToyReal < 0.53
 
 tableOfResults <- data.frame(var=thesevars)
 
 tableOfResults$p_wilcox <- NA
 
 tableOfResults$Beta_lm <- NA
 
 rownames(tableOfResults) <- thesevars
 
 for( thisvar in thesevars) {
   thiscommand <- paste("thiswilcox <- wilcox.test (", thisvar, " ~ ToyDichot , data=environmental)")
   eval(parse(text=thiscommand))
   tableOfResults[thisvar, "p_wilcox"] <- thiswilcox$p.value
   thislm <- lm( environmental[ c( "ToyReal", thisvar )])
   tableOfResults[thisvar, "Beta_lm"] <- coef(thislm)[thisvar]
 }
 
 print(tableOfResults)
 
 Of course, the loop above is a toy example. In real life I might first
 figure out whether the variable is
 continuous, dichotomous, or categorical taking on several values, then
 perform an operation depending on
 its type.
 
 The use of paste(), eval(), and parse() seems awkward.  As Gabor
 Grothendieck showed
 (http://tolstoy.newcastle.edu.au/R/e8/help/09/11/4520.html), if we
 are calling a regression function such as lm() we can avoid using
 paste(), as shown above.
 
 But is there a way to avoid paste() and eval() when one uses t.test()
 or wilcox.test()?

Here is a solution:

rm(environmental)
thesevars <- names(environmental)
environmental$ToyReal <- rnorm(nrow(environmental))
environmental$ToyDichot <- environmental$ToyReal < 0.53

ThisList <-
lapply( environmental[thesevars], function( OneVar ) {
   c(
  p_wilcox= wilcox.test( OneVar ~ environmental$ToyDichot )$p.value
 ,
  Beta_lm = as.numeric(coef(lm( environmental$ToyReal ~ OneVar ))["OneVar"])
   )
   }
)

do.call(rbind, ThisList)
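
Another way to avoid paste(), eval() and parse() is to build the formula from the variable name with reformulate() from the stats package. The sketch below is not from the thread: it uses the built-in mtcars data in place of the environmental data, with am playing the role of the dichotomous variable and qsec the role of ToyReal.

```r
vars <- c("mpg", "hp", "wt")

results <- t(sapply(vars, function(v) {
  f_wilcox <- reformulate("am", response = v)     # e.g. mpg ~ am
  f_lm     <- reformulate(v, response = "qsec")   # e.g. qsec ~ mpg
  c(
    # exact = FALSE uses the normal approximation, since mtcars has ties
    p_wilcox = wilcox.test(f_wilcox, data = mtcars, exact = FALSE)$p.value,
    Beta_lm  = unname(coef(lm(f_lm, data = mtcars))[v])
  )
}))

results   # one row per variable, columns p_wilcox and Beta_lm
```

Because the formula carries the variable name, the coefficient can be picked out with [v], much as the original loop did with coef(thislm)[thisvar].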

Jacob A. Wegelin 
Department of Biostatistics 
Virginia Commonwealth University 
Richmond VA 23298-0032 
U.S.A.



Re: [R] pairs

2009-11-15 Thread cindy Guo
Hi, David,

The matrix has 20 columns.
Thank you very much for your help. I think it's right, but it seems I need
some time to figure it out. I am a beginner, and there are so many functions
here that I have never used before. :)

Cindy

On Sun, Nov 15, 2009 at 5:19 PM, David Winsemius dwinsem...@comcast.net wrote:

 Assuming that the number of columns is 4, then consider this approach:

 prs <- scan()
 1: 2 5 1 6
 5: 1 7 8 2
 9: 3 7 6 2
 13: 9 8 5 7
 17:
 Read 16 items
 prmtx <- matrix(prs, 4, 4, byrow=TRUE)

 # Now make copies of x.y and y.x

 pair.str <- sapply(1:nrow(prmtx), function(z) c(apply(combn(prmtx[z,], 2),
 2, function(x) paste(x[1], x[2], sep=".")), apply(combn(prmtx[z,], 2),
 2, function(x) paste(x[2], x[1], sep="."))) )
 tpair <- table(pair.str)

 # This then gives you a duplicated list
 tpair[tpair>1]
 pair.str
 1.2 2.1 2.6 2.7 6.2 7.2 7.8 8.7
   2   2   2   2   2   2   2   2

 # So only take the first half of the pairs:
 head(tpair[tpair>1], sum(tpair>1)/2)

 pair.str
 1.2 2.1 2.6 2.7
   2   2   2   2
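
Note that the first half of the table above keeps both 1.2 and 2.1, which are the same unordered pair, and drops 7.8. A possible alternative, offered as a sketch rather than as the method proposed in this thread, is to sort each row first so that every unordered pair has a single label. The names pair_keys, pair_counts and repeated_pairs are introduced here for illustration, and the sketch assumes every row holds at least two distinct values.

```r
# Example matrix from the original question (4 rows, 4 columns).
prmtx <- matrix(c(2, 5, 1, 6,
                  1, 7, 8, 2,
                  3, 7, 6, 2,
                  9, 8, 5, 7), nrow = 4, byrow = TRUE)

# Sort the distinct entries of each row so every unordered pair is written
# smaller value first, then label all 2-element combinations of that row.
pair_keys <- unlist(lapply(seq_len(nrow(prmtx)), function(i) {
  combn(sort(unique(prmtx[i, ])), 2, FUN = paste, collapse = ".")
}))

# Count in how many rows each pair occurs and keep those seen more than once.
pair_counts <- table(pair_keys)
repeated_pairs <- pair_counts[pair_counts > 1]
print(repeated_pairs)
```

For the example matrix this leaves the four pairs 1.2, 2.6, 2.7 and 7.8, each with a count of 2. It still enumerates all pairs per row, so for many columns the table can become large.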

 --
 David.



 On Nov 15, 2009, at 8:06 PM, David Winsemius wrote:

   I could of course be wrong but have you yet specified the number of
 columns for this pairing exercise?

 On Nov 15, 2009, at 5:26 PM, cindy Guo wrote:

 Hi, All,

 I have an n by m matrix with each entry between 1 and 15000. I want to know
 the frequency of each pair in 1:15000 that occur together in rows. So for
 example, if the matrix is
 2 5 1 6
 1 7 8 2
 3 7 6 2
 9 8 5 7
 Pair (2,6) (un-ordered) occurs together in rows 1 and 3. I want to return
 the value 2 for this pair as well as that for all pairs. Is there a fast way
 to do this avoiding loops? Loops take too long.

 and provide commented, minimal, self-contained, reproducible code.

  ^^

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT





Re: [R] Normalization of Data

2009-11-15 Thread Kenneth Roy Cabrera Torres
What do you mean by normalization?

Maybe you are looking for the scale() function in R.
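
As a minimal sketch of what scale() does (whether this is the kind of normalization needed depends on the data): it centres each column on its mean and divides by its standard deviation. The data frame experiments and its columns are made-up values for illustration only.

```r
# Hypothetical measurements from two experiments on different scales.
set.seed(42)
experiments <- data.frame(exp1 = rnorm(10, mean = 50,  sd = 5),
                          exp2 = rnorm(10, mean = 200, sd = 30))

# Centre each column on its mean and divide by its standard deviation.
z <- scale(experiments)

print(round(colMeans(z), 10))  # column means after scaling
print(apply(z, 2, sd))         # column standard deviations after scaling
```

After scaling, both columns have mean 0 and standard deviation 1, so they can be compared on a common scale.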


On Sun, 15-11-2009 at 16:29 -0800, Abhishek Pratap wrote:
 Hi All
 
 I am looking for some resource to learn data normalization. I understand
 this is a very broad request; I need something like a primer to give me a jump
 start. If you happen to know of any good resources, please let me know.
 
 Cheers,
 -Abhi
 



[R] How to use SQL code in R

2009-11-15 Thread sdlywjl666
Dear All,
  How can I use SQL code in R?
Thanks!


Re: [R] Simple if else statement problem

2009-11-15 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org wrote on 13.11.2009 18:54:05:

 
 Ok Jim it worked, thank you! It's funny because it worked with the first
 syntax in some cases...

You can use another approach in this case:

P <- max(c(P1,P2))
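
As a sketch of why the error quoted below occurs: at the top level R evaluates the if block as soon as its closing brace ends a line, so an else that starts a new line has nothing to attach to. Keeping the closing brace and else on the same line avoids this. Inside a function body or any enclosing braces the original layout does parse, which may be why it seemed to work in some cases. P1 and P2 below are made-up values.

```r
# Made-up values for illustration.
P1 <- 3
P2 <- 7

# "} else {" on one line keeps the else attached to the if.
if (P2 > P1) {
  P <- P2
} else {
  P <- P1
}

# The same result in a single call.
P_max <- max(c(P1, P2))
print(c(P = P, P_max = P_max))
```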

Regards
Petr


 
 
 anna_l wrote:
  
  Hello, I am getting an error with the following code:
  if( P2 > P1)
  + {
  + P <- P2
  + }
  else
  Error: unexpected 'else' in "else"
  {
  + P <- P1
  + }
  
  I checked the syntax so I don't understand, I have other if else
  statements with the same syntax working. Thanks in advance
  
 
 -- 
 View this message in context:
 http://old.nabble.com/Simple-if-else-statement-problem-tp26340336p26340642.html
 Sent from the R help mailing list archive at Nabble.com.
 



Re: [R] Normalization of Data

2009-11-15 Thread Abhishek Pratap
Hey

Sorry if it was not very clear; I assumed a few things. Normalization is
actually subjective. I am basically comparing data from two similar
experiments and would like to normalize for the systematic errors which
might affect the actual result and inference.

What I am looking for is some examples where people have normalized data and
some standard methods of doing so.

Thanks,
-Abhi

On Sun, Nov 15, 2009 at 8:46 PM, Kenneth Roy Cabrera Torres 
krcab...@une.net.co wrote:

 What do you mean by normalization?

 Maybe you are looking for the scale() function in R.


 On Sun, 15-11-2009 at 16:29 -0800, Abhishek Pratap wrote:
  Hi All
 
  I am looking for some resource to learn data normalization. I understand
  this is a very broad request; I need something like a primer to give me a
  jump start. If you happen to know of any good resources, please let me know.
 
  Cheers,
  -Abhi
 





[R] ARMAX model fitting with arima

2009-11-15 Thread Rafael Laboissiere
I am trying to understand how to fit an ARMAX model with the arima
function from the stats package.  I tried the simple data below, where
the time series (vector x) is generated by filtering a step function
(vector u, the exogenous signal) through a lowpass filter with AR
coefficient equal to 0.8.  The input gain is 0.3, and normal white noise
with standard deviation 0.01 is added to the output:

x <- u <- c (rep (0, 50), rep (1, 50))
x [1] <- 0
set.seed (0)
for (i in 2 : length (x)) {
    x [i] <- 0.3 * u [i] + 0.8 * x [i - 1] + 0.01 * rnorm (1)
}

Then, I fit the model:

arima (x, c (1, 0, 0), xreg = u, include.mean = FALSE, method = "ML")
Coefficients:
         ar1       u
      0.9988  0.2995
   
Why don't I get ar1 close to 0.8?  If I use lm to regress the data, it works:

lm (x [2 : length (x)] ~ x [1 : (length (x) - 1)] + u [2 : length (u)] - 1)
Coefficients:
x[1:(length(x) - 1)]        u[2:length(u)]
              0.7989                0.3015
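
One possible reading of the discrepancy, offered as a sketch rather than a definitive answer: the documentation for arima() describes a model with xreg as a linear regression on xreg whose error term follows the ARMA model. That is not the same as the recursion used to simulate x above, where u enters inside the autoregression. The lm() call fits that recursion directly. The code below rebuilds the simulation and the lagged regression with an explicit data frame; the names n, lagged and fit are introduced here for illustration.

```r
# Simulate x[t] = 0.3 * u[t] + 0.8 * x[t-1] + noise, as in the post.
set.seed(0)
u <- c(rep(0, 50), rep(1, 50))
n <- length(u)
x <- numeric(n)
for (i in 2:n) {
  x[i] <- 0.3 * u[i] + 0.8 * x[i - 1] + 0.01 * rnorm(1)
}

# Line up each x[t] with its own lag x[t-1] and the concurrent input u[t].
lagged <- data.frame(x_now = x[-1], x_prev = x[-n], u_now = u[-1])

# Regression without an intercept on the lagged output and the input.
fit <- lm(x_now ~ x_prev + u_now - 1, data = lagged)
print(round(coef(fit), 3))
```

The coefficient on x_prev estimates the AR coefficient of the recursion and the coefficient on u_now estimates the input gain.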
  
Any help will be appreciated.

Best,

-- 
Rafael Laboissiere
