date:20091110

Is this what you want:

 df=data.frame(x=c(3,6,7),y=c(2,7,4))
 df
  x y
1 3 2
2 6 7
3 7 4
 as.list(df)
$x
[1] 3 6 7

$y
[1] 2 7 4


On Tue, Nov 10, 2009 at 6:05 AM, Grzes gregori...@gmail.com wrote:

 Hi,
 I have got a data frame:

 df=data.frame(x=c(3,6,7),y=c(2,7,4))

 and I would like to write my values from data frame to list using loop, for
 example:

 lista=list()
 for (i in 1: length(?)){
        lista[[?]][?] = df [?]
 }
 But I havn't got any idea what I should put in places where I put a question
 mark?

 As a rusult I would like to get a list like this:
 list:
 [[1]]
 3,6,7
 [[2]]
 2,7,4

 I wll be very happy if somebody can help me


 --
 View this message in context: 
 http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26281645.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Models for Discrete Choice in R

2009-11-10 Thread Frank E Harrell Jr


Iuri Gavronski wrote:

Frank,

I certainly can't speak for Emmanuel. I don't know his reasons.

The reason I've posted this question is the fact that (as far as I
understood), ordinal regression is based on logistic regression (or
probit), and logistic regression expects a formula like dichotomous ~
ratio1 + ratio2 + ... + ration. However, most examples I've found in
Design, MASS and VGAM test models like ordinal ~ categorical1 +
categorical2 + ... + categoricaln.

I wonder if it is just coincidence or I have just found the wrong functions.


Logistic regression includes binary, ordinal, and polytomous 
(multinomial) cases.  Binary logistic regression needs a binary 
response.  Ordinal logistic regression (usually proportional odds but 
can be other flavors such as continuation ratio model) needs an ordinal 
response.  The polr and lrm functions work this way.


Frank



Best,

Iuri.

On Mon, Nov 9, 2009 at 11:43 AM, Frank E Harrell Jr
f.harr...@vanderbilt.edu wrote:

Emmanuel Charpentier wrote:

Le dimanche 08 novembre 2009 à 19:05 -0600, Frank E Harrell Jr a écrit :

Emmanuel Charpentier wrote:

Le dimanche 08 novembre 2009 à 17:07 -0200, Iuri Gavronski a écrit :

Hi,

I would like to fit Logit models for ordered data, such as those
suggested by Greene (2003), p. 736.

Does anyone suggests any package in R for that?

look up the polr function in package MASS (and read the relevant pages
in VR4 and some quoted references...) or the slightly more
sophisticated (larger range of models) lrm function in F. Harrell's
Design (now rms) packge (but be aware that Design is a huge beast witch
carries its own computing universe, based on (strong) Harrell's view
of what a regression analysis should be : reading his book is, IMHO,
necessary to understand his choices and agree (or disgree) with them).

If you have a multilevel model (a. k. a. one random effect grouping),
the repolr packge aims at that, but I've been unable to use it
recently (numerical exceptions).


By the way, my dependent variable is ordinal and my independent
variables are ratio/intervalar.

Numeric ? Then maybe some recoding/transformation is in order ... in
which case Design/rms might or might not be useful.

I'm not clear on what recoding or transformation is needed for an ordinal
dependent variable and ratio/interval independent variables, nor why
rms/Design would not be useful.

I was thinking about transformations/recoding of the *independent*
variables...
   Emmanuel Charpentier

I realize that; still unclear.
Frank


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error: cannot allocate vector of size...

A little simple math.  You have 3M rows with 100 items on each row.
If read in this would be 300M items.  If numeric, 8 bytes/item, this
is 2.4GB.  Given that you are probably using a 32 bit version of R,
you are probably out of luck.  A rule of thumb is that your largest
object should consume at most 25% of your memory since you will
probably be making copies as part of your processing.

Given that, is you want to read in 100 variables at a time, I would
say your limit would be about 500K rows to be reasonable.  So you have
a choice; read in fewer rolls, read in all 3M rows but at 20 columns
per read, put the data in a database and extract what you need.
Unless you go to a 64-bit version of R you will probably not be able
to have the whole file in memory at one time.

On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote:

 I'm trying to import a table into R the file is about 700MB. Here's my first
 try:

 DD-read.table(01uklicsam-20070301.dat,header=TRUE)

 Error: cannot allocate vector of size 15.6 Mb
 In addition: Warning messages:
 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  Reached total allocation of 1535Mb: see help(memory.size)
 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  Reached total allocation of 1535Mb: see help(memory.size)
 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  Reached total allocation of 1535Mb: see help(memory.size)
 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
  Reached total allocation of 1535Mb: see help(memory.size)

 Then I tried

 memory.limit(size=4095)
  and got

 DD-read.table(01uklicsam-20070301.dat,header=TRUE)
 Error: cannot allocate vector of size 11.3 Mb

 but no additional errors. Then optimistically to clear up the workspace:

 rm()
 DD-read.table(01uklicsam-20070301.dat,header=TRUE)
 Error: cannot allocate vector of size 15.6 Mb

 Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb?
 I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
 memory is usually 2Gb. Surely they mean GB?

 The file I'm importing has about 3 million cases with 100 variables that I
 want to crosstabulate each with each. Is this completely unrealistic?

 Thanks!

 Maja
 --
 View this message in context: 
 http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Compile package in version R-2.10.0 (Windows XP)

2009-11-10 Thread Uwe Ligges




Paul Ruppen wrote:
Under version 2.7.0 I could without problems compile R-packages, I wrote 
myself (Windows XP).


In the relevant path I have at the beginning of the path the following 
enterings:
PATH=c:\Rtools\bin;c:\Rtools\perl\bin;c:\Rtools\MinGW\bin;C:\Programme\MiKTeX~1\ 

miktex\bin;C:\Programme\MiKTeX 2.5\miktex\bin;c:\Programme\HTML Help 
Workshop;c:

\programme\R\bin;c:\windows;c:\windows\system32;

which works for 2.7.0 - but no longer for 2.10.0
What can I do?


At least:

- update Rtools
- update MikTeX

Uwe Ligges






Paul Ruppen, Pädagogische Hochschule Wallis (Switzerland)




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R process gets killed spontaneously

2009-11-10 Thread Uwe Ligges




Peng Yu wrote:

My R process has been killed for a few times, although the system
administrator did not do so. It happened when R attempted to allocate
a lot of memory. I'm wondering whether R would spontaneously kill
itself if it can not allocate enough memory?



It does not kill itself. If it was not some out-of-memory kill process 
of your OS, then you may have seen some segfault. We need reproducible 
code and all the details described in the posting guide.


Uwe Ligges



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Numeric formatting question

2009-11-10 Thread Michael Pearmain

Hi All,

I have am using Sweave and the \Sexpr{} to place some numeric variables in
my tex document. I want to format the number prior to entry so they read
slightly more elegantly.
Say i have the following numbers
x - 0.00487324
y - 0.00432
z - 0.567

I would like to have the numbers displayed as follows

x1 - 0.0049
y1 - 0.0043
z1 - 0.57

I've seen i can use sprintf(%.3f, pi) for example to get the formating
after the decimal place, but i can't figure out an elegant way to find the
position of the first non-zero entry to allow me to substitute this value
into the sprintf command.

Can anyone offer any advise?

Thanks in advance

Mike

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Numeric formatting question

2009-11-10 Thread Dieter Menne




Michael Pearmain-2 wrote:
 
 Hi All,
 
 I have am using Sweave and the \Sexpr{} to place some numeric variables in
 my tex document. I want to format the number prior to entry so they read
 slightly more elegantly.
 Say i have the following numbers
 x - 0.00487324
 y - 0.00432
 z - 0.567
 
 I would like to have the numbers displayed as follows
 
 x1 - 0.0049
 y1 - 0.0043
 z1 - 0.57
 
 

\Sexpr{round(x,2)}

Dieter


-- 
View this message in context: 
http://old.nabble.com/Numeric-formatting-question-tp26283386p26283422.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Numeric formatting question

2009-11-10 Thread Marc Schwartz


On Nov 10, 2009, at 7:23 AM, Michael Pearmain wrote:


Hi All,

I have am using Sweave and the \Sexpr{} to place some numeric  
variables in
my tex document. I want to format the number prior to entry so they  
read

slightly more elegantly.
Say i have the following numbers
x - 0.00487324
y - 0.00432
z - 0.567

I would like to have the numbers displayed as follows

x1 - 0.0049
y1 - 0.0043
z1 - 0.57

I've seen i can use sprintf(%.3f, pi) for example to get the  
formating
after the decimal place, but i can't figure out an elegant way to  
find the
position of the first non-zero entry to allow me to substitute this  
value

into the sprintf command.

Can anyone offer any advise?

Thanks in advance

Mike



It looks like you want the numbers formatted to 2 significant digits,  
rather than to a fixed number of decimal places. Thus, use:


 format(x, digits = 2, scientific = FALSE)
[1] 0.0049

 format(y, digits = 2, scientific = FALSE)
[1] 0.0043

 format(z, digits = 2, scientific = FALSE)
[1] 0.57


See ?format and note the 'digits' and 'scientific' arguments.

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] problem with executing r-function from windows command prompt

2009-11-10 Thread venkata kirankumar

Hi all,
I am trying to execute one function from windows command prompt
and I am trying to execute queries like source(myRfile.R)  but itis
throughing runtime errors can any one help me how can i run my
 *.R file and how can i call a function in that *.R file

thanks in advance

kiran.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] write data frame in a list

2009-11-10 Thread Grzes


Thanks jholtman for advice but I need do it using a loop. (Because  my code
is a little more complicated).



jholtman wrote:
 
 Is this what you want:
 
 df=data.frame(x=c(3,6,7),y=c(2,7,4))
 df
   x y
 1 3 2
 2 6 7
 3 7 4
 as.list(df)
 $x
 [1] 3 6 7
 
 $y
 [1] 2 7 4
 
 
 On Tue, Nov 10, 2009 at 6:05 AM, Grzes gregori...@gmail.com wrote:

 Hi,
 I have got a data frame:

 df=data.frame(x=c(3,6,7),y=c(2,7,4))

 and I would like to write my values from data frame to list using loop,
 for
 example:

 lista=list()
 for (i in 1: length(?)){
        lista[[?]][?] = df [?]
 }
 But I havn't got any idea what I should put in places where I put a
 question
 mark?

 As a rusult I would like to get a list like this:
 list:
 [[1]]
 3,6,7
 [[2]]
 2,7,4

 I wll be very happy if somebody can help me


 --
 View this message in context:
 http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26281645.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem that you are trying to solve?
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26283178.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error: cannot allocate vector of size...

2009-11-10 Thread maiya


OK, it's the simple math that's confusing me :)

So you're saying 2.4GB, while windows sees the data as 700KB. Why is that
different?

And lets say I could potentially live with e.g. 1/3 of the cases - that
would make it .8GB, which should be fine? But then my question is if there
is any way to sample the rows in read.table? Or what would be the best way
of importing a random third of my cases?

Thanks!

M.



jholtman wrote:
 
 A little simple math.  You have 3M rows with 100 items on each row.
 If read in this would be 300M items.  If numeric, 8 bytes/item, this
 is 2.4GB.  Given that you are probably using a 32 bit version of R,
 you are probably out of luck.  A rule of thumb is that your largest
 object should consume at most 25% of your memory since you will
 probably be making copies as part of your processing.
 
 Given that, is you want to read in 100 variables at a time, I would
 say your limit would be about 500K rows to be reasonable.  So you have
 a choice; read in fewer rolls, read in all 3M rows but at 20 columns
 per read, put the data in a database and extract what you need.
 Unless you go to a 64-bit version of R you will probably not be able
 to have the whole file in memory at one time.
 
 On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote:

 I'm trying to import a table into R the file is about 700MB. Here's my
 first
 try:

 DD-read.table(01uklicsam-20070301.dat,header=TRUE)

 Error: cannot allocate vector of size 15.6 Mb
 In addition: Warning messages:
 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
  :
  Reached total allocation of 1535Mb: see help(memory.size)
 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
  :
  Reached total allocation of 1535Mb: see help(memory.size)
 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
  :
  Reached total allocation of 1535Mb: see help(memory.size)
 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
  :
  Reached total allocation of 1535Mb: see help(memory.size)

 Then I tried

 memory.limit(size=4095)
  and got

 DD-read.table(01uklicsam-20070301.dat,header=TRUE)
 Error: cannot allocate vector of size 11.3 Mb

 but no additional errors. Then optimistically to clear up the workspace:

 rm()
 DD-read.table(01uklicsam-20070301.dat,header=TRUE)
 Error: cannot allocate vector of size 15.6 Mb

 Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb?
 I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
 memory is usually 2Gb. Surely they mean GB?

 The file I'm importing has about 3 million cases with 100 variables that
 I
 want to crosstabulate each with each. Is this completely unrealistic?

 Thanks!

 Maja
 --
 View this message in context:
 http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem that you are trying to solve?
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26283467.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] write data frame in a list

If you really want a loop, then do:

  df=data.frame(x=c(3,6,7),y=c(2,7,4))
  new.list - list()
  for (i in names(df)){
+ new.list[[i]] - df[[i]]
+ }

 new.list
$x
[1] 3 6 7

$y
[1] 2 7 4



On Tue, Nov 10, 2009 at 8:12 AM, Grzes gregori...@gmail.com wrote:

 Thanks jholtman for advice but I need do it using a loop. (Because  my code
 is a little more complicated).



 jholtman wrote:

 Is this what you want:

 df=data.frame(x=c(3,6,7),y=c(2,7,4))
 df
   x y
 1 3 2
 2 6 7
 3 7 4
 as.list(df)
 $x
 [1] 3 6 7

 $y
 [1] 2 7 4


 On Tue, Nov 10, 2009 at 6:05 AM, Grzes gregori...@gmail.com wrote:

 Hi,
 I have got a data frame:

 df=data.frame(x=c(3,6,7),y=c(2,7,4))

 and I would like to write my values from data frame to list using loop,
 for
 example:

 lista=list()
 for (i in 1: length(?)){
        lista[[?]][?] = df [?]
 }
 But I havn't got any idea what I should put in places where I put a
 question
 mark?

 As a rusult I would like to get a list like this:
 list:
 [[1]]
 3,6,7
 [[2]]
 2,7,4

 I wll be very happy if somebody can help me


 --
 View this message in context:
 http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26281645.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 View this message in context: 
 http://old.nabble.com/write-data-frame-in-a-list-tp26281645p26283178.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Akaike weight in R

2009-11-10 Thread Sunny

I am using lm simulation in R and try to find the AICc and Akaike weight of
the model. I searched out that using package AICcmodavg  AICc is easily to
get. I wonder how can I get the Akaike weight, any function I can use to
generate it? Thanks in advance.

Sunny

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error: cannot allocate vector of size...

2009-11-10 Thread Peter Dalgaard


maiya wrote:

OK, it's the simple math that's confusing me :)

So you're saying 2.4GB, while windows sees the data as 700KB. Why is that
different?


700_MB_, I assume!

In a nutshell, a single column and a spacer takes 2 bytes per subject, 
but a floating point variable takes 8, and R is not good at detecting 
things that can be compressed. At best, you can ensure that variables 
are read as integers or factors, bringing the storage requirements down 
to four bytes.


The ff package might be something for you. The most strongly 
compressed items have trouble with storing NA values, though.


-p



And lets say I could potentially live with e.g. 1/3 of the cases - that
would make it .8GB, which should be fine? But then my question is if there
is any way to sample the rows in read.table? Or what would be the best way
of importing a random third of my cases?

Thanks!

M.



jholtman wrote:

A little simple math.  You have 3M rows with 100 items on each row.
If read in this would be 300M items.  If numeric, 8 bytes/item, this
is 2.4GB.  Given that you are probably using a 32 bit version of R,
you are probably out of luck.  A rule of thumb is that your largest
object should consume at most 25% of your memory since you will
probably be making copies as part of your processing.

Given that, is you want to read in 100 variables at a time, I would
say your limit would be about 500K rows to be reasonable.  So you have
a choice; read in fewer rolls, read in all 3M rows but at 20 columns
per read, put the data in a database and extract what you need.
Unless you go to a 64-bit version of R you will probably not be able
to have the whole file in memory at one time.

On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote:

I'm trying to import a table into R the file is about 700MB. Here's my
first
try:


DD-read.table(01uklicsam-20070301.dat,header=TRUE)

Error: cannot allocate vector of size 15.6 Mb
In addition: Warning messages:
1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
 Reached total allocation of 1535Mb: see help(memory.size)
2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
 Reached total allocation of 1535Mb: see help(memory.size)
3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
 Reached total allocation of 1535Mb: see help(memory.size)
4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
 Reached total allocation of 1535Mb: see help(memory.size)

Then I tried


memory.limit(size=4095)

 and got


DD-read.table(01uklicsam-20070301.dat,header=TRUE)

Error: cannot allocate vector of size 11.3 Mb

but no additional errors. Then optimistically to clear up the workspace:


rm()
DD-read.table(01uklicsam-20070301.dat,header=TRUE)

Error: cannot allocate vector of size 15.6 Mb

Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb?
I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
memory is usually 2Gb. Surely they mean GB?

The file I'm importing has about 3 million cases with 100 variables that
I
want to crosstabulate each with each. Is this completely unrealistic?

Thanks!

Maja
--
View this message in context:
http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.







--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error: cannot allocate vector of size...

Check out:

http://www.mail-archive.com/r-h...@stat.math.ethz.ch/msg79590.html

for sampling a large file.

On Tue, Nov 10, 2009 at 8:32 AM, maiya maja.zaloz...@gmail.com wrote:

 OK, it's the simple math that's confusing me :)

 So you're saying 2.4GB, while windows sees the data as 700KB. Why is that
 different?

 And lets say I could potentially live with e.g. 1/3 of the cases - that
 would make it .8GB, which should be fine? But then my question is if there
 is any way to sample the rows in read.table? Or what would be the best way
 of importing a random third of my cases?

 Thanks!

 M.



 jholtman wrote:

 A little simple math.  You have 3M rows with 100 items on each row.
 If read in this would be 300M items.  If numeric, 8 bytes/item, this
 is 2.4GB.  Given that you are probably using a 32 bit version of R,
 you are probably out of luck.  A rule of thumb is that your largest
 object should consume at most 25% of your memory since you will
 probably be making copies as part of your processing.

 Given that, is you want to read in 100 variables at a time, I would
 say your limit would be about 500K rows to be reasonable.  So you have
 a choice; read in fewer rolls, read in all 3M rows but at 20 columns
 per read, put the data in a database and extract what you need.
 Unless you go to a 64-bit version of R you will probably not be able
 to have the whole file in memory at one time.

 On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote:

 I'm trying to import a table into R the file is about 700MB. Here's my
 first
 try:

 DD-read.table(01uklicsam-20070301.dat,header=TRUE)

 Error: cannot allocate vector of size 15.6 Mb
 In addition: Warning messages:
 1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
  :
  Reached total allocation of 1535Mb: see help(memory.size)
 2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
  :
  Reached total allocation of 1535Mb: see help(memory.size)
 3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
  :
  Reached total allocation of 1535Mb: see help(memory.size)
 4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
  :
  Reached total allocation of 1535Mb: see help(memory.size)

 Then I tried

 memory.limit(size=4095)
  and got

 DD-read.table(01uklicsam-20070301.dat,header=TRUE)
 Error: cannot allocate vector of size 11.3 Mb

 but no additional errors. Then optimistically to clear up the workspace:

 rm()
 DD-read.table(01uklicsam-20070301.dat,header=TRUE)
 Error: cannot allocate vector of size 15.6 Mb

 Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb?
 I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
 memory is usually 2Gb. Surely they mean GB?

 The file I'm importing has about 3 million cases with 100 variables that
 I
 want to crosstabulate each with each. Is this completely unrealistic?

 Thanks!

 Maja
 --
 View this message in context:
 http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem that you are trying to solve?

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 View this message in context: 
 http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26283467.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] when vectorising does not work: silent function fail?

2009-11-10 Thread Federico Calboli


Dear All,

I'm using apply to do some genetic association analysis along a chromosome, with 
many thousands markers. For each marker the analysis is the same, so I was 
planning to use apply(chrom, 2, somefunction)


In the specific case I do:

my.results = apply(chr, 2, function(x){anova(lrm( cpstc.f ~ x + time.cpstc + age 
+ sex + mri))[1,3]})


This is all good and well in theory, but in practice the lrm() model will fail 
for some markers and wreck the whole process. Failure for some markers is no 
problem for me, but *predicting* which markers will fail can be hugely problematic.


I then though of creating some fucntion to catch the error messages that would 
otherwise scr*w things over:


my.lrm = function(x){
pol = NULL
pol = lrm( cpstc.f ~ x + time.cpstc + age + sex + mri)
if(length(pol)  0)
rez = anova(pol)[1,3]
if(length(pol)  == 0)
rez = 1
rez}

my.results = apply(chr, 2, my.lrm)

Still no joy, even adding try() in the evaluation and 
options(show.error.messages = F)


I am at loss on how to get the darn function to bail out *silently* if needs be 
so I can just smack a replacement value in --which would also have the benefit 
of keeping the order of the markers.


Any idea will be gratefully acknowledged.

Best,

Federico


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] when vectorising does not work: silent function fail?

Have you tried something like this:

my.results = apply(chr, 2, function(x){
result - try(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3])
if (inherits(result, try-error)) return(NULL)
result
})

This should catch the error and have NULL in that list element.

On Tue, Nov 10, 2009 at 10:04 AM, Federico Calboli
f.calb...@imperial.ac.uk wrote:
 Dear All,

 I'm using apply to do some genetic association analysis along a chromosome,
 with many thousands markers. For each marker the analysis is the same, so I
 was planning to use apply(chrom, 2, somefunction)

 In the specific case I do:

 my.results = apply(chr, 2, function(x){anova(lrm( cpstc.f ~ x + time.cpstc +
 age + sex + mri))[1,3]})

 This is all good and well in theory, but in practice the lrm() model will
 fail for some markers and wreck the whole process. Failure for some markers
 is no problem for me, but *predicting* which markers will fail can be hugely
 problematic.

 I then though of creating some fucntion to catch the error messages that
 would otherwise scr*w things over:

 my.lrm = function(x){
 pol = NULL
 pol = lrm( cpstc.f ~ x + time.cpstc + age + sex + mri)
 if(length(pol)  0)
 rez = anova(pol)[1,3]
 if(length(pol)  == 0)
 rez = 1
 rez}

 my.results = apply(chr, 2, my.lrm)

 Still no joy, even adding try() in the evaluation and
 options(show.error.messages = F)

 I am at loss on how to get the darn function to bail out *silently* if needs
 be so I can just smack a replacement value in --which would also have the
 benefit of keeping the order of the markers.

 Any idea will be gratefully acknowledged.

 Best,

 Federico


 --
 Federico C. F. Calboli
 Department of Epidemiology and Public Health
 Imperial College, St Mary's Campus
 Norfolk Place, London W2 1PG

 Tel  +44 (0)20 7594 1602     Fax (+44) 020 7594 3193

 f.calboli [.a.t] imperial.ac.uk
 f.calboli [.a.t] gmail.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] drop unused levels in subset.data.frame

2009-11-10 Thread baptiste auguie

Dear list,

subset has a 'drop' argument that I had often mistaken for the one in
[.factor which removes unused levels.
Clearly it doesn't work that way, as shown below,

d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))
s - subset(d, y==A, drop=TRUE)
str(s)
'data.frame':   5 obs. of  2 variables:
 $ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13
 $ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1

The subset still retains all the unused factor levels. I wonder how
people usually get rid of all unused levels in a data.frame after
subsetting? I came up with this but I may have missed a better
built-in solution,

dropit - function (d, columns = names(d), ...)
{
d[columns] = lapply(d[columns], [, drop=TRUE, ...)
d
}

str(dropit(s))
'data.frame':   5 obs. of  2 variables:
 $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5
 $ y: Factor w/ 1 level A: 1 1 1 1 1


Best regards,

baptiste

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] drop unused levels in subset.data.frame



On Nov 10, 2009, at 10:49 AM, baptiste auguie wrote:


Dear list,

subset has a 'drop' argument that I had often mistaken for the one in
[.factor which removes unused levels.
Clearly it doesn't work that way, as shown below,

d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))
s - subset(d, y==A, drop=TRUE)
str(s)
'data.frame':   5 obs. of  2 variables:
$ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13
$ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1

The subset still retains all the unused factor levels. I wonder how
people usually get rid of all unused levels in a data.frame after
subsetting? I came up with this but I may have missed a better
built-in solution,

dropit - function (d, columns = names(d), ...)
{
   d[columns] = lapply(d[columns], [, drop=TRUE, ...)
   d
}



If you are looking for a one-liner, then consider:

data.frame(lapply(s, function(x) if (is.factor(x)){ factor(x)} else  
{x}))


I added a numeric column to make sure I had not clobbered a non-factor  
variable.


 d - data.frame(x = factor(letters[1:15]), y =  
factor(LETTERS[1:3]), N=1:15)

 s - subset(d, y==A, drop=TRUE)
 str( data.frame(lapply(s, function(x) if (is.factor(x)){ factor(x)}  
else {x})) )

'data.frame':   5 obs. of  3 variables:
 $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5
 $ y: Factor w/ 1 level A: 1 1 1 1 1
 $ N: int  1 4 7 10 13



str(dropit(s))
'data.frame':   5 obs. of  2 variables:
$ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5
$ y: Factor w/ 1 level A: 1 1 1 1 1


Best regards,

baptiste

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] drop unused levels in subset.data.frame

2009-11-10 Thread Marc Schwartz


On Nov 10, 2009, at 9:49 AM, baptiste auguie wrote:


Dear list,

subset has a 'drop' argument that I had often mistaken for the one in
[.factor which removes unused levels.
Clearly it doesn't work that way, as shown below,

d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))
s - subset(d, y==A, drop=TRUE)
str(s)
'data.frame':   5 obs. of  2 variables:
$ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13
$ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1

The subset still retains all the unused factor levels. I wonder how
people usually get rid of all unused levels in a data.frame after
subsetting? I came up with this but I may have missed a better
built-in solution,

dropit - function (d, columns = names(d), ...)
{
   d[columns] = lapply(d[columns], [, drop=TRUE, ...)
   d
}

str(dropit(s))
'data.frame':   5 obs. of  2 variables:
$ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5
$ y: Factor w/ 1 level A: 1 1 1 1 1


There is a page in the R wiki here:

  http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:drop_unused_levels

that has some approaches.

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Comparison of vectors in a matrix

2009-11-10 Thread esterhazy


Hi,

I have a matrix with two columns, and the elements of the matrix are
vectors.

So for example, in line 3 of column 1 I have a vector v31=(marc, robert,
marie).

What I need to do is to compare all vectors in column 1 and 2, so as to get,
for example setdiff(v31,v32) into a new column.

Is there a way to do this in R?

Thanks!
-- 
View this message in context: 
http://old.nabble.com/Comparison-of-vectors-in-a-matrix-tp26284855p26284855.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Calculating the percentage of explained deviance in lmer

2009-11-10 Thread Fabio Bulleri

Dear all,
I am trying to calculate some measure of the amount of variability in the 
response variable that is explained by a model fitted in lmer 
m1-lmer(response-var ~ Condition+(1|Site/Area/Transect),family=binomial) . 
I've seen from the literature that the precentage of explained deviance is a 
common measure. How can I calculate it?
Thanks a lot for your help, I hope this is not too shallow.
Best
Fabio
--
Fabio Bulleri, PhD
Dipartimento di Biologia
Università di Pisa
Via A. Volta 6, I-56126, Pisa, Italy
Tel. +39 050 2211448
Fax +39 050 2211410
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] 2 significant digits

2009-11-10 Thread carol white

Hi,
How to represent a rounded number ending with 0 with 2-significant digits? If I 
have for ex, 0.8031 and I use signif or round with digits = 2, I'll get 0.8. If 
I use format, I get character type (even if I pass number as parameter) and if 
I convert with as.numeric, I'll lose one significant digit (0):

  format(13.7, nsmall = 2)
[1] 13.70
 as.numeric( format(13.7, nsmall = 2))
[1] 13.7


Regards,

Carol

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Implementation of the Shuffled Complex Evol ution (SCE-UA) Algorithm

2009-11-10 Thread Simon Seibert

Good evening list,
I'm looking for an R implementation of the Shuffled Complex
Evolutionâ (SCE-UA) algorithm after Duan et al. (1993). Does anybody
know if there is an extension/ package existing that contains it?
Thanks very much for your help! Cheers, Simon
Duan QY, Gupta KV, Sorooshian S (1993) Shuffled Complex Evolution
Approach for Effective and Efficient Global Minimization. In Jour. of
optimization theorie and applications. Vol 76, 3, 501-521
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] NetCDF output in R

2009-11-10 Thread nana

Dear CSAG R users,

I will be glad if someone can point out what I am doing wrong or not doing at 
all in this.

I am trying to write out netcdf file in R. I have 26 time step but only the 
first time step is written.

For example:
library(ncdf)
path - '/home/work/'
forecast - open.ncdf(paste(path,'cam.1980.2005.nc',sep=))
 fore - get.var.ncdf(forecast,'ppt')
 lon - get.var.ncdf(forecast,'lon')
 lat - get.var.ncdf(forecast,'lat')
dim(fore)[3]
26
 times - 1:dim(fore)[3]
 write.netcdf.time(paste(path,'cam_fore.nc',sep=), fore,lon,lat,times)
[1] put.var.ncdf: warning: you asked to write 1440 values, but the passed data 
array has 37440 entries!
[[1]]
[1] 6

Warning message:
In 1:nt : numerical expression has 26 elements: only the first used


# function for writing out the netcdf file #
write.netcdf.time - function(filename='outputfile.nc',data,lons,lats,nt){
lon - dim.def.ncdf('lon','degrees_east',lons)
lat - dim.def.ncdf('lat','degrees_north',lats)
times - 1:nt
tdim - dim.def.ncdf('time','days since 1980-01-01', times, unlim=TRUE)
# levs - dim.def.ncdf('lev','pressure',levs)
var - var.def.ncdf('data','unitless',list(lon,lat,tdim),-999.9)
ncid - create.ncdf(filename,list(var))
put.var.ncdf(ncid, var, data)
close.ncdf(ncid)
}

##end of function###

Thank you.
Nana Browne



There is no key to happiness. The door is always open.


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] 2 significant digits

2009-11-10 Thread Marc Schwartz


On Nov 10, 2009, at 10:22 AM, carol white wrote:


Hi,
How to represent a rounded number ending with 0 with 2-significant  
digits? If I have for ex, 0.8031 and I use signif or round with  
digits = 2, I'll get 0.8. If I use format, I get character type  
(even if I pass number as parameter) and if I convert with  
as.numeric, I'll lose one significant digit (0):



format(13.7, nsmall = 2)

[1] 13.70

as.numeric( format(13.7, nsmall = 2))

[1] 13.7


Regards,

Carol




See ?sprintf if you want to force two decimal places, as opposed to  
two significant digits:


 sprintf(%.2f, 0.8031)
[1] 0.80

A trailing zero is not a significant digit, hence it will not be  
displayed if you are formatting based upon significant digits, as  
opposed to a fixed number of decimal places.


If you want to force the formatting of a numeric value for output, it  
will end up being a character, which should not be a problem, since  
you are presumably looking to display or output the number for  
reading, not to be used for subsequent calculations.


Conceptually, you need to separate how R stores numeric values  
internally, versus how numeric values are displayed (printed) during  
output. By default, R stores numeric values internally as double  
precision floats to allow for mathematical calculations to be  
performed on those values.


The printing of those numbers for output in an R console or GUI, is by  
default based upon significant digits (see ?print.default), but you  
can alter that behavior using various functions such as sprintf, to  
enable pretty output for formatted reports and tables, etc.


HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] polygon kills X-server

2009-11-10 Thread Barry Rowlingson

2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de:

 This one is extraordinary dangerous: it also killed my kind of X server
 called Windows completely so that I had to reset the machine.

 Perhaps it should be debugged on the Linux side with less serious side
 effects

 The best way to debug this might be via a nested X server - something
like Xnest or Xephyr - or possibly on a virtual machine in the hope
that the virtual machine's display system crashes before the host's
does.

 I haven't tried this yet!

Barry

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] drop unused levels in subset.data.frame

2009-11-10 Thread baptiste auguie

Neat, I reinvented the wheel! Would that seem like a useful example at
the end of the help page for ?subset ? (it currently has very little
to say about drop).

Thanks also to David for the alternative idea.

Best regards,

baptiste


2009/11/10 Marc Schwartz marc_schwa...@me.com:
 On Nov 10, 2009, at 9:49 AM, baptiste auguie wrote:

 Dear list,

 subset has a 'drop' argument that I had often mistaken for the one in
 [.factor which removes unused levels.
 Clearly it doesn't work that way, as shown below,

 d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))
 s - subset(d, y==A, drop=TRUE)
 str(s)
 'data.frame':   5 obs. of  2 variables:
 $ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13
 $ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1

 The subset still retains all the unused factor levels. I wonder how
 people usually get rid of all unused levels in a data.frame after
 subsetting? I came up with this but I may have missed a better
 built-in solution,

 dropit - function (d, columns = names(d), ...)
 {
   d[columns] = lapply(d[columns], [, drop=TRUE, ...)
   d
 }

 str(dropit(s))
 'data.frame':   5 obs. of  2 variables:
 $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5
 $ y: Factor w/ 1 level A: 1 1 1 1 1

 There is a page in the R wiki here:

  http://wiki.r-project.org/rwiki/doku.php?id=tips:data-manip:drop_unused_levels

 that has some approaches.

 HTH,

 Marc Schwartz



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] drop unused levels in subset.data.frame

2009-11-10 Thread Hadley Wickham

If you don't want to preserve factor levels when subsetting use
characters. There are very few other differences in behavior.

Hadley

On Tuesday, November 10, 2009, baptiste auguie
baptiste.aug...@googlemail.com wrote:
 Dear list,

 subset has a 'drop' argument that I had often mistaken for the one in
 [.factor which removes unused levels.
 Clearly it doesn't work that way, as shown below,

 d - data.frame(x = factor(letters[1:15]), y = factor(LETTERS[1:3]))
 s - subset(d, y==A, drop=TRUE)
 str(s)
 'data.frame':   5 obs. of  2 variables:
  $ x: Factor w/ 15 levels a,b,c,d,..: 1 4 7 10 13
  $ y: Factor w/ 3 levels A,B,C: 1 1 1 1 1

 The subset still retains all the unused factor levels. I wonder how
 people usually get rid of all unused levels in a data.frame after
 subsetting? I came up with this but I may have missed a better
 built-in solution,

 dropit - function (d, columns = names(d), ...)
 {
     d[columns] = lapply(d[columns], [, drop=TRUE, ...)
     d
 }

 str(dropit(s))
 'data.frame':   5 obs. of  2 variables:
  $ x: Factor w/ 5 levels a,d,g,j,..: 1 2 3 4 5
  $ y: Factor w/ 1 level A: 1 1 1 1 1


 Best regards,

 baptiste

 __
 r-h...@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R process gets killed spontaneously

2009-11-10 Thread Benilton Carvalho


Hi Peng,

in a very simplistic manner, what happens is that the Operating System  
thinks it is too dangerous to let the R process to use so much  
memory. So, to protect the whole system, it kills R, before the system  
becomes unstable.


I've been looking at the problem you observed last week and, as I  
promised, I will come back to you with my findings sometime this week.  
I was already informed on how to reduce the memory footprint when  
creating the specific object you're interested in, but still need to  
run some tests.


Will get back to you soon,

b

On Nov 10, 2009, at 2:23 PM, Peng Yu wrote:


2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de:



Peng Yu wrote:


My R process has been killed for a few times, although the system
administrator did not do so. It happened when R attempted to  
allocate

a lot of memory. I'm wondering whether R would spontaneously kill
itself if it can not allocate enough memory?



It does not kill itself. If it was not some out-of-memory kill  
process of
your OS, then you may have seen some segfault. We need reproducible  
code and

all the details described in the posting guide.


Please see this thread about the necessary information.

http://www.mail-archive.com/r-help@r-project.org/msg74804.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] phase determination

2009-11-10 Thread Lisandro Benedetti Cecchi

Hi,

 

I'm trying to determine the phase of irregularly sampled data. Is there any
particular reason why both spec.pgram and spec.ls return phase-NULL for
vectors?

 

Thank you.

Lisandro

 

 


x

Lisandro Benedetti-Cecchi

Associate Professor in Ecology

Department of Biology - University of Pisa

Via Derna 1, 56126 Pisa,

Italy

Office: 39 050 2211413

Fax: 39 050 2211410

e-mail: lbenede...@biologia.unipi.it

http://www.discat.unipi.it/BiolMar/people/LBC/LBC.htm

http://www.unipi.it

 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] NetCDF output in R

2009-11-10 Thread nana


Dear R users,

I will be glad if someone can point out what I am doing wrong or not doing at 
all in this.

I am trying to write out netcdf file in R. I have 26 time step but only the 
first time step is written.

For example:
library(ncdf)
path - '/home/work/'
forecast - open.ncdf(paste(path,'cam.1980.2005.nc',sep=))
 fore - get.var.ncdf(forecast,'ppt')
 lon - get.var.ncdf(forecast,'lon')
 lat - get.var.ncdf(forecast,'lat')
dim(fore)[3]
26
 times - 1:dim(fore)[3]
 write.netcdf.time(paste(path,'cam_fore.nc',sep=), fore,lon,lat,times)
[1] put.var.ncdf: warning: you asked to write 1440 values, but the passed data 
array has 37440 entries!
[[1]]
[1] 6

Warning message:
In 1:nt : numerical expression has 26 elements: only the
 first used


# function for writing out the netcdf file #
write.netcdf.time - function(filename='outputfile.nc',data,lons,lats,nt){
lon - dim.def.ncdf('lon','degrees_east',lons)
lat - dim.def.ncdf('lat','degrees_north',lats)
times - 1:nt
tdim - dim.def.ncdf('time','days since 1980-01-01', times, unlim=TRUE)
# levs - dim.def.ncdf('lev','pressure',levs)
var - var.def.ncdf('data','unitless',list(lon,lat,tdim),-999.9)
ncid - create.ncdf(filename,list(var))
put.var.ncdf(ncid, var, data)
close.ncdf(ncid)
}

##end of function###

Thank you.
Nana Browne



There is no key to happiness. The door is always open.


  


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Error: cannot allocate vector of size...

2009-11-10 Thread tlumley


On Tue, 10 Nov 2009, maiya wrote:



OK, it's the simple math that's confusing me :)

So you're saying 2.4GB, while windows sees the data as 700KB. Why is that
different?


Your data are stored on disk as a text file (in CSV format, in fact), not as 
numbers. This can take up less space.


And lets say I could potentially live with e.g. 1/3 of the cases - that
would make it .8GB, which should be fine? But then my question is if there
is any way to sample the rows in read.table? Or what would be the best way
of importing a random third of my cases?


A better solution is probably to read a subset of the columns at a time.  The easiest way 
to do this is probably to read the data into a SQLite database with the 'sqldf' package, 
but another solution is to use the colClasses= argument to read.table() and specify 
NULL for the classes of the columns you don't want to read. There are other 
ways as well.

It might even be faster to do the cross-tabulations in a database and read the 
resulting summaries into R to compute any statistics you need.


Thanks!

M.



jholtman wrote:


A little simple math.  You have 3M rows with 100 items on each row.
If read in this would be 300M items.  If numeric, 8 bytes/item, this
is 2.4GB.  Given that you are probably using a 32 bit version of R,
you are probably out of luck.  A rule of thumb is that your largest
object should consume at most 25% of your memory since you will
probably be making copies as part of your processing.

Given that, is you want to read in 100 variables at a time, I would
say your limit would be about 500K rows to be reasonable.  So you have
a choice; read in fewer rolls, read in all 3M rows but at 20 columns
per read, put the data in a database and extract what you need.
Unless you go to a 64-bit version of R you will probably not be able
to have the whole file in memory at one time.

On Tue, Nov 10, 2009 at 7:10 AM, maiya maja.zaloz...@gmail.com wrote:


I'm trying to import a table into R the file is about 700MB. Here's my
first
try:


DD-read.table(01uklicsam-20070301.dat,header=TRUE)


Error: cannot allocate vector of size 15.6 Mb
In addition: Warning messages:
1: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
 Reached total allocation of 1535Mb: see help(memory.size)
2: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
 Reached total allocation of 1535Mb: see help(memory.size)
3: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
 Reached total allocation of 1535Mb: see help(memory.size)
4: In scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,
 :
 Reached total allocation of 1535Mb: see help(memory.size)

Then I tried


memory.limit(size=4095)

 and got


DD-read.table(01uklicsam-20070301.dat,header=TRUE)

Error: cannot allocate vector of size 11.3 Mb

but no additional errors. Then optimistically to clear up the workspace:


rm()
DD-read.table(01uklicsam-20070301.dat,header=TRUE)

Error: cannot allocate vector of size 15.6 Mb

Can anyone help? I'm confused by the values even: 15.6Mb, 1535Mb, 11.3Mb?
I'm working on WinXP with 2 GB of RAM. Help says the maximum obtainable
memory is usually 2Gb. Surely they mean GB?

The file I'm importing has about 3 million cases with 100 variables that
I
want to crosstabulate each with each. Is this completely unrealistic?

Thanks!

Maja
--
View this message in context:
http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26282348.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
View this message in context: 
http://old.nabble.com/Error%3A-cannot-allocate-vector-of-size...-tp26282348p26283467.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and

Re: [R] when vectorising does not work: silent function fail?

2009-11-10 Thread tlumley


On Tue, 10 Nov 2009, jim holtman wrote:


Have you tried something like this:

my.results = apply(chr, 2, function(x){
   result - try(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3])
   if (inherits(result, try-error)) return(NULL)
   result
})

This should catch the error and have NULL in that list element.


It's probably more useful to return a numeric vector of the right size, with 
tryCatch()

   tryCatch(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri))[1,3],
error=function(e) NA)

Also, you probably get less data copying by using a for() or while() loop than 
by using apply() in this context.

Finally, the overhead of formula parsing and model matrix construction repeated 
thousands of times probably dominates this computation; if it isn't just a 
one-off it would probably be worth a lower-level implementation.

 -thomas



On Tue, Nov 10, 2009 at 10:04 AM, Federico Calboli
f.calb...@imperial.ac.uk wrote:

Dear All,

I'm using apply to do some genetic association analysis along a chromosome,
with many thousands markers. For each marker the analysis is the same, so I
was planning to use apply(chrom, 2, somefunction)

In the specific case I do:

my.results = apply(chr, 2, function(x){anova(lrm( cpstc.f ~ x + time.cpstc +
age + sex + mri))[1,3]})

This is all good and well in theory, but in practice the lrm() model will
fail for some markers and wreck the whole process. Failure for some markers
is no problem for me, but *predicting* which markers will fail can be hugely
problematic.

I then though of creating some fucntion to catch the error messages that
would otherwise scr*w things over:

my.lrm = function(x){
pol = NULL
pol = lrm( cpstc.f ~ x + time.cpstc + age + sex + mri)
if(length(pol)  0)
rez = anova(pol)[1,3]
if(length(pol)  == 0)
rez = 1
rez}

my.results = apply(chr, 2, my.lrm)

Still no joy, even adding try() in the evaluation and
options(show.error.messages = F)

I am at loss on how to get the darn function to bail out *silently* if needs
be so I can just smack a replacement value in --which would also have the
benefit of keeping the order of the markers.

Any idea will be gratefully acknowledged.

Best,

Federico


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602     Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] when vectorising does not work: silent function fail?

2009-11-10 Thread Federico Calboli


On 10 Nov 2009, at 17:16, tlum...@u.washington.edu wrote:


On Tue, 10 Nov 2009, jim holtman wrote:


Have you tried something like this:

my.results = apply(chr, 2, function(x){
  result - try(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex +  
mri))[1,3])

  if (inherits(result, try-error)) return(NULL)
  result
})

This should catch the error and have NULL in that list element.


It's probably more useful to return a numeric vector of the right  
size, with tryCatch()


   tryCatch(anova(lrm( cpstc.f ~ x + time.cpstc + age + sex + mri)) 
[1,3],

error=function(e) NA)


Both suggestions work beautifully, thanks to both!


Also, you probably get less data copying by using a for() or while()  
loop than by using apply() in this context.


I'm happy to take this in, I am not sure I can see how it would work  
(the less data copying) though.


Finally, the overhead of formula parsing and model matrix  
construction repeated thousands of times probably dominates this  
computation; if it isn't just a one-off it would probably be worth a  
lower-level implementation.


The markers are ~ 1.5M, but I don't do this kind of analysis often  
enough to justify any more work than a measly script.


Cheers,

Fede




 -thomas



On Tue, Nov 10, 2009 at 10:04 AM, Federico Calboli
f.calb...@imperial.ac.uk wrote:

Dear All,

I'm using apply to do some genetic association analysis along a  
chromosome,
with many thousands markers. For each marker the analysis is the  
same, so I

was planning to use apply(chrom, 2, somefunction)

In the specific case I do:

my.results = apply(chr, 2, function(x){anova(lrm( cpstc.f ~ x +  
time.cpstc +

age + sex + mri))[1,3]})

This is all good and well in theory, but in practice the lrm()  
model will
fail for some markers and wreck the whole process. Failure for  
some markers
is no problem for me, but *predicting* which markers will fail can  
be hugely

problematic.

I then though of creating some fucntion to catch the error  
messages that

would otherwise scr*w things over:

my.lrm = function(x){
pol = NULL
pol = lrm( cpstc.f ~ x + time.cpstc + age + sex + mri)
if(length(pol)  0)
rez = anova(pol)[1,3]
if(length(pol)  == 0)
rez = 1
rez}

my.results = apply(chr, 2, my.lrm)

Still no joy, even adding try() in the evaluation and
options(show.error.messages = F)

I am at loss on how to get the darn function to bail out  
*silently* if needs
be so I can just smack a replacement value in --which would also  
have the

benefit of keeping the order of the markers.

Any idea will be gratefully acknowledged.

Best,

Federico


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle


--
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St. Mary's Campus
Norfolk Place, London W2 1PG

Tel +44 (0)20 75941602   Fax +44 (0)20 75943193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] NetCDF output in R

2009-11-10 Thread tlumley


On Tue, 10 Nov 2009, nana wrote:


I will be glad if someone can point out what I am doing wrong or not doing at 
all in this.


Sending the same message to both r-help and r-devel is one thing you are doing 
wrong.


I am trying to write out netcdf file in R. I have 26 time step but only the 
first time step is written.

For example:

library(ncdf)
path - '/home/work/'
forecast - open.ncdf(paste(path,'cam.1980.2005.nc',sep=))
fore - get.var.ncdf(forecast,'ppt')
lon - get.var.ncdf(forecast,'lon')
lat - get.var.ncdf(forecast,'lat')
dim(fore)[3]
26
times - 1:dim(fore)[3]
write.netcdf.time(paste(path,'cam_fore.nc',sep=), fore,lon,lat,times)

[1] put.var.ncdf: warning: you asked to write 1440 values, but the passed data 
array has 37440 entries!
[[1]]
[1] 6

Warning message:
In 1:nt : numerical expression has 26 elements: only the
first used


This looks like the cause of your problem.  The function has
  times - 1:nt
implying that nt is a single number, but you are passing the vector 'times' as 
the argument.



# function for writing out the netcdf file #
write.netcdf.time - function(filename='outputfile.nc',data,lons,lats,nt){
lon - dim.def.ncdf('lon','degrees_east',lons)
lat - dim.def.ncdf('lat','degrees_north',lats)
times - 1:nt
tdim - dim.def.ncdf('time','days since 1980-01-01', times, unlim=TRUE)
# levs - dim.def.ncdf('lev','pressure',levs)
var - var.def.ncdf('data','unitless',list(lon,lat,tdim),-999.9)
ncid - create.ncdf(filename,list(var))
put.var.ncdf(ncid, var, data)
close.ncdf(ncid)
}

##end of function###

Thank you.
Nana Browne



There is no key to happiness. The door is always open.






[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Thomas Lumley   Assoc. Professor, Biostatistics
tlum...@u.washington.eduUniversity of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] merge data

2009-11-10 Thread Chuck White

df1 -- dataframe with column date and several other columns. #rows 40k  
Several of the dates are repeated.
df2 -- dataframe with two columns date and index. #rows ~130  This is really a 
map from date to index.

I would like to create a column called index in df1 which has the corresponding 
index from df2.

The following works:
index - NULL
for(wk in df1$week){
index - c(index,df2$index[df2$week==wk])
}
and then add index to df1.

Can you please suggest a better way of doing this? I didn't think merge was 
suitable for this...is it? THANKS.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Interfacing R and C++

2009-11-10 Thread Nathan Harmston

Hi,

I m currently working on interfacing R and C++ (passing a matrix from
R to C++, doing some stuff, returning a vector of results). There seem
to be a number of ways of doing this, such as rcpp and swig. Is there
a recommended way of doing this? In the long run I would like what I m
working on to be portable across OS etc.

Nathan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Titles on panel graphs created in zoo

2009-11-10 Thread MichelleJ


I have a plotted a stacked panel graph (single x axis and multiple y axis)
using the package zoo and would like to add a title for each separate panel.
I am using the script: 

z - with(mydata,zoo(cbind(mydata$Water.level,mydata$Submerged.plants,
mydata$Crayfish.CPUE,mydata$Carp.CPUE),Year))

plot(z,type=b,pch=16,lty=2,xlab=Year,ylab=c(Metres,Realtive density,
CPUE,CPUE),main=)

Any help would be much appreciated.

M
-- 
View this message in context: 
http://old.nabble.com/Titles-on-panel-graphs-created-in-zoo-tp26285713p26285713.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] HEEELP!!!!

2009-11-10 Thread Ana María Prieto

Hello.
My name is Ana. I´m doing an eology master, and I´m just learning how R
works.
I have a Mac OS X 10.5.6, and I´m tryng to run just a simple ANOVA
nanalyses.
I dowloaded R version 2.10.0, and it seems I have problems with the script.
I don´t know what to do, I´ve already change the languages, be sure of
being
working in the correct directory and doesn´t work

the script is:

#1. example

data01-read.table(ch01.txt,header=T)

names(data01)

attach(data01)

FERTIL-factor(FERTIL)

model01-(YIELD~FERTIL)

summary(model01)

anova(model01)


all my classmates, even those with a mac operator, wrote the same, and it
worked.

When copying the script in the R console, it seems that there is a problem
with the

~ symbol. this symbol is not in the keyboard, so I select it from spetial
characters,

and then i selected with shift an key, and still, it doesn´t works

I attached th file with the data. i hope you can help me!


Ana.
FERTIL  YIELD   YIELDM  VARIETY SEX DBH FLOWERS YIELD2
1   6.2725.12   1   2   174 234 6.27
1   5.3617.25   1   1   171 110 5.36
1   6.3926.42   1   2   212 512 6.39
1   4.8516.08   1   1   187 133 4.85
1   5.9922.15   1   2   227 11075.99
1   7.1415.92   1   1   274 624 7.14
1   5.0840.25   2   2   86  8   5.08
1   4.0735.25   2   2   161 333 4.07
1   4.3531.98   2   1   285 637 4.35
1   4.9536.52   2   1   182 225 4.95
2   3.0743.32   2   2   225 512 3.07
2   3.2937.12   1   133 47  3.29
2   4.0418.33   2   154 164 4.04
2   4.1922.63   2   71  1   4.19
2   3.4125.93   1   133 36  3.41
2   3.7515.05   3   1   267 524 3.75
2   4.8728.05   4   2   166 198 4.87
2   3.9428.55   4   2   145 225 3.94
2   6.2833.24   2   207 548 NA
2   3.1531.68   4   1   216 272 3.15
3   4.0430.32   4   2   157 164 4.04
3   3.7927.58   4   2   87  5   3.79
3   4.56NA  NA  1   185 133 4.56
3   4.55NA  NA  1   183 133 4.55
3   4.55NA  NA  2   193 419 4.55
3   4.53NA  NA  1   264 442 4.53
3   3.53NA  NA  1   83  3   3.53
3   3.71NA  NA  2   300 14823.71
3   7   NA  NA  1   242 333 NA
3   4.61NA  NA  2   316 11384.61
NA  NA  NA  NA  2   217 561 NA
NA  NA  NA  NA  1   235 488 NA
NA  NA  NA  NA  2   189 282 NA
NA  NA  NA  NA  1   242 536 NA
NA  NA  NA  NA  2   84  11  NA
NA  NA  NA  NA  2   135 148 NA
NA  NA  NA  NA  1   267 512 NA
NA  NA  NA  NA  2   122 70  NA
NA  NA  NA  NA  2   280 1107NA
NA  NA  NA  NA  2   295 941 NA
NA  NA  NA  NA  2   139 89  NA
NA  NA  NA  NA  2   165 181 NA
NA  NA  NA  NA  2   81  8   NA
NA  NA  NA  NA  1   190 173 NA
NA  NA  NA  NA  2   68  1   NA
NA  NA  NA  NA  2   314 1091NA
NA  NA  NA  NA  1   284 561 NA
NA  NA  NA  NA  2   77  3   NA
NA  NA  NA  NA  1   228 312 NA
NA  NA  NA  NA  2   282 1398NA
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] How to use package-rsm to generate experimental design

2009-11-10 Thread Rob Wang


Hi guys, 

I am a totally begginer with R.  I am planning to use R to design and
optimize my experiment. The experiment includes 4 factors, and three of the
fators have 3 levels, and the last factor has 6 levels. I am having a really
hard time to learn this program on my own and didn't find enough useful
material to get start. It would be appreciated if any one send me useful
materials, especially those on how to use the pakage-RSM(Response surface
methodoly) to generate experiment design. Thanks!

Rob


-- 
View this message in context: 
http://old.nabble.com/How-to-use-package-rsm-to-generate-experimental-design-tp26287354p26287354.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] source() vs attach()0

2009-11-10 Thread Stefan Zeugner


Hello,
After hours of googling I could not resolve the following (although it 
seems simple):


I would like to put subfunctions in a separate .R file that is then 
called with source() from inside several main functions. A crude 
example would be as follows:


 file subtest.R **
subtest - function() {
 foo - foo+1
 }
**


*** main function 
test-function(foo) {
 source(test.R,local=TRUE)
 subtest()
 return(foo)
 }
**

Then executing the code
 test(1)
works as prescribed.
But of course, each function call to test() 'sources' (i.e. parse() and 
eval()) the file subtest.R again and again. (try e.g. changing line 2 in 
subtest.R to 'foo- foo +2' and run 'test(1)' )



How can I 'attach' the function subtest inside the main function, such 
that it is not evaluated again at each function call?


And this while maintaining the above 'subfunction' functionality (i.e. 
attaching the subfunction under the right environment)?


I was not able to resolve that - thanks in advance for any help!

all the best
Stefan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Comparison of vectors in a matrix

2009-11-10 Thread Tony Plate


Nice problem!

If I understand you correctly, here's how to do it (with list-based matrices):


set.seed(1)
(x - matrix(lapply(rpois(10,2)+1, function(k) sample(letters[1:10], size=k)), ncol=2, 
dimnames=list(1:5,c(A,B
 A   B  
1 Character,2 Character,5

2 Character,2 Character,5
3 Character,3 Character,3
4 Character,5 Character,3
5 Character,2 i

x[1,1]

[[1]]
[1] c b


x[1,2]

[[1]]
[1] c d a j f


(y - cbind(x, A-B=apply(x, 1, function(ab) setdiff(ab[[1]], ab[[2]]
 A   B   A-B
1 Character,2 Character,5 b
2 Character,2 Character,5 g
3 Character,3 Character,3 Character,3

4 Character,5 Character,3 Character,2
5 Character,2 i Character,2

y[1,3]

[[1]]
[1] b





-- Tony Plate

esterhazy wrote:

Hi,

I have a matrix with two columns, and the elements of the matrix are
vectors.

So for example, in line 3 of column 1 I have a vector v31=(marc, robert,
marie).

What I need to do is to compare all vectors in column 1 and 2, so as to get,
for example setdiff(v31,v32) into a new column.

Is there a way to do this in R?

Thanks!


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] merge data



On Nov 10, 2009, at 12:36 PM, Chuck White wrote:

df1 -- dataframe with column date and several other columns. #rows  
40k  Several of the dates are repeated.
df2 -- dataframe with two columns date and index. #rows ~130  This  
is really a map from date to index.


I would like to create a column called index in df1 which has the  
corresponding index from df2.


The following works:
index - NULL
for(wk in df1$week){
   index - c(index,df2$index[df2$week==wk])
}
and then add index to df1.

Can you please suggest a better way of doing this? I didn't think  
merge was suitable for this...is it? THANKS.


I think merge should work, but if you really have looked at the  
various arguments, tested reasonable examples and are still convinced  
it wouldn't, then see what you get with:


 df1 - data.frame(dt = Sys.Date() - sample(100:120, 30,  
replace=TRUE), 1:30)

 df2 - data.frame(dt2 = Sys.Date() -100:120, index=LETTERS[1:21])

 df1$index - df2[ match(df1$dt,df2$dt2), index]
 df1
   dt X1.30 index
1  2009-07-30 1 D
2  2009-07-16 2 R
3  2009-07-23 3 K
4  2009-07-29 4 E
5  2009-07-15 5 S
6  2009-08-02 6 A
7  2009-07-18 7 P
8  2009-07-21 8 M
9  2009-07-27 9 G
10 2009-07-2610 H
11 2009-07-3111 C
12 2009-07-2612 H
13 2009-07-1813 P
14 2009-07-2314 K
15 2009-07-2115 M
16 2009-07-1916 O
17 2009-07-1417 T
18 2009-07-1618 R
19 2009-07-1519 S
20 2009-07-1320 U
21 2009-07-2821 F
22 2009-07-2022 N
23 2009-07-2423 J
24 2009-07-2024 N
25 2009-07-1625 R
26 2009-07-3026 D
27 2009-07-1427 T
28 2009-08-0228 A
29 2009-07-1929 O
30 2009-07-2630 H

I tried merge(df1, df2, by.x=1, by.y=1) and got the same result modulo  
the order of the output.



--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Why arima does not work when the time series is a costant?

2009-11-10 Thread Aijun


Hi! 

One of my time series happens to be a constant. When I call arima with c(0,
0, 0), it gives me error. Here is an example:

ts1 - ts(rep(1, 29))
fit - arima(ts1, order=c(0,0,0))

When I run the second line above, it gives me the following error
Error in decompose(ts(x[1L:wind], start = start(x), frequency = f),
seasonal) : 
  time series has no or less than 2 periods 

I was confused why arima does not work if my time series is a constant. It
should be very simple - resulted model should be a constant with the value
of 1. 

Thanks,
Aijun.

-- 
View this message in context: 
http://old.nabble.com/Why-arima-does-not-work-when-the-time-series-is-a-costant--tp26287374p26287374.html
Sent from the R help mailing list archive at Nabble.com.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] HEEELP!!!!

2009-11-10 Thread Liviu Andronic

Hello

On 11/10/09, Ana María Prieto prieto.anama...@gmail.com wrote:
  When copying the script in the R console, it seems that there is a problem
  with the

  ~ symbol. this symbol is not in the keyboard, so I select it from spetial
  characters,
For what language is your keyboard designed? If it is for English,
similar to this one [1], try a British English layout, and then the ~
sign will (possibly) be on the \ button (press shift+\). I'm on
Linux, but I hope it would be similar for Mac.
Liviu

[1] http://www.mrtotes.co.uk/UK_Mac_Keyboard.jpg

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Interfacing R and C++

2009-11-10 Thread Duncan Murdoch


On 11/10/2009 12:35 PM, Nathan Harmston wrote:

Hi,

I m currently working on interfacing R and C++ (passing a matrix from
R to C++, doing some stuff, returning a vector of results). There seem
to be a number of ways of doing this, such as rcpp and swig. Is there
a recommended way of doing this? In the long run I would like what I m
working on to be portable across OS etc.


Just follow the instructions in the Writing R Extensions manual.  You're 
not doing anything very unusual, so rcpp and swig may be overkill.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] HEEELP!!!!



On Nov 10, 2009, at 11:24 AM, Ana María Prieto wrote:


Hello.
My name is Ana. I´m doing an eology master, and I´m just learning  
how R

works.
I have a Mac OS X 10.5.6, and I´m tryng to run just a simple ANOVA
nanalyses.
I dowloaded R version 2.10.0, and it seems I have problems with the  
script.

I don´t know what to do, I´ve already change the languages, be sure of
being
working in the correct directory and doesn´t work

the script is:

#1. example

data01-read.table(ch01.txt,header=T)

names(data01)

attach(data01)

FERTIL-factor(FERTIL)

model01-(YIELD~FERTIL)


I suspect that the problem is not with the lack of a tilde (~) or  
the misinterpretation of the special character that you seem to have  
gotten into this mail message, but rather with your failure to type  
the two letters lm thusly:


model01 - lm(YIELD~FERTIL)

I have 3 Macs of varying vintages (one desktop and two laptops) and  
they all have the tilde in the upper left corner just below the escape  
key.


Future submissions to the R-help list should be accompanied by a more  
informative subject line. That sort of Subject is considered rude  
(some might even say infantile) behavior in this arena.


--
David.



summary(model01)

anova(model01)


all my classmates, even those with a mac operator, wrote the same,  
and it

worked.

When copying the script in the R console, it seems that there is a  
problem

with the

~ symbol. this symbol is not in the keyboard, so I select it from  
spetial

characters,

and then i selected with shift an key, and still, it doesn´t works

I attached th file with the data. i hope you can help me!


Ana.

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Monte Carlo Simulation in R...

Hi, Dear R users,

I'm wondering if I can do Monte Carlo Simulation in R. My problem is like
this: I know variable X follows Gamma distribution with shape parameter
0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me
to simulate a vector of X that satisfies both the probability distribution
and the sum. Anyone has a clue to this? Much appreciated.

Regards
Garry

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] All possible combinations of functions within a function

2009-11-10 Thread bikemike42


Dear All,

I wrote a function for cluster analysis to compute cophenetic correlations
between dissimilarity matrices (using the VEGAN library) and cluster
analyses of every possible clustering algorithm (SEE ATTACHED)
http://old.nabble.com/file/p26288610/cor.coef.R cor.coef.R .  As it is now,
it is extremely long, and for the future I was hoping to find a more
efficient way of doing this sort of thing.  

To give you an outline of the function, first I create the  dissimiarity
matrices using all possible methods in the VEGAN command vegdist, then
create the clusters using all possible algorithms in hclust and the
dissimilarity matrices I crated, then create a table, and in one column,
list all combinations, and in the other, compute and put the cophenetic
correlation.

Any help would be appreciated!  I'm pretty new to writing my own functions
but I see great time-saving potential.

Thanks!
Mike
-- 
View this message in context: 
http://old.nabble.com/All-possible-combinations-of-functions-within-a-function-tp26288610p26288610.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Command-line arguments and --interactive

2009-11-10 Thread Adam D. I. Kramer



On Tue, 10 Nov 2009, Duncan Murdoch wrote:

--interactive tells R that there is a human producing the input stream, so it 
can ask questions and expect them to be answered.  In your experiments with 
it, your input stream was the pipe holding the output of echo, and R got 
confused because that pipe wouldn't answer its question.


I see. That makes sense, and thank you.


Your problem is that you want an input stream that starts out from your
fixed code and then switches to your shell's stdin.


Correct. Or, you might consider it an ability to add a command or two to the
effective end of .Rprofile.


I think you can do that on some systems by saving your input to a file
then concatenating it to stdin, e.g.  something like this:

echo '10*5; scan()' test.R
cat test.R - | R --interactive

I don't know if there's a way to do this in one line, and I'd expect some 
oddities.


Creating a new file progammatically is difficult.

Another solution, suggested by peterdc in the #R irc channel, was to run R
itself and then use the system() command to complete the
prior-to-launching-R task...that is the solution I'm going with, but thanks
very much for your help!

--Adam

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Monte Carlo Simulation in R...

2009-11-10 Thread Duncan Murdoch


On 11/10/2009 1:25 PM, Hongwei Dong wrote:

Hi, Dear R users,

I'm wondering if I can do Monte Carlo Simulation in R. My problem is like
this: I know variable X follows Gamma distribution with shape parameter
0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help me
to simulate a vector of X that satisfies both the probability distribution
and the sum. Anyone has a clue to this? Much appreciated.


Your requirements are slightly contradictory or incomplete.  Here's one 
way to fully specify the problem:


The X_i values are independent Gammas, with the given shape and scale. 
You want to simulate from the joint distribution conditional on the 
event sum(X) == 2000.


Is that your problem?  I don't know how to do the simulation, but maybe 
someone else does.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] All possible combinations of functions within a function

Here is a hint of how you might want to do the first part.  You might
want to study the 'lappy' function

vegList - lapply(c('bray', ..., 'binomial'), function(.method){
vegdist(x, method=.method)
})

clustList - lapply(vegList, function(.dist){
lapply(c('average', ..., 'centroid'), function(.dist){
hclust(.veg, .dist)
})
})

rest left as exercise for the reader.

On Tue, Nov 10, 2009 at 2:05 PM, bikemike42 ml...@tamu.edu wrote:

 Dear All,

 I wrote a function for cluster analysis to compute cophenetic correlations
 between dissimilarity matrices (using the VEGAN library) and cluster
 analyses of every possible clustering algorithm (SEE ATTACHED)
 http://old.nabble.com/file/p26288610/cor.coef.R cor.coef.R .  As it is now,
 it is extremely long, and for the future I was hoping to find a more
 efficient way of doing this sort of thing.

 To give you an outline of the function, first I create the  dissimiarity
 matrices using all possible methods in the VEGAN command vegdist, then
 create the clusters using all possible algorithms in hclust and the
 dissimilarity matrices I crated, then create a table, and in one column,
 list all combinations, and in the other, compute and put the cophenetic
 correlation.

 Any help would be appreciated!  I'm pretty new to writing my own functions
 but I see great time-saving potential.

 Thanks!
 Mike
 --
 View this message in context: 
 http://old.nabble.com/All-possible-combinations-of-functions-within-a-function-tp26288610p26288610.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] All possible combinations of functions within a function

2009-11-10 Thread baptiste auguie

Hi,

From what I understand of your code, you might find the following
construct useful,

funs - c(mean, sum, sd, diff)
x - 1:10
lapply(funs, do.call, args=list(x))

and then working with lists rather than naming every object
individually. You might find mapply useful too when you have to pass
several parameters.

HTH,

baptiste

2009/11/10 bikemike42 ml...@tamu.edu:

Dear All,

I wrote a function for cluster analysis to compute cophenetic correlations
between dissimilarity matrices (using the VEGAN library) and cluster
analyses of every possible clustering algorithm (SEE ATTACHED)
http://old.nabble.com/file/p26288610/cor.coef.R cor.coef.R . As it is now,
it is extremely long, and for the future I was hoping to find a more
efficient way of doing this sort of thing.

To give you an outline of the function, first I create the dissimiarity
matrices using all possible methods in the VEGAN command vegdist, then
create the clusters using all possible algorithms in hclust and the
dissimilarity matrices I crated, then create a table, and in one column,
list all combinations, and in the other, compute and put the cophenetic
correlation.

Any help would be appreciated! I'm pretty new to writing my own functions
but I see great time-saving potential.

Thanks!
Mike
--
View this message in context:
http://old.nabble.com/All-possible-combinations-of-functions-within-a-function-tp26288610p26288610.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...

Exactly! Thanks, Duncan.

Let me re-phrase me question like this:

1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008
2) Min(X)=1 and Max(X)=85
3) SUM(X)=2000
4) Do I also have to define the number of draws? if yes, it could be 250.

Based on these restrictions, I want to generate random draw. I'm wondering
how I can do this in R. Thanks.

Garry



On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.cawrote:

 On 11/10/2009 1:25 PM, Hongwei Dong wrote:

 Hi, Dear R users,

 I'm wondering if I can do Monte Carlo Simulation in R. My problem is like
 this: I know variable X follows Gamma distribution with shape parameter
 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help
 me
 to simulate a vector of X that satisfies both the probability distribution
 and the sum. Anyone has a clue to this? Much appreciated.


 Your requirements are slightly contradictory or incomplete.  Here's one way
 to fully specify the problem:

 The X_i values are independent Gammas, with the given shape and scale. You
 want to simulate from the joint distribution conditional on the event sum(X)
 == 2000.

 Is that your problem?  I don't know how to do the simulation, but maybe
 someone else does.

 Duncan Murdoch


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] do.call and timeSeries

2009-11-10 Thread Peter Ehlers



Bierbryer, Andrew wrote:

Does anyone know why the following code hangs on the do.call, but works
fine when I either comment out the require(timeSeries) or only do 2
levels of a for loop instead of 3?

 


Thanks,

 


Andrew Bierbryer

 

 

 


require(timeSeries)

 

num - 1 

 


x.list - list()

 


for ( i in 1:10 ) {

  for ( j in 1:20 ) {

for ( k in 1:30 ) {

  x.list[[num]] - cbind(num,10)

  num - num + 1

}

  }

}

 

 


cat('calling do.call\n')

 


x.df - do.call(rbind,x.list)

  


cat('called do.call\n')

 

This seems like a highly unusual way to generate x.df,
but it works fine for me. No idea why this doesn't work
for you.

Here's my sessionInfo; what's yours?

 sessionInfo()
R version 2.10.0 Patched (2009-11-02 r50295)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
[3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] timeSeries_2100.84 timeDate_2100.86


 -Peter Ehlers




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...



On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote:


Exactly! Thanks, Duncan.

Let me re-phrase me question like this:

1) X_i values are independent Gammas, with the shape 0.067 and scale  
0.008

2) Min(X)=1 and Max(X)=85


You might want to check that your parameterization in in agreement  
with that used by the rgamma function. Simply using those numbers  
yields a distribution that does not look as though it would get many  
qualifying samples. Here are 20 draws without any exclusions outside a  
range:


  rgamma(20, shape=0.067,  scale = 0.008)
 [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07  
7.680773e-38 6.441082e-15 6.168961e-13
 [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11  
1.852885e-04 4.212802e-07 1.774495e-25

[17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05

http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html



3) SUM(X)=2000
4) Do I also have to define the number of draws? if yes, it could be  
250.


Based on these restrictions, I want to generate random draw. I'm  
wondering

how I can do this in R. Thanks.

Garry



On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch  
murd...@stats.uwo.cawrote:



On 11/10/2009 1:25 PM, Hongwei Dong wrote:


Hi, Dear R users,

I'm wondering if I can do Monte Carlo Simulation in R. My problem  
is like
this: I know variable X follows Gamma distribution with shape  
parameter
0.067 and scale parameter 0.008. The sum of the X is 2000. I need  
R help

me
to simulate a vector of X that satisfies both the probability  
distribution

and the sum. Anyone has a clue to this? Much appreciated.



Your requirements are slightly contradictory or incomplete.  Here's  
one way

to fully specify the problem:

The X_i values are independent Gammas, with the given shape and  
scale. You
want to simulate from the joint distribution conditional on the  
event sum(X)

== 2000.

Is that your problem?  I don't know how to do the simulation, but  
maybe

someone else does.

Duncan Murdoch



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...

Thanks.
I tried the rgamma function too. But I'm still wondering how I can set the
min, max, and sum of the variates created by the random draws. Anyone has a
clue? Thanks.

Garry



On Tue, Nov 10, 2009 at 11:47 AM, David Winsemius dwinsem...@comcast.netwrote:


 On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote:

  Exactly! Thanks, Duncan.

 Let me re-phrase me question like this:

 1) X_i values are independent Gammas, with the shape 0.067 and scale 0.008
 2) Min(X)=1 and Max(X)=85


 You might want to check that your parameterization in in agreement with
 that used by the rgamma function. Simply using those numbers yields a
 distribution that does not look as though it would get many qualifying
 samples. Here are 20 draws without any exclusions outside a range:

   rgamma(20, shape=0.067,  scale = 0.008)
  [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07
 7.680773e-38 6.441082e-15 6.168961e-13
  [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11
 1.852885e-04 4.212802e-07 1.774495e-25
 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05

 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html



  3) SUM(X)=2000
 4) Do I also have to define the number of draws? if yes, it could be 250.

 Based on these restrictions, I want to generate random draw. I'm wondering
 how I can do this in R. Thanks.

 Garry



 On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.ca
 wrote:

  On 11/10/2009 1:25 PM, Hongwei Dong wrote:

  Hi, Dear R users,

 I'm wondering if I can do Monte Carlo Simulation in R. My problem is
 like
 this: I know variable X follows Gamma distribution with shape parameter
 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R help
 me
 to simulate a vector of X that satisfies both the probability
 distribution
 and the sum. Anyone has a clue to this? Much appreciated.


 Your requirements are slightly contradictory or incomplete.  Here's one
 way
 to fully specify the problem:

 The X_i values are independent Gammas, with the given shape and scale.
 You
 want to simulate from the joint distribution conditional on the event
 sum(X)
 == 2000.

 Is that your problem?  I don't know how to do the simulation, but maybe
 someone else does.

 Duncan Murdoch


[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] do.call and timeSeries

2009-11-10 Thread Bierbryer, Andrew

Peter -

I am using 2.8.1 on linux. When I use 2.10.0 on the pc, it works, so it
must be a version issue.




 sessionInfo()
R version 2.8.1 (2008-12-22) 
x86_64-pc-linux-gnu 

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.U
TF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=
C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATI
ON=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base





-Original Message-
From: Peter Ehlers [mailto:ehl...@ucalgary.ca] 
Sent: Tuesday, November 10, 2009 2:38 PM
To: Bierbryer, Andrew
Cc: r-help@r-project.org
Subject: Re: [R] do.call and timeSeries


Bierbryer, Andrew wrote:
 Does anyone know why the following code hangs on the do.call, but
works
 fine when I either comment out the require(timeSeries) or only do 2
 levels of a for loop instead of 3?
 
  
 
 Thanks,
 
  
 
 Andrew Bierbryer
 
  
 
  
 
  
 
 require(timeSeries)
 
  
 
 num - 1 
 
  
 
 x.list - list()
 
  
 
 for ( i in 1:10 ) {
 
   for ( j in 1:20 ) {
 
 for ( k in 1:30 ) {
 
   x.list[[num]] - cbind(num,10)
 
   num - num + 1
 
 }
 
   }
 
 }
 
  
 
  
 
 cat('calling do.call\n')
 
  
 
 x.df - do.call(rbind,x.list)
 
   
 
 cat('called do.call\n')
 
  
This seems like a highly unusual way to generate x.df,
but it works fine for me. No idea why this doesn't work
for you.

Here's my sessionInfo; what's yours?

  sessionInfo()
R version 2.10.0 Patched (2009-11-02 r50295)
i386-pc-mingw32

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252
[3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C
[5] LC_TIME=English_Canada.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] timeSeries_2100.84 timeDate_2100.86
 

  -Peter Ehlers

 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Formatted contingency tables with (%)

2009-11-10 Thread soeren . vogel


Quite often, I need those tables:

x - sample(c(a, b, c), 40, rep=T)
y - sample(c(X, Y), 40, rep=T)
(tbl - table(x, y))
(z - as.factor(paste(as.vector(tbl),  (,  
round(prop.table(as.vector(tbl)) * 100, 1), %), sep=)))

matrix(as.factor(z), nrow=3, dimnames=dimnames(tbl))

But the result looks ugly and is not copypaste-able for LaTeX  
verbatim or table environment, moreover, the \ is not what I want  
in the printout. How to achieve:


   y
x  X  Y
a  3  (7.5%)   7 (17.5%)
b  9 (22.5%)   5 (12.5%)
c  6 (15.0%)  10 (25.0%)

Thank you for help or hints.

Sören

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Automating Plot Commands using a Loop

2009-11-10 Thread Koraelus


Hello,

Thank you very much. Your string makes perfect sense to me, but I get an
error when I try this:

Data-read.csv(Datacull.txt,header=T,row.names=1)
TData-t(Data)
PlotFunction-function (x) {
par(mfrow=c(3,6))
for (i in colnames(x)) {
plot(x[[i]],type=o,axes=F,xlab='',ylab='',ylim=c(0,2),main=i)
axis(1,at=1:6,lab=c(A,B,C,D,E,F))
axis(2,at=seq(0,2,.25))
}}
PlotFunction(TData)

I get an error:
Error in x[[i]] : subscript out of bounds

If I try to hard code it without using a function

for (i in colnames(TData)) {
plot(TData[[i]],type=o,axes=F,xlab='',ylab='',ylim=c(0,2),main=i)
axis(1,at=1:6,lab=c(A,B,C,D,E,F))
axis(2,at=seq(0,2,.25))
}}

I get the similar error:
Error in TData[[i]] : subscript out of bounds

What does this mean?



jholtman wrote:
 
 try this:
 
 Data - read.table(textConnection( Alpha Beta Gamma Delta
 A  .1 .2   .3.4
 B   .2.3   .4.5
 C   .8   .9   .43   .13
 D  .13   .34  .34  .3), header=TRUE)
 closeAllConnections()
 
 par(mfrow=c(2,2))
 for (i in colnames(Data)){
 plot(Data[[i]],type=o,axes=F,xlab='', ylab='', ylim=c(0,1), main=i)
 axis(1,at=1:4,lab=c(A,B,C,D))
 axis(2,at=seq(0, 1, .25)) # changed your axis to work
 }
 
 
 On Mon, Nov 9, 2009 at 3:38 PM, Koraelus korae...@yahoo.com wrote:

 I have three matrices with the same row and column names, but different
 data.

 e.g.

 Data
  Alpha Beta Gamma Delta
 A  .1     .2       .3        .4
 B   .2    .3       .4        .5
 C   .8   .9       .43       .13
 D  .13   .34      .34      .3


 For each column, I would like to create a separate plot on a single
 window.
 I currently am using the cmd-line:

 windows()
 par(mfrow=c(2,2))
 plot(Data$Alpha,type=o,axes=F,ann=F,ylim=c(0,1))
 axis(1,at=1:4,lab=c(A,B,C,D)
 axis(2,at=.25*0:1)
 plot(Data$Beta,type=o,axes=F,ann=F,ylim=c(0,1))
 axis(1,at=1:4,lab=c(A,B,C,D)
 axis(2,at=.25*0:1)
 etc.

 I would like to automate this as much as possible. I tried to write a
 function, but clearly it would involve some sort of loop... I don't know
 how
 to do anything like that.

 Thanks,
 koraelus
 --
 View this message in context:
 http://old.nabble.com/Automating-Plot-Commands-using-a-Loop-tp26273268p26273268.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem that you are trying to solve?
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://old.nabble.com/Automating-Plot-Commands-using-a-Loop-tp26273268p26289078.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...

2009-11-10 Thread Ravi Varadhan

I think he means rate = 0.008, so he is looking for:

 rgamma(n, shape=0.067, rate=0.008)

Even then his problem is not well-posed.  You cannot have both independent
gamma rv's and have them sum to 2000.

Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of David Winsemius
Sent: Tuesday, November 10, 2009 2:47 PM
To: Hongwei Dong
Cc: R-help Forum; Duncan Murdoch
Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte
Carlo Simulation in R...


On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote:

 Exactly! Thanks, Duncan.

 Let me re-phrase me question like this:

 1) X_i values are independent Gammas, with the shape 0.067 and scale  
 0.008
 2) Min(X)=1 and Max(X)=85

You might want to check that your parameterization in in agreement  
with that used by the rgamma function. Simply using those numbers  
yields a distribution that does not look as though it would get many  
qualifying samples. Here are 20 draws without any exclusions outside a  
range:

   rgamma(20, shape=0.067,  scale = 0.008)
  [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07  
7.680773e-38 6.441082e-15 6.168961e-13
  [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11  
1.852885e-04 4.212802e-07 1.774495e-25
[17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05

http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html


 3) SUM(X)=2000
 4) Do I also have to define the number of draws? if yes, it could be  
 250.

 Based on these restrictions, I want to generate random draw. I'm  
 wondering
 how I can do this in R. Thanks.

 Garry



 On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch  
 murd...@stats.uwo.cawrote:

 On 11/10/2009 1:25 PM, Hongwei Dong wrote:

 Hi, Dear R users,

 I'm wondering if I can do Monte Carlo Simulation in R. My problem  
 is like
 this: I know variable X follows Gamma distribution with shape  
 parameter
 0.067 and scale parameter 0.008. The sum of the X is 2000. I need  
 R help
 me
 to simulate a vector of X that satisfies both the probability  
 distribution
 and the sum. Anyone has a clue to this? Much appreciated.


 Your requirements are slightly contradictory or incomplete.  Here's  
 one way
 to fully specify the problem:

 The X_i values are independent Gammas, with the given shape and  
 scale. You
 want to simulate from the joint distribution conditional on the  
 event sum(X)
 == 2000.

 Is that your problem?  I don't know how to do the simulation, but  
 maybe
 someone else does.

 Duncan Murdoch


   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...

2009-11-10 Thread Ravi Varadhan

May be you are interested in the first `n' for which the sum of iid gamma
rvs exceeds 2000, subject to the min-max constraints on each rv. 

If so, the following one-liner will give it to you:

which(cumsum(pmax(1, pmin(rgamma(500, shape=0.067, rate=0.008), 85))) 
2000)[1]

Note that I have used a slightly inefficient code (a somewhat generous N of
500), but this permits economy of code (on-liner!).

Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Hongwei Dong
Sent: Tuesday, November 10, 2009 2:57 PM
To: David Winsemius
Cc: R-help Forum
Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte
Carlo Simulation in R...

Thanks.
I tried the rgamma function too. But I'm still wondering how I can set the
min, max, and sum of the variates created by the random draws. Anyone has a
clue? Thanks.

Garry



On Tue, Nov 10, 2009 at 11:47 AM, David Winsemius
dwinsem...@comcast.netwrote:


 On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote:

  Exactly! Thanks, Duncan.

 Let me re-phrase me question like this:

 1) X_i values are independent Gammas, with the shape 0.067 and scale
0.008
 2) Min(X)=1 and Max(X)=85


 You might want to check that your parameterization in in agreement with
 that used by the rgamma function. Simply using those numbers yields a
 distribution that does not look as though it would get many qualifying
 samples. Here are 20 draws without any exclusions outside a range:

   rgamma(20, shape=0.067,  scale = 0.008)
  [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07
 7.680773e-38 6.441082e-15 6.168961e-13
  [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11
 1.852885e-04 4.212802e-07 1.774495e-25
 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05

 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html



  3) SUM(X)=2000
 4) Do I also have to define the number of draws? if yes, it could be 250.

 Based on these restrictions, I want to generate random draw. I'm
wondering
 how I can do this in R. Thanks.

 Garry



 On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch murd...@stats.uwo.ca
 wrote:

  On 11/10/2009 1:25 PM, Hongwei Dong wrote:

  Hi, Dear R users,

 I'm wondering if I can do Monte Carlo Simulation in R. My problem is
 like
 this: I know variable X follows Gamma distribution with shape parameter
 0.067 and scale parameter 0.008. The sum of the X is 2000. I need R
help
 me
 to simulate a vector of X that satisfies both the probability
 distribution
 and the sum. Anyone has a clue to this? Much appreciated.


 Your requirements are slightly contradictory or incomplete.  Here's one
 way
 to fully specify the problem:

 The X_i values are independent Gammas, with the given shape and scale.
 You
 want to simulate from the joint distribution conditional on the event
 sum(X)
 == 2000.

 Is that your problem?  I don't know how to do the simulation, but maybe
 someone else does.

 Duncan Murdoch


[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Formatted contingency tables with (%)



On Nov 10, 2009, at 3:07 PM, soeren.vo...@eawag.ch wrote:


Quite often, I need those tables:

x - sample(c(a, b, c), 40, rep=T)
y - sample(c(X, Y), 40, rep=T)
(tbl - table(x, y))
(z - as.factor(paste(as.vector(tbl),  (,  
round(prop.table(as.vector(tbl)) * 100, 1), %), sep=)))

matrix(as.factor(z), nrow=3, dimnames=dimnames(tbl))

But the result looks ugly and is not copypaste-able for LaTeX  
verbatim or table environment, moreover, the \ is not what I want  
in the printout. How to achieve:


  y
x  X  Y
a  3  (7.5%)   7 (17.5%)
b  9 (22.5%)   5 (12.5%)
c  6 (15.0%)  10 (25.0%)

Thank you for help or hints.



In addition to my other thought:

 library(xtable)

 xtable(tbl)
% latex table generated in R 2.10.0 by xtable 1.5-5 package
% Tue Nov 10 15:44:14 2009
\begin{table}[ht]
\begin{center}
\begin{tabular}{rrr}
  \hline
  X  Y \\
  \hline
a6   10 \\
  b   124 \\
  c35 \\
   \hline
\end{tabular}
\end{center}
\end{table}



Sören

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] gsub does not support \b?

2009-11-10 Thread Tan, Richard

Hello, can someone help?  How come
 
 gsub(\bINDS\b,INDUSTRIES,ADVANCED ENERGY INDS)
[1] ADVANCED ENERGY INDS
 
not ADVANCED ENERGY INDUSTRIES
 
Thanks.
Richard

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] creating multiple plots using a splitting factor

2009-11-10 Thread Johnston, Danielle

Thank you for the responses.  The Lattice library is indeed useful for
producing the graphs in which I am interested, and I appreciate the
clarification between a function and the result of a function.  

Ideally, I would like to be able to page through the graphs rather than
(or in addition to) having them displayed on the same page.  Is there a
way to do this?

Danielle B. Johnston, Habitat Researcher
Colorado Division of Wildlife

-Original Message-
From: Phil Spector [mailto:spec...@stat.berkeley.edu] 
Sent: Tuesday, November 10, 2009 12:46 AM
To: Johnston, Danielle
Subject: Re: [R] creating multiple plots using a splitting factor

Danielle -
What you meant to say was

lapply(splitlist, function(x)hist(x$distance_cm, breaks=10))

but you might also be interested in trying

library(lattice)
histogram(~distance_cm|site,data=seeddist2)


- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu




On Mon, 9 Nov 2009, Johnston, Danielle wrote:

 Hello,



 I am new to R.  I often collect data at multiple sites and need to
 create separate graphs (such as scatterplots or histograms) of
specific
 variables for each site.  I have tried to do this by splitting the
data
 frame and then using lapply, but it seems that the graphing commands
 cannot be called as functions.  Here is a sample of my data, called
 seeddist2:



   siteDaysSinceRelease   distance_cm

 10  GVM   1   17.8

 11  GVM   1   17.8

 12  GVM   1   14.0

 13  GVM   1   14.0

 14  GVM   1   14.9

 15  GVM   1   25.4

 16  WRR   1   25.4

 17  WRR   1   35.0

 18  WRR   1   45.0

 19  WRR   1   55.0

 20  WRR   1   60.0



 Here is what I tried to get separate histograms of distance_cm by
site:

 splitlist- split(seeddist2, site)

 lapply(splitlist, hist(distance_cm, breaks=10))



 I then get an error message saying that match.fun didn't find the
 function.  Is there another way to produce multiple graphs at once?



 Thank you,



 Danielle B. Johnston, Habitat Researcher

 Colorado Division of Wildlife




   [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R process gets killed spontaneously

2009-11-10 Thread rmailbox

This was happening to me on Red Hat Linux when I was running huge jobs within a 
screen session. By any chance are your R processes running within a screen 
session? (screen is a very nice program that will keep your sessions alive 
after you log out, but it was killing off my big memory jobs for unknown 
reasons.)

Eric



- Original message -
From: Peng Yu pengyu...@gmail.com
To: r-h...@stat.math.ethz.ch
Date: Tue, 10 Nov 2009 10:23:18 -0600
Subject: Re: [R] R process gets killed spontaneously

2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de:


 Peng Yu wrote:

 My R process has been killed for a few times, although the system
 administrator did not do so. It happened when R attempted to allocate
 a lot of memory. I'm wondering whether R would spontaneously kill
 itself if it can not allocate enough memory?


 It does not kill itself. If it was not some out-of-memory kill process of
 your OS, then you may have seen some segfault. We need reproducible code and
 all the details described in the posting guide.

Please see this thread about the necessary information.

http://www.mail-archive.com/r-help@r-project.org/msg74804.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] All possible combinations of functions within a function

2009-11-10 Thread Michael L. Treglia

Thank you Baptiste and Jim- I look forward to trying these ideas out
when I have a chance.

Mike

baptiste auguie wrote:

Hi,

From what I understand of your code, you might find the following
construct useful,

funs - c(mean, sum, sd, diff)
x - 1:10
lapply(funs, do.call, args=list(x))

and then working with lists rather than naming every object
individually. You might find mapply useful too when you have to pass
several parameters.

HTH,

baptiste

2009/11/10 bikemike42 ml...@tamu.edu:

Dear All,

Any help would be appreciated! I'm pretty new to writing my own functions
but I see great time-saving potential.

--
Michael L. Treglia
Graduate Student
Wildlife and Fisheries Sciences
Texas AM University
Lab: (979)862-7245
Cell: (917)841-5603
ml...@tamu.edu
http://people.tamu.edu/~mlt35

[R] standardGeneric seems slow; any way to get around it?

2009-11-10 Thread John Tillinghast

Hi,
I'm running some routines with standard matrix operations like solve() and
diag().
When I do a profile, the lead item under total time is standardGeneric().
Furthermore, solve() and diag() have much greater total time than self time.
???
I assume there is some time-consuming decision going on in the usual
functions;
is there any way to avoid that and go straight to the calculaions?

Thanks
John Tillinghast

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R process gets killed spontaneously

2009-11-10 Thread Peng Yu

I run R in gnome-terminal. Is it what you referred as 'screen session'?

On Tue, Nov 10, 2009 at 3:01 PM,  rmail...@justemail.net wrote:
 This was happening to me on Red Hat Linux when I was running huge jobs within 
 a screen session. By any chance are your R processes running within a screen 
 session? (screen is a very nice program that will keep your sessions alive 
 after you log out, but it was killing off my big memory jobs for unknown 
 reasons.)

 Eric



 - Original message -
 From: Peng Yu pengyu...@gmail.com
 To: r-h...@stat.math.ethz.ch
 Date: Tue, 10 Nov 2009 10:23:18 -0600
 Subject: Re: [R] R process gets killed spontaneously

 2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de:


 Peng Yu wrote:

 My R process has been killed for a few times, although the system
 administrator did not do so. It happened when R attempted to allocate
 a lot of memory. I'm wondering whether R would spontaneously kill
 itself if it can not allocate enough memory?


 It does not kill itself. If it was not some out-of-memory kill process of
 your OS, then you may have seen some segfault. We need reproducible code and
 all the details described in the posting guide.

 Please see this thread about the necessary information.

 http://www.mail-archive.com/r-help@r-project.org/msg74804.html

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Formatted contingency tables with (%)



On Nov 10, 2009, at 3:07 PM, soeren.vo...@eawag.ch wrote:


Quite often, I need those tables:

x - sample(c(a, b, c), 40, rep=T)
y - sample(c(X, Y), 40, rep=T)
(tbl - table(x, y))
(z - as.factor(paste(as.vector(tbl),  (,  
round(prop.table(as.vector(tbl)) * 100, 1), %), sep=)))

matrix(as.factor(z), nrow=3, dimnames=dimnames(tbl))

But the result looks ugly and is not copypaste-able for LaTeX  
verbatim or table environment, moreover, the \ is not what I want  
in the printout. How to achieve:


  y
x  X  Y
a  3  (7.5%)   7 (17.5%)
b  9 (22.5%)   5 (12.5%)
c  6 (15.0%)  10 (25.0%)

Thank you for help or hints.



 library(gmodels)
 CrossTable(tbl, prop.c=F, prop.r=F, prop.chisq=F)  # author, Marc  
Schwartz



   Cell Contents
|-|
|   N |
| N / Table Total |
|-|


Total Observations in Table:  40


 | y
   x | X | Y | Row Total |
-|---|---|---|
   a | 6 |10 |16 |
 | 0.150 | 0.250 |   |
-|---|---|---|
   b |12 | 4 |16 |
 | 0.300 | 0.100 |   |
-|---|---|---|
   c | 3 | 5 | 8 |
 | 0.075 | 0.125 |   |
-|---|---|---|
Column Total |21 |19 |40 |
-|---|---|---|

--

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] gsub does not support \b?

2009-11-10 Thread Tan, Richard

Ok, I figured it out.  My stupid mistake, should be \\b instead of \b.

From: Tan, Richard 
Sent: Tuesday, November 10, 2009 3:36 PM
To: 'r-help@r-project.org'
Subject: gsub does not support \b?

Hello, can someone help?  How come

 gsub(\bINDS\b,INDUSTRIES,ADVANCED ENERGY INDS)
[1] ADVANCED ENERGY INDS

not ADVANCED ENERGY INDUSTRIES

Thanks.
Richard

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] R echo code chunk runs off the page using Lyx and Sweave

2009-11-10 Thread Mark Connolly

I am not really sure where in the interactions this is handled, but I 
would like to keep echo-ed R code chunks from running past the right 
margin and off the page.  I started with R and options(width=n), but 
this does not seem to do anything (in the context of a document -- line 
command works just fine).   I have beating my head against different Lyx 
document settings without anything to show.  Anyone point me in the 
right direction?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...

Sorry for the confusion.

Let me put it in this way. Here we have 2000 people and we want to put them
into 150 groups. The distribution of the group size follows the Gamma
distribution with shape parameter 0.067 and scale parameter 0.008. At the
same time, the minimum group size is 1, and the largest one should not be
bigger than 85.

My questions:  Can I generate a set of groups that follow the above rules by
generating random draws?

By the way, I also confused by the rate and scale parameters in R. I did the
distribution test in SPSS and got those shape and scale parameters. In SPSS
Q-Q plot, the scale parameter is 0.008. I noticed that someone mentioned
that this is actually the rate. I'm confused.


Thanks a lot.
Garry




On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan rvarad...@jhmi.edu wrote:

 I think he means rate = 0.008, so he is looking for:

  rgamma(n, shape=0.067, rate=0.008)

 Even then his problem is not well-posed.  You cannot have both
 independent
 gamma rv's and have them sum to 2000.

 Ravi.

 
 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: rvarad...@jhmi.edu

 Webpage:

 http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
 tml




 
 


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On
 Behalf Of David Winsemius
 Sent: Tuesday, November 10, 2009 2:47 PM
 To: Hongwei Dong
 Cc: R-help Forum; Duncan Murdoch
 Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte
 Carlo Simulation in R...


 On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote:

  Exactly! Thanks, Duncan.
 
  Let me re-phrase me question like this:
 
  1) X_i values are independent Gammas, with the shape 0.067 and scale
  0.008
  2) Min(X)=1 and Max(X)=85

 You might want to check that your parameterization in in agreement
 with that used by the rgamma function. Simply using those numbers
 yields a distribution that does not look as though it would get many
 qualifying samples. Here are 20 draws without any exclusions outside a
 range:

rgamma(20, shape=0.067,  scale = 0.008)
  [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07
 7.680773e-38 6.441082e-15 6.168961e-13
  [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11
 1.852885e-04 4.212802e-07 1.774495e-25
 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05

 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html


  3) SUM(X)=2000
  4) Do I also have to define the number of draws? if yes, it could be
  250.
 
  Based on these restrictions, I want to generate random draw. I'm
  wondering
  how I can do this in R. Thanks.
 
  Garry
 
 
 
  On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch
  murd...@stats.uwo.cawrote:
 
  On 11/10/2009 1:25 PM, Hongwei Dong wrote:
 
  Hi, Dear R users,
 
  I'm wondering if I can do Monte Carlo Simulation in R. My problem
  is like
  this: I know variable X follows Gamma distribution with shape
  parameter
  0.067 and scale parameter 0.008. The sum of the X is 2000. I need
  R help
  me
  to simulate a vector of X that satisfies both the probability
  distribution
  and the sum. Anyone has a clue to this? Much appreciated.
 
 
  Your requirements are slightly contradictory or incomplete.  Here's
  one way
  to fully specify the problem:
 
  The X_i values are independent Gammas, with the given shape and
  scale. You
  want to simulate from the joint distribution conditional on the
  event sum(X)
  == 2000.
 
  Is that your problem?  I don't know how to do the simulation, but
  maybe
  someone else does.
 
  Duncan Murdoch
 
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to suppress the output from stepAIC?

2009-11-10 Thread mohamed . lajnef


Hi Jack,

try stepAIC with trace parameter: stepAIC(...,trace=FALSE)

Regards,
M

Jack Luo jluo.rh...@gmail.com a écrit :


Hi,

I am now running a cross-validation using coxph coupled with stepAIC for
model selection, is there anyway to suppress the output? It's too much.

-Jack

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R process gets killed spontaneously

2009-11-10 Thread Cedrick W. Johnson


I think he was referring to the actual 'screen' command..

but, I digress... Is there any way you can put your commands in a script 
and execute them from the command line so that you see the actual memory 
and GC output from R in realtime? I use the following (in WinXP) to 
debug any faulty processes or detect if the job was actually completed 
successfully:


(adapted for Linux, *not* winxp)
R --verbose MorningStartup.r  morningsummary-log.txt

That should yield some more results which hopefully should help you 
investigate. The pertinent stuff is output to the console.


-cedrick


Peng Yu wrote:

I run R in gnome-terminal. Is it what you referred as 'screen session'?

On Tue, Nov 10, 2009 at 3:01 PM,  rmail...@justemail.net wrote:
  

This was happening to me on Red Hat Linux when I was running huge jobs within a 
screen session. By any chance are your R processes running within a screen 
session? (screen is a very nice program that will keep your sessions alive 
after you log out, but it was killing off my big memory jobs for unknown 
reasons.)

Eric



- Original message -
From: Peng Yu pengyu...@gmail.com
To: r-h...@stat.math.ethz.ch
Date: Tue, 10 Nov 2009 10:23:18 -0600
Subject: Re: [R] R process gets killed spontaneously

2009/11/10 Uwe Ligges lig...@statistik.tu-dortmund.de:


Peng Yu wrote:
  

My R process has been killed for a few times, although the system
administrator did not do so. It happened when R attempted to allocate
a lot of memory. I'm wondering whether R would spontaneously kill
itself if it can not allocate enough memory?


It does not kill itself. If it was not some out-of-memory kill process of
your OS, then you may have seen some segfault. We need reproducible code and
all the details described in the posting guide.
  

Please see this thread about the necessary information.

http://www.mail-archive.com/r-help@r-project.org/msg74804.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...

By the way, maybe the number of groups can be determined endogenously. It
will be better if I do not have to set the total number of groups
exogenously.

Thanks
Garry

On Tue, Nov 10, 2009 at 1:29 PM, Hongwei Dong pdxd...@gmail.com wrote:

 Sorry for the confusion.

 Let me put it in this way. Here we have 2000 people and we want to put them
 into 150 groups. The distribution of the group size follows the Gamma
 distribution with shape parameter 0.067 and scale parameter 0.008. At the
 same time, the minimum group size is 1, and the largest one should not be
 bigger than 85.

 My questions:  Can I generate a set of groups that follow the above rules
 by generating random draws?

 By the way, I also confused by the rate and scale parameters in R. I did
 the distribution test in SPSS and got those shape and scale parameters. In
 SPSS Q-Q plot, the scale parameter is 0.008. I noticed that someone
 mentioned that this is actually the rate. I'm confused.


 Thanks a lot.
 Garry




 On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan rvarad...@jhmi.eduwrote:

 I think he means rate = 0.008, so he is looking for:

  rgamma(n, shape=0.067, rate=0.008)

 Even then his problem is not well-posed.  You cannot have both
 independent
 gamma rv's and have them sum to 2000.

 Ravi.

 
 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: rvarad...@jhmi.edu

 Webpage:

 http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
 tmlhttp://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.html




 
 


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On
 Behalf Of David Winsemius
 Sent: Tuesday, November 10, 2009 2:47 PM
 To: Hongwei Dong
 Cc: R-help Forum; Duncan Murdoch
 Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte
 Carlo Simulation in R...


 On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote:

  Exactly! Thanks, Duncan.
 
  Let me re-phrase me question like this:
 
  1) X_i values are independent Gammas, with the shape 0.067 and scale
  0.008
  2) Min(X)=1 and Max(X)=85

 You might want to check that your parameterization in in agreement
 with that used by the rgamma function. Simply using those numbers
 yields a distribution that does not look as though it would get many
 qualifying samples. Here are 20 draws without any exclusions outside a
 range:

rgamma(20, shape=0.067,  scale = 0.008)
  [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07
 7.680773e-38 6.441082e-15 6.168961e-13
  [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11
 1.852885e-04 4.212802e-07 1.774495e-25
 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05

 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html


  3) SUM(X)=2000
  4) Do I also have to define the number of draws? if yes, it could be
  250.
 
  Based on these restrictions, I want to generate random draw. I'm
  wondering
  how I can do this in R. Thanks.
 
  Garry
 
 
 
  On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch
  murd...@stats.uwo.cawrote:
 
  On 11/10/2009 1:25 PM, Hongwei Dong wrote:
 
  Hi, Dear R users,
 
  I'm wondering if I can do Monte Carlo Simulation in R. My problem
  is like
  this: I know variable X follows Gamma distribution with shape
  parameter
  0.067 and scale parameter 0.008. The sum of the X is 2000. I need
  R help
  me
  to simulate a vector of X that satisfies both the probability
  distribution
  and the sum. Anyone has a clue to this? Much appreciated.
 
 
  Your requirements are slightly contradictory or incomplete.  Here's
  one way
  to fully specify the problem:
 
  The X_i values are independent Gammas, with the given shape and
  scale. You
  want to simulate from the joint distribution conditional on the
  event sum(X)
  == 2000.
 
  Is that your problem?  I don't know how to do the simulation, but
  maybe
  someone else does.
 
  Duncan Murdoch
 
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...

2009-11-10 Thread Ravi Varadhan

If the number of groups can be set endogenously my previous email about
the smallest `n' would apply.  You can view this as a waiting time
problem.  Here is one approach:

x - round(pmax(1, pmin(rgamma(500, shape=0.067, rate=0.008), 85)))

csx - cumsum(x)

ind - which(csx  2000)[1]

xg - c(x[1:(ind-1)], 2000 - csx[ind-1])  # group sizes

sum(xg)

length(xg)  # number of groups

However, note that the elements of `xg' are not gamma variables (due to
min-max constraint), and they are all not independent (due to sum
constraint).

Hope this helps,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: rvarad...@jhmi.edu

Webpage:
http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
tml

 





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Hongwei Dong
Sent: Tuesday, November 10, 2009 4:44 PM
To: R-help Forum
Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte
Carlo Simulation in R...

By the way, maybe the number of groups can be determined endogenously. It
will be better if I do not have to set the total number of groups
exogenously.

Thanks
Garry

On Tue, Nov 10, 2009 at 1:29 PM, Hongwei Dong pdxd...@gmail.com wrote:

 Sorry for the confusion.

 Let me put it in this way. Here we have 2000 people and we want to put
them
 into 150 groups. The distribution of the group size follows the Gamma
 distribution with shape parameter 0.067 and scale parameter 0.008. At the
 same time, the minimum group size is 1, and the largest one should not be
 bigger than 85.

 My questions:  Can I generate a set of groups that follow the above rules
 by generating random draws?

 By the way, I also confused by the rate and scale parameters in R. I did
 the distribution test in SPSS and got those shape and scale parameters. In
 SPSS Q-Q plot, the scale parameter is 0.008. I noticed that someone
 mentioned that this is actually the rate. I'm confused.


 Thanks a lot.
 Garry




 On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan rvarad...@jhmi.eduwrote:

 I think he means rate = 0.008, so he is looking for:

  rgamma(n, shape=0.067, rate=0.008)

 Even then his problem is not well-posed.  You cannot have both
 independent
 gamma rv's and have them sum to 2000.

 Ravi.



 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: rvarad...@jhmi.edu

 Webpage:


http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h

tmlhttp://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadh
an.html






 


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On
 Behalf Of David Winsemius
 Sent: Tuesday, November 10, 2009 2:47 PM
 To: Hongwei Dong
 Cc: R-help Forum; Duncan Murdoch
 Subject: Re: [R] Generate Random Draw from Gamma Distribution Re: Monte
 Carlo Simulation in R...


 On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote:

  Exactly! Thanks, Duncan.
 
  Let me re-phrase me question like this:
 
  1) X_i values are independent Gammas, with the shape 0.067 and scale
  0.008
  2) Min(X)=1 and Max(X)=85

 You might want to check that your parameterization in in agreement
 with that used by the rgamma function. Simply using those numbers
 yields a distribution that does not look as though it would get many
 qualifying samples. Here are 20 draws without any exclusions outside a
 range:

rgamma(20, shape=0.067,  scale = 0.008)
  [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07
 7.680773e-38 6.441082e-15 6.168961e-13
  [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11
 1.852885e-04 4.212802e-07 1.774495e-25
 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05

 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html


  3) SUM(X)=2000
  4) Do I also have to define the number of draws? if yes, it could be
  250.
 
  Based on these restrictions, I want to generate random draw. I'm
  wondering
  how I can do this in R. Thanks.
 
  Garry
 
 
 
  On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch
  murd...@stats.uwo.cawrote:
 
  On 11/10/2009 1:25 PM, Hongwei Dong wrote:
 
  Hi, Dear R users,
 
  I'm wondering if I can do Monte Carlo Simulation in R. My problem
  is like
  this: I know variable X follows Gamma distribution with shape
  parameter
  0.067 and scale parameter 0.008. The sum of

Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...


On Nov 10, 2009, at 4:29 PM, Hongwei Dong wrote:

 Sorry for the confusion.

 Let me put it in this way. Here we have 2000 people and we want to  
 put them into 150 groups. The distribution of the group size follows  
 the Gamma distribution with shape parameter 0.067 and scale  
 parameter 0.008. At the same time, the minimum group size is 1, and  
 the largest one should not be bigger than 85.

 My questions:  Can I generate a set of groups that follow the above  
 rules by generating random draws?

 By the way, I also confused by the rate and scale parameters in R. I  
 did the distribution test in SPSS and got those shape and scale  
 parameters. In SPSS Q-Q plot, the scale parameter is 0.008. I  
 noticed that someone mentioned that this is actually the rate. I'm  
 confused.


I think you may want your scale (from whatever dataset it was  
determined) to be 1/0.008

Read this message on an SPSS mailing list:

http://www.listserv.uga.edu/cgi-bin/wa?A2=ind9902L=spssx-lD=0F=PP=7633

 and then look at the parameterization for rgamma in the r help  
pages. I seem to remember that Terry Therneau has also pointed out  
this ambiguous scale parameter issue for gamma in the past.

(This is, by the way, beginning to sound quite a bit like a homework  
problem.)

-- 
David


 Thanks a lot.
 Garry




 On Tue, Nov 10, 2009 at 12:15 PM, Ravi Varadhan rvarad...@jhmi.edu  
 wrote:
 I think he means rate = 0.008, so he is looking for:

  rgamma(n, shape=0.067, rate=0.008)

 Even then his problem is not well-posed.  You cannot have both  
 independent
 gamma rv's and have them sum to 2000.

 Ravi.
 
 ---

 Ravi Varadhan, Ph.D.

 Assistant Professor, The Center on Aging and Health

 Division of Geriatric Medicine and Gerontology

 Johns Hopkins University

 Ph: (410) 502-2619

 Fax: (410) 614-9625

 Email: rvarad...@jhmi.edu

 Webpage:
 http://www.jhsph.edu/agingandhealth/People/Faculty_personal_pages/Varadhan.h
 tml



 
 


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
 ] On
 Behalf Of David Winsemius
 Sent: Tuesday, November 10, 2009 2:47 PM
 To: Hongwei Dong
 Cc: R-help Forum; Duncan Murdoch
 Subject: Re: [R] Generate Random Draw from Gamma Distribution Re:  
 Monte
 Carlo Simulation in R...


 On Nov 10, 2009, at 2:26 PM, Hongwei Dong wrote:

  Exactly! Thanks, Duncan.
 
  Let me re-phrase me question like this:
 
  1) X_i values are independent Gammas, with the shape 0.067 and scale
  0.008
  2) Min(X)=1 and Max(X)=85

 You might want to check that your parameterization in in agreement
 with that used by the rgamma function. Simply using those numbers
 yields a distribution that does not look as though it would get many
 qualifying samples. Here are 20 draws without any exclusions outside a
 range:

rgamma(20, shape=0.067,  scale = 0.008)
  [1] 2.213459e-03 2.815705e-05 2.381306e-04 2.264602e-07 1.293713e-07
 7.680773e-38 6.441082e-15 6.168961e-13
  [9] 5.089033e-06 1.571858e-16 9.869878e-12 1.813121e-13 1.253287e-11
 1.852885e-04 4.212802e-07 1.774495e-25
 [17] 1.892984e-07 5.927422e-17 1.322638e-12 4.327472e-05

 http://finzi.psych.upenn.edu/R/Rhelp02/archive/31459.html


  3) SUM(X)=2000
  4) Do I also have to define the number of draws? if yes, it could be
  250.
 
  Based on these restrictions, I want to generate random draw. I'm
  wondering
  how I can do this in R. Thanks.
 
  Garry
 
 
 
  On Tue, Nov 10, 2009 at 11:17 AM, Duncan Murdoch
  murd...@stats.uwo.cawrote:
 
  On 11/10/2009 1:25 PM, Hongwei Dong wrote:
 
  Hi, Dear R users,
 
  I'm wondering if I can do Monte Carlo Simulation in R. My problem
  is like
  this: I know variable X follows Gamma distribution with shape
  parameter
  0.067 and scale parameter 0.008. The sum of the X is 2000. I need
  R help
  me
  to simulate a vector of X that satisfies both the probability
  distribution
  and the sum. Anyone has a clue to this? Much appreciated.
 
 
  Your requirements are slightly contradictory or incomplete.  Here's
  one way
  to fully specify the problem:
 
  The X_i values are independent Gammas, with the given shape and
  scale. You
  want to simulate from the joint distribution conditional on the
  event sum(X)
  == 2000.
 
  Is that your problem?  I don't know how to do the simulation, but
  maybe
  someone else does.
 
  Duncan Murdoch
 
 
[[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 Heritage Laboratories
 West Hartford, CT

 __
 R-help@r-project.org mailing list

Re: [R] R echo code chunk runs off the page using Lyx and Sweave

2009-11-10 Thread Ista Zahn

options(width=n) is supposed to work, and does for me. I don't use Lyx though...

-Ista
On Tue, Nov 10, 2009 at 4:27 PM, Mark Connolly mark_conno...@acm.org wrote:
 I am not really sure where in the interactions this is handled, but I would
 like to keep echo-ed R code chunks from running past the right margin and
 off the page.  I started with R and options(width=n), but this does not seem
 to do anything (in the context of a document -- line command works just
 fine).   I have beating my head against different Lyx document settings
 without anything to show.  Anyone point me in the right direction?

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R echo code chunk runs off the page using Lyx and Sweave

2009-11-10 Thread Ben Bolker




Ista Zahn wrote:
 
 options(width=n) is supposed to work, and does for me. I don't use Lyx
 though...
 
 -Ista
 On Tue, Nov 10, 2009 at 4:27 PM, Mark Connolly mark_conno...@acm.org
 wrote:
 I am not really sure where in the interactions this is handled, but I
 would
 like to keep echo-ed R code chunks from running past the right margin and
 off the page.  I started with R and options(width=n), but this does not
 seem
 to do anything (in the context of a document -- line command works just
 fine).   I have beating my head against different Lyx document settings
 without anything to show.  Anyone point me in the right direction?
 
 

\SweaveOpts{keep.source=TRUE}

in your LaTeX code will (I think) keep whatever manual formatting you do, 
in all code chunks (or use keep.source=TRUE) for particular code chunks
of concern

-- 
View this message in context: 
http://old.nabble.com/R-echo-code-chunk-runs-off-the-page-using-Lyx-and-Sweave-tp26291535p26292226.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Generate Random Draw from Gamma Distribution Re: Monte Carlo Simulation in R...