Re: [R] cubature

2012-03-25 Thread Jeff Newmiller
Who knows? You mention different machines, yet fail to hint what they are much 
less provide sessionInfo() output, and x is not defined in your global 
environment in your snippet. Please read the posting guide.

I am not familiar with this library, but I would be surprised if it were 
capable of handling variable limits. I didn't see any hint that such would be 
supported in the help page. Are you sure the same code is being used on 
different machines?
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

JMDS jessica.d...@rmit.edu.au wrote:

Hi,

I am using adaptIntegrate from Cubature to do numerical integration on
a
double integral with a 1 x 2 vector x.

Say the function is something simple to start like f(x)=x1*x2 and I
wish to
integrate x1 over (0,365-x2) and x2 over (0,365)

f - function(x) {(x[2])*(x[1])} # x is vector
int1-adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(365-x[2],
365))


I recieve the following error:

Error in adaptIntegrate(f, lowerLimit = c(0, 0), upperLimit = c(365 - 
: 
  object 'x' not found

The problem is that this code works fine on other machines and I wonder
If I
am missing some referenced package somewhere.

Any help would be greatly appreciated.

Thanks!




--
View this message in context:
http://r.789695.n4.nabble.com/cubature-tp4502516p4502516.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to compute within-group mean and sd?

2012-03-25 Thread reeyarn
Thanks Rui, Greg, and Michael.
Your answers helped a lot!

Best,
Reeyarn

On Sun, Mar 25, 2012 at 1:02 AM, Rui Barradas rui1...@sapo.pt wrote:

 Hello,


 You can use sql with package 'sqldf'.
  library(sqldf) # It needs package 'tcltk'
 ?sqldf
  tbl - data.frame(firm_id=rep(c(1,2), 10), value=rnorm(20))
  sqldf(SELECT firm_id, count(*), avg(value), stdev(value)
  FROM tbl
  GROUP BY firm_id;)

 I've changed the name of your data.frame because 'table' is an R function.
  Hope this helps,

 Rui Barradas



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multivariate function from univariate functions

2012-03-25 Thread physicistintheory
I'm relatively new to R and I'm stuck.

I'm trying to construct a surface to optimize from a multivariate dataset. 
The dataset contains the response of a system to various stimuli.  I am
trying to optimize the mix of stimuli to maximize the response.  To do so,
I've interpolated the various datasets of type response vs. stimuli and I
now have an array of functions interps whose length is the length of the
array of the names of the various stimuli.  I've also created a vector
containing names for the stimuli, vars = x.1, x.2, x.3...

Anyway, each of the functions in interps depends only on one variable
(obviously).  I would like to construct a function, call it, surface which
is essentially: surface(vars) = interps[[1]]vars[[1]] +
interps[[2]]vars[[2]]+...

I've tried constructing surface recursively in a for loop:  
surface - function(vars){0}
for(i in 1:length(vars)){
surface - function(vars){surface(vars) + interps[[i]](vars[[i]])}
}

This results in an infinite recursion error...which I think I understand. 
Essentially, I'm trying to do the analog of something like i = i+1, but
instead of adding integers, I want to add terms to a function.

Any help is appreciated.

--
View this message in context: 
http://r.789695.n4.nabble.com/Multivariate-function-from-univariate-functions-tp4502670p4502670.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Handling 8GB .txt file in R?

2012-03-25 Thread iliketurtles
Thanks to all the suggestions. To the first individual that replied, I can't
do any stuff with unix or perl. All I know is R.

@KEN:
I'm using Windows 7, 64 bit.

@Steve:
Here's the readLines output.. As we can see, lines 1-3 are empty and line 5
is empty, and there's also empty elements after line 5!. 

 [1]  


   
  [2]  


  

  [3] 



  [4]   PERMNO  DATETICKERPERMCO PRC  
VOLNUMTRDvwretdewretd  


  [5] 



  [6]106/01/19867952  .
 
. . -0.000138  0.001926

  
  [7]107/01/1986OMFGA   7952-2.56250 
1000 .  0.013809  0.011061 

 
  [8]108/01/1986OMFGA   7952-2.5
12800 . -0.020744 -0.005117

  
  [9]109/01/1986OMFGA   7952-2.5 
1400 . -0.011219 -0.011588 

 
 [10]110/01/1986OMFGA   7952-2.5 
8500 .  0.83  0.003651 

 
 [11]113/01/1986OMFGA   7952-2.62500 
5450 .  0.002749  0.002433 
   

-


Isaac
Research Assistant
Quantitative Finance Faculty, UTS
--
View this message in context: 
http://r.789695.n4.nabble.com/Handling-8GB-txt-file-in-R-tp4500971p4502706.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to install tclRequire(Iwidgets)

2012-03-25 Thread mrzung
hi, i'm trying to make Tabbed Notebook Widget,but cannot install
tclRequire(Iwidgets).

I installed Activetcl but I don't know what can be the next step to execute
tclRequire(Iwidgets).

how can i do that?

Thanks

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-install-tclRequire-Iwidgets-tp4502775p4502775.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Weird POSIXct behaviour

2012-03-25 Thread Worik R
Friends

I have an xts that I wish to access.

Browse[2] DATA.ba[[p]][2012-03-20 00:59:57,bid]
   bid
2012-03-20 00:59:57 1.4993

So far so good.

Now putting the index into a variable:

Browse[2] Time
[1] 2012-03-20 00:59:57 NZDT
Browse[2] DATA.ba[[p]][Time, bid]
 bid

Where has it gone?


Looking closer


Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid])
[1] 2012-03-20 00:59:57 NZDT
Browse[2]

Browse[2] Time
[1] 2012-03-20 00:59:57 NZDT
Browse[2]

Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid]) == Time
[1] FALSE
Browse[2]

Browse[2] class(Time)
[1] POSIXct POSIXt

Browse[2] class(index(DATA.ba[[p]][2012-03-20 00:59:57,bid]))
[1] POSIXct POSIXt

So the variable 'Time' should be good to index DATA.ba[[p]].  I am
hopelessly confused.

cheers
Worik

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to install tclRequire(Iwidgets)

2012-03-25 Thread peter dalgaard

On Mar 25, 2012, at 08:11 , mrzung wrote:

 hi, i'm trying to make Tabbed Notebook Widget,but cannot install
 tclRequire(Iwidgets).
 
 I installed Activetcl but I don't know what can be the next step to execute
 tclRequire(Iwidgets).
 
 how can i do that?

By adding Iwidgets to your Tcl/Tk installation. How to do that is not an R 
issue. (It depends on your platform, so I really don't know, but you might try 
looking at http://incrtcl.sourceforge.net )

 
 Thanks
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/How-to-install-tclRequire-Iwidgets-tp4502775p4502775.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weird POSIXct behaviour

2012-03-25 Thread peter dalgaard
I see NZDT going in and out of your time values. Could that have something to 
do with it?

On Mar 25, 2012, at 10:15 , Worik R wrote:

 Friends
 
 I have an xts that I wish to access.
 
 Browse[2] DATA.ba[[p]][2012-03-20 00:59:57,bid]
   bid
 2012-03-20 00:59:57 1.4993
 
 So far so good.
 
 Now putting the index into a variable:
 
 Browse[2] Time
 [1] 2012-03-20 00:59:57 NZDT
 Browse[2] DATA.ba[[p]][Time, bid]
 bid
 
 Where has it gone?
 
 
 Looking closer
 
 
 Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid])
 [1] 2012-03-20 00:59:57 NZDT
 Browse[2]
 
 Browse[2] Time
 [1] 2012-03-20 00:59:57 NZDT
 Browse[2]
 
 Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid]) == Time
 [1] FALSE
 Browse[2]
 
 Browse[2] class(Time)
 [1] POSIXct POSIXt
 
 Browse[2] class(index(DATA.ba[[p]][2012-03-20 00:59:57,bid]))
 [1] POSIXct POSIXt
 
 So the variable 'Time' should be good to index DATA.ba[[p]].  I am
 hopelessly confused.
 
 cheers
 Worik
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Handling 8GB .txt file in R?

2012-03-25 Thread Jan van der Laan


What you could try to do is skip the first 5 lines. After that the file 
seems to be 'normal'. With read.table.ffdf you could try something like


# open a connection to the file
con - file('yourfile', 'rt')
# skip first 5 lines
tmp - readLines(con, n=5)
# read the remainder using read.table.ffdf
ffdf - read.table.ffdf(file=con)
# close connection
close(con)

HTH

Jan

On 03/25/2012 06:20 AM, iliketurtles wrote:

Thanks to all the suggestions. To the first individual that replied, I can't
do any stuff with unix or perl. All I know is R.

@KEN:
I'm using Windows 7, 64 bit.

@Steve:
Here's the readLines output.. As we can see, lines 1-3 are empty and line 5
is empty, and there's also empty elements after line 5!.

  [1]  
   [2] 

   [3]  
   [4]   PERMNO  DATETICKERPERMCO PRC
VOLNUMTRDvwretdewretd
   [5] 
   [6]106/01/19867952  .
. . -0.000138  0.001926
   [7]107/01/1986OMFGA   7952-2.56250
1000 .  0.013809  0.011061
   [8]108/01/1986OMFGA   7952-2.5
12800 . -0.020744 -0.005117
   [9]109/01/1986OMFGA   7952-2.5
1400 . -0.011219 -0.011588
  [10]110/01/1986OMFGA   7952-2.5
8500 .  0.83  0.003651
  [11]113/01/1986OMFGA   7952-2.62500
5450 .  0.002749  0.002433

-


Isaac
Research Assistant
Quantitative Finance Faculty, UTS
--
View this message in context: 
http://r.789695.n4.nabble.com/Handling-8GB-txt-file-in-R-tp4500971p4502706.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to test omitted level from a multiple level factor against overall mean in regression models?

2012-03-25 Thread Biedermann, Jürgen
Hi there,

I have a linear model with one factor having three levels.
I want to check if the different levels significantly differ from the overall 
mean (using contr.sum).
However one level (the last) is omitted in the standard procedure.

To illustrate this:

x - as.factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3))
y - 
c(1.1,1.15,1.2,1.1,1.1,1.1,1.2,1.2,1.2,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3,3.1)
test - data.frame(x,y)
reg1 - lm(y~C(x,contr.sum),data=test)
summary(reg1)

Coefficients:
 Estimate Std. Error t value Pr(|t|)   
(Intercept)   1.60.06577  24.834 8.48e-15 ***
C(x, contr.sum)1 -0.483330.10792  -4.479  0.00033 ***
C(x, contr.sum)2 -0.483330.08936  -5.409 4.70e-05 ***

Is it possible to get the effect for the third level (against the overall mean) 
in the table too.

I figured out:

reg2 - lm(y~C(relevel(x,3),contr.sum),data=test)
summary(reg2)

C(relevel(x, 3), contr.sum)1  0.966670.07951  12.158 8.24e-10 ***
C(relevel(x, 3), contr.sum)2 -0.483330.10792  -4.479  0.00033 ***


The first row now test the third level against the overall mean, but I find 
this approach not so convenient.
Moreover, I wonder if it is meaningful at all regarding the cumulation of alpha 
error. Would a Bonferroni correction be sensible?

Greetings and thanks in advance
Jürgen
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] output by(...)

2012-03-25 Thread INCOMA GfK
Dear all,

I have a question that is probably pretty stupid, so apologies in advance...

I do a simple

 mydata.tab - by(my.data.frame, my.data.frame$category, colMeans)

...works fine, but I need to output the results to some flat file (kind of 
table) to work with it in Excel etc.

So I am doing now

 capture.output(data.frame(unlist(mydata.tab)), file=mydata.txt)

...and process the result in Excel.
Do you know a more pretty way to do this task? Perhaps something other than 
'by()' to make a table of colMeans?

Many thanks!
Zdenek

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reading big files in chunks-ff package

2012-03-25 Thread Jan van der Laan


Your question is not completely clear. read.csv.ffdf  automatically 
reads in the data in chunks. You don´t have to do anything for that. You 
can specify the size of the chunks using the next.rows option.


Jan


On 03/24/2012 09:29 PM, Mav wrote:

Hello!
A question about reading large CSV files

I need to analyse several files with  sizes larger than 3 GB. Those files
have more than 10million rows (and up to 25 million) and 9 columns. Since I
don´t have a large RAM memory,  I think that the ff package can really help
me. I am trying to use read.csv.ffdf but I have some questions:

How can I read the files in several chunks…with an automatic way of
calculating the number of rows to include in each chunk? (my problem is that
the files have different number of rows)

For instance…. I have used
read.csv.ffdf(NULL, “file.csv”, sep=|, dec=.,header = T,row.names =
NULL,colClasses = c(rep(integer, 3), rep(integer, 10), rep(integer,
6)))
  But with this way I am reading the whole fileI would prefer to read it
in chunksbut I don´t know  how to read it in chunks

I have read the ff documentation but I am not good with R!

Thanks in advance!

--
View this message in context: 
http://r.789695.n4.nabble.com/Reading-big-files-in-chunks-ff-package-tp4502070p4502070.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] output by(...)

2012-03-25 Thread Gabor Grothendieck
2012/3/25 Skála, Zdeněk (INCOMA GfK) zdenek.sk...@gfk.com:
 Dear all,

 I have a question that is probably pretty stupid, so apologies in advance...

 I do a simple

 mydata.tab - by(my.data.frame, my.data.frame$category, colMeans)

 ...works fine, but I need to output the results to some flat file (kind of 
 table) to work with it in Excel etc.

 So I am doing now

 capture.output(data.frame(unlist(mydata.tab)), file=mydata.txt)

 ...and process the result in Excel.
 Do you know a more pretty way to do this task? Perhaps something other than 
 'by()' to make a table of colMeans?


Try this:

do.call(rbind, by(iris[-5], iris[[5]], colMeans))




-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] output by(...)

2012-03-25 Thread peter dalgaard

On Mar 25, 2012, at 12:27 , Gabor Grothendieck wrote:

 2012/3/25 Skála, Zdeněk (INCOMA GfK) zdenek.sk...@gfk.com:
 Dear all,
 
 I have a question that is probably pretty stupid, so apologies in advance...
 
 I do a simple
 
 mydata.tab - by(my.data.frame, my.data.frame$category, colMeans)
 
 ...works fine, but I need to output the results to some flat file (kind of 
 table) to work with it in Excel etc.
 
 So I am doing now
 
 capture.output(data.frame(unlist(mydata.tab)), file=mydata.txt)
 
 ...and process the result in Excel.
 Do you know a more pretty way to do this task? Perhaps something other than 
 'by()' to make a table of colMeans?
 
 
 Try this:
 
 do.call(rbind, by(iris[-5], iris[[5]], colMeans))
 
 

How about 

aggregate(iris[-5], iris[5], mean)

?

 
 
 -- 
 Statistics  Software Consulting
 GKX Group, GKX Associates Inc.
 tel: 1-877-GKX-GROUP
 email: ggrothendieck at gmail.com
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to test omitted level from a multiple level factor against overall mean in regression models?

2012-03-25 Thread Gabor Grothendieck
2012/3/25 Biedermann, Jürgen juergen.biederm...@charite.de:
 Hi there,

 I have a linear model with one factor having three levels.
 I want to check if the different levels significantly differ from the overall 
 mean (using contr.sum).
 However one level (the last) is omitted in the standard procedure.

 To illustrate this:

 x - as.factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3))
 y - 
 c(1.1,1.15,1.2,1.1,1.1,1.1,1.2,1.2,1.2,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3,3.1)
 test - data.frame(x,y)
 reg1 - lm(y~C(x,contr.sum),data=test)
 summary(reg1)

 Coefficients:
                 Estimate Std. Error t value Pr(|t|)
 (Intercept)       1.6    0.06577  24.834 8.48e-15 ***
 C(x, contr.sum)1 -0.48333    0.10792  -4.479  0.00033 ***
 C(x, contr.sum)2 -0.48333    0.08936  -5.409 4.70e-05 ***

 Is it possible to get the effect for the third level (against the overall 
 mean) in the table too.

 I figured out:

 reg2 - lm(y~C(relevel(x,3),contr.sum),data=test)
 summary(reg2)

 C(relevel(x, 3), contr.sum)1  0.96667    0.07951  12.158 8.24e-10 ***
 C(relevel(x, 3), contr.sum)2 -0.48333    0.10792  -4.479  0.00033 ***


 The first row now test the third level against the overall mean, but I find 
 this approach not so convenient.
 Moreover, I wonder if it is meaningful at all regarding the cumulation of 
 alpha error. Would a Bonferroni correction be sensible?


Try this:

 options(contrasts = c(contr.sum, contr.poly))
 reg1 - lm(y~x,data=test)
 dummy.coef(reg1)
Full coefficients are

(Intercept):  1.63
x:   1  2  3
-0.483 -0.483  0.967

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Format wanted...

2012-03-25 Thread Duncan Murdoch

On 12-03-24 10:47 PM, J Toll wrote:

On Sat, Mar 24, 2012 at 7:30 PM, Duncan Murdoch
murdoch.dun...@gmail.com  wrote:

Do we have a format that always includes a decimal point and a given number
of significant digits, but otherwise drops unnecessary characters?  For
example, if I wanted 5 digits, I'd want the following:

Round to 5 digits:
1.234567  -  1.2346

Drop unnecessary zeros:
1.23  -  1.23

Force inclusion of a decimal point:
1 -  1.



Duncan,

Maybe sprintf() will work for you.  As it's a wrapper for C sprintf,
it should have its functionality.


Maybe, but with which format string?

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Format wanted...

2012-03-25 Thread Marc Schwartz

On Mar 25, 2012, at 7:14 AM, Duncan Murdoch wrote:

 On 12-03-24 10:47 PM, J Toll wrote:
 On Sat, Mar 24, 2012 at 7:30 PM, Duncan Murdoch
 murdoch.dun...@gmail.com  wrote:
 Do we have a format that always includes a decimal point and a given number
 of significant digits, but otherwise drops unnecessary characters?  For
 example, if I wanted 5 digits, I'd want the following:
 
 Round to 5 digits:
 1.234567  -  1.2346
 
 Drop unnecessary zeros:
 1.23  -  1.23
 
 Force inclusion of a decimal point:
 1 -  1.
 
 
 Duncan,
 
 Maybe sprintf() will work for you.  As it's a wrapper for C sprintf,
 it should have its functionality.
 
 Maybe, but with which format string?
 
 Duncan Murdoch


I don't believe (though could be wrong), that you can do it all with one format 
string, but can do it conditionally based upon the input. According to the C 
printf documentation, the use of # forces a decimal point to be present, even 
if there are no trailing digits. Thus:

 sprintf(%#.f, 1)
[1] 1.

The other two values seem to be handled by signif() when applied to each value 
individually:

 signif(1.234567, 5)
[1] 1.2346

 signif(1.23, 5)
[1] 1.23

But, not when a vector:

 signif(c(1.234567, 1.23), 5)
[1] 1.2346 1.2300


So, wrapping that inside a function, using ifelse() to test for an integer 
value:

signif.d - function(x, digits)
{
  ifelse(x == round(x), 
 sprintf(%.#f, x), 
 signif(x, digits))
}


x - c(1.234567, 1.23, 1)

 signif.d(x, 5)
[1] 1.2346 1.23   1.  

 signif.d(x, 6)
[1] 1.23457 1.231. 

 signif.d(x, 7)
[1] 1.234567 1.23 1.  


Not extensively tested of course, but hopefully that might work for your needs 
Duncan.

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to test omitted level from a multiple level factor against overall mean in regression models?

2012-03-25 Thread Biedermann, Jürgen
Hi Gabor,

Thanks a lot for the answer. 
However, I'm not so much focusing on the pure effect value of the omitted 
factor level, but more on the statistical test if it
differs significantly from 0.
Do you know a way for this purpose too?

Greetings Jürgen

Von: Gabor Grothendieck [ggrothendi...@gmail.com]
Gesendet: Sonntag, 25. März 2012 14:11
An: Biedermann, Jürgen
Cc: r-help@R-project.org
Betreff: Re: [R] How to test omitted level from a multiple level factor against 
overall mean in regression models?

2012/3/25 Biedermann, Jürgen juergen.biederm...@charite.de:
 Hi there,

 I have a linear model with one factor having three levels.
 I want to check if the different levels significantly differ from the overall 
 mean (using contr.sum).
 However one level (the last) is omitted in the standard procedure.

 To illustrate this:

 x - as.factor(c(1,1,1,2,2,2,2,2,2,3,3,3,3,3,3,3,3,3,3,3))
 y - 
 c(1.1,1.15,1.2,1.1,1.1,1.1,1.2,1.2,1.2,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,3,3.1)
 test - data.frame(x,y)
 reg1 - lm(y~C(x,contr.sum),data=test)
 summary(reg1)

 Coefficients:
 Estimate Std. Error t value Pr(|t|)
 (Intercept)   1.60.06577  24.834 8.48e-15 ***
 C(x, contr.sum)1 -0.483330.10792  -4.479  0.00033 ***
 C(x, contr.sum)2 -0.483330.08936  -5.409 4.70e-05 ***

 Is it possible to get the effect for the third level (against the overall 
 mean) in the table too.

 I figured out:

 reg2 - lm(y~C(relevel(x,3),contr.sum),data=test)
 summary(reg2)

 C(relevel(x, 3), contr.sum)1  0.966670.07951  12.158 8.24e-10 ***
 C(relevel(x, 3), contr.sum)2 -0.483330.10792  -4.479  0.00033 ***


 The first row now test the third level against the overall mean, but I find 
 this approach not so convenient.
 Moreover, I wonder if it is meaningful at all regarding the cumulation of 
 alpha error. Would a Bonferroni correction be sensible?


Try this:

 options(contrasts = c(contr.sum, contr.poly))
 reg1 - lm(y~x,data=test)
 dummy.coef(reg1)
Full coefficients are

(Intercept):  1.63
x:   1  2  3
-0.483 -0.483  0.967

--
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] output by(...)

2012-03-25 Thread INCOMA GfK
perfect, thanks a lot! :o)
-z


Odesílatel: Gabor Grothendieck [ggrothendi...@gmail.com]
Odesláno: 25. března 2012 12:27
Komu: Skála, Zdeněk (INCOMA GfK)
Kopie: r-help@r-project.org
Předmět: Re: [R] output by(...)

2012/3/25 Skála, Zdeněk (INCOMA GfK) zdenek.sk...@gfk.com:
 Dear all,

 I have a question that is probably pretty stupid, so apologies in advance...

 I do a simple

 mydata.tab - by(my.data.frame, my.data.frame$category, colMeans)

 ...works fine, but I need to output the results to some flat file (kind of 
 table) to work with it in Excel etc.

 So I am doing now

 capture.output(data.frame(unlist(mydata.tab)), file=mydata.txt)

 ...and process the result in Excel.
 Do you know a more pretty way to do this task? Perhaps something other than 
 'by()' to make a table of colMeans?


Try this:

do.call(rbind, by(iris[-5], iris[[5]], colMeans))




--
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Work -Shift Scheduling - Constraint Linear Programming

2012-03-25 Thread agent dunham
Dear Community, 

I've a Work -Shift Scheduling Problem I'd like to solve via constraint
linear programming. 

Maybe something similar to
http://support.sas.com/documentation/cdl/en/orcpug/63349/HTML/default/viewer.htm#orcpug_clp_sect037.htm

Can anybody suggest me any package/R examples to solve this?

If it's needed more details of my little problemm I can provide. 

Thanks in advance, u...@host.com as u...@host.com

--
View this message in context: 
http://r.789695.n4.nabble.com/Work-Shift-Scheduling-Constraint-Linear-Programming-tp4503037p4503037.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (no subject)

2012-03-25 Thread Anjana Thampi
How do you decompose inequality in R, say by gender?

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How to calculate deviance on test data

2012-03-25 Thread zmlr
Hi all,

If I got a Cox model based on training set, then how should I calculate the
Cox log partial likelihood for the test data?
Actually I am trying to calculate the deviance on test dataset to evaluate
the performance of prediction model, the equation is as follows: D =
-2{L(test)[beta_train] - L(test)[0]}. It means using the beta coefficients
got from training set to calculate the likelihood of test data. I know I can
got log likelihood for training model, but how to get it for test data?
Any package or function can do that? 


--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-calculate-deviance-on-test-data-tp4503206p4503206.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] SPLS does not output probabilities for predicted classes.

2012-03-25 Thread saskay
Hi,
I was using SPLS package to do multi-class classification and would require
probabilities to be output for each class.
In the vignette, the documentation does say that you can output
probabilities by requesting fit.type = response, however I still only get
predicted classes, rather than the respective probabilities.

library(spls)
data(prostate)
# SPLSDA with eta=0.8  3 hidden components
f - splsda( prostate$x, prostate$y, K=3, eta=0.8, scale.x=FALSE )
# Prediction on the training dataset
(pred.f - predict( f, type=fit ))
#predict probabilities
pred.f_withprobs - predict( f, type=fit, fit.type = response )



--
View this message in context: 
http://r.789695.n4.nabble.com/SPLS-does-not-output-probabilities-for-predicted-classes-tp4503031p4503031.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multivariate function from univariate functions

2012-03-25 Thread Mitchell Maltenfort
Naïve question: would a saturated multivariate model work as an
interpolation function?

On 3/24/12, physicistintheory physicistinthe...@gmail.com wrote:
 I'm relatively new to R and I'm stuck.

 I'm trying to construct a surface to optimize from a multivariate dataset.
 The dataset contains the response of a system to various stimuli.  I am
 trying to optimize the mix of stimuli to maximize the response.  To do so,
 I've interpolated the various datasets of type response vs. stimuli and I
 now have an array of functions interps whose length is the length of the
 array of the names of the various stimuli.  I've also created a vector
 containing names for the stimuli, vars = x.1, x.2, x.3...

 Anyway, each of the functions in interps depends only on one variable
 (obviously).  I would like to construct a function, call it, surface which
 is essentially: surface(vars) = interps[[1]]vars[[1]] +
 interps[[2]]vars[[2]]+...

 I've tried constructing surface recursively in a for loop:
 surface - function(vars){0}
 for(i in 1:length(vars)){
 surface - function(vars){surface(vars) + interps[[i]](vars[[i]])}
 }

 This results in an infinite recursion error...which I think I understand.
 Essentially, I'm trying to do the analog of something like i = i+1, but
 instead of adding integers, I want to add terms to a function.

 Any help is appreciated.

 --
 View this message in context:
 http://r.789695.n4.nabble.com/Multivariate-function-from-univariate-functions-tp4502670p4502670.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


-- 
Sent from my mobile device


Ersatzistician and Chutzpahthologist
I can answer any question.  I don't know is an answer. I don't know yet
is a better answer.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparametric bivariate distribution estimation and sampling

2012-03-25 Thread heyi xiao
David,
Great idea! It is simple yet very effective. I didn’t know that sample can be 
used with probability/weight. I feel that I learned a lot. Thanks a lot for the 
help!
Heyi


--- On Fri, 3/23/12, David Winsemius dwinsem...@comcast.net wrote:

 From: David Winsemius dwinsem...@comcast.net
 Subject: Re: [R] Nonparametric bivariate distribution estimation and sampling
 To: heyi xiao xiaohey...@yahoo.com
 Cc: Sarah Goslee sarah.gos...@gmail.com, r-help@r-project.org
 Date: Friday, March 23, 2012, 5:29 PM
 
 On Mar 23, 2012, at 3:55 PM, heyi xiao wrote:
 
  David,
  Thanks a lot for the specific suggestions. That’s
 very helpful. My question 1 is fully answered now. I guess I
 am not clear enough for my question 2. I would like to
 generate a random sample using the estimated probability
 density (as a result of my question 1) as the reference
 distribution.
 
  Say, I get a matrix of the estimated density (at some
 grid points) using MASS::kde2d. How can I use that result as
 a reference distribution to sample data from? I know it is a
 trivial issue for parametric distributions like bivariate
 normal, but what about such a nonparametric bivariate
 reference distribution? Any particular procedures or
 functions I can use?
 
 See if this works:
 
 data(geyser, package=MASS)
 x - cbind(geyser$duration, geyser$waiting)
 est - bkde2D(x, bandwidth=c(0.7, 7))
 
 # Heh, I realized after I did this that I started with
 KernSmooth::bkde2D
 # and checked the results with MASS::kde2d
 # only difference appears to be name of density matrix
 # Construct a dataframe with X.Y information and the data
 from the bivariate density.
 # The output of bkde2D with n=50 is:
 
 #List of 3
  #$ x1  : num [1:51] -0.2167 -0.0823 0.052 0.1863
 0.3207 ...
 # $ x2  : num [1:51] 32.5 34.2 35.9 37.7 39.4 ...
 # $ fhat: num [1:51, 1:51] 3.05e-19 2.17e-19 3.25e-19
 2.17e-19 0.00 ...
 
 # The index X.Y could be X + 51*Y and there would be a 1:1
 mapping from (X,Y) to X.Y
 # and the fhat values would be properly arranged
 
 dfrm - expand.grid(X=1:51, Y=1:51)
 dfrm$fhat - c(est$fhat)
 
 #Sample randomly from X.Y with length=51*51 using the fhat
 values for prob.
 
 # The X.Y index never actually gets computed
 # but is implicit in the order of the data.frame
  sampfrm - dfrm[sample(51*51, 300, prob=est$fhat) , ]
  f2 - with(sampfrm,  MASS::kde2d(X, Y, n = 50, lims
 = c(0, 51, 0, 51)) )
  persp(f2)
 
 # Looks reasonable to my eye anyway.
 
 --David.
 
  The reason I don’t want to use sampling (with
 replacement, I can sample more data than I have without
 replacement), as this will generate lots of duplicate data
 points, if I want to generated bigger dataset yet my raw
 data do not have a big sample size. The scatter plot of the
 sampled data doesn’t look good this way.
  Heyi
  
  
  --- On Fri, 3/23/12, David Winsemius dwinsem...@comcast.net
 wrote:
  
  From: David Winsemius dwinsem...@comcast.net
  Subject: Re: [R] Nonparametric bivariate
 distribution estimation and sampling
  To: heyi xiao xiaohey...@yahoo.com
  Cc: Sarah Goslee sarah.gos...@gmail.com,
 r-help@r-project.org
  Date: Friday, March 23, 2012, 2:20 PM
  
  On Mar 23, 2012, at 1:53 PM, heyi xiao wrote:
  
  Sarah,
  Thanks for the response. I actually have
 several years
  of working experience with R and statistics,
 although may
  not be as good as you. that’s why I am here ;) I
 dug
  deeper into R documentations and previous R-help
 posts, and
  couldn’t found anything particular help. Again, I
 want to
  do two things: (1) estimate the probability density
 of this
  bivariate distribution using some nonparametric
 method
  (kernel, spline etc);
  
  ?MASS::kde2d
  ?KernSmooth::bkde2D
  ?ade4::s.kde2d
  help(package=locfit)
  
  (2) sample a big dataset from this bivariate
  distribution for a simulation study.
  
  What is wrong with `sample`?
  
  # to get sample of size n without replacement
  set.seed(42)
  dfrm[ sample(1:NROW(dfrm), n) , ]
  
  --David.
  If my questions are not clear enough show my
 how I can
  improve, or which part is not clear enough. If you
 have any
  particular suggestions/comments, you are more than
 welcome.
  Thanks!
  Heyi
  
  
  --- On Fri, 3/23/12, Sarah Goslee sarah.gos...@gmail.com
  wrote:
  
  From: Sarah Goslee sarah.gos...@gmail.com
  Subject: Re: [R] Nonparametric bivariate
  distribution estimation and sampling
  To: heyi xiao xiaohey...@yahoo.com
  Cc: r-help@r-project.org
  Date: Friday, March 23, 2012, 12:26 PM
  R can do all of that and more.
  
  But you'll need to put some work in reading
 about
  how to use
  R, about
  the statistical methods involved, and about
 how to
  use them
  to best
  effect. You might want, for instance,
 generalized
  additive
  models. Or
  not. If your question isn't more
 fully-formed than
  this,
  your best bet
  is almost certainly to talk to a local
  statistician, spend
  some time
  working with R, and then come back to the
 list
  with
  specific
  questions.
  
  Sarah
  
  On Fri, 

Re: [R] Weird POSIXct behaviour

2012-03-25 Thread Joshua Ulrich
On Sun, Mar 25, 2012 at 3:15 AM, Worik R wor...@gmail.com wrote:
 Friends

 I have an xts that I wish to access.

 Browse[2] DATA.ba[[p]][2012-03-20 00:59:57,bid]
                       bid
 2012-03-20 00:59:57 1.4993

 So far so good.

 Now putting the index into a variable:

 Browse[2] Time
 [1] 2012-03-20 00:59:57 NZDT
 Browse[2] DATA.ba[[p]][Time, bid]
     bid

 Where has it gone?

It's hard to say, especially since you give no indication how you
assigned the value to Time.  A reproducible example, as requested in
the posting guide, would be helpful.  Also, what version of R, xts,
and zoo are you using?


 Looking closer


 Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid])
 [1] 2012-03-20 00:59:57 NZDT
 Browse[2]

 Browse[2] Time
 [1] 2012-03-20 00:59:57 NZDT
 Browse[2]

 Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid]) == Time
 [1] FALSE
 Browse[2]

 Browse[2] class(Time)
 [1] POSIXct POSIXt

 Browse[2] class(index(DATA.ba[[p]][2012-03-20 00:59:57,bid]))
 [1] POSIXct POSIXt

 So the variable 'Time' should be good to index DATA.ba[[p]].  I am
 hopelessly confused.

The printed representation of the index value and class cannot be used
to accurately determine equality.

 cheers
 Worik


--
Joshua Ulrich  |  FOSS Trading: www.fosstrading.com

R/Finance 2012: Applied Finance with R
www.RinFinance.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to get the input of a function right?

2012-03-25 Thread Robert Baer
Your definition of x1, x2 and x3 requires the c() function.  Try leaving 
some space around operators for increased readability.


Remember that the that the result of operations on sets containing NA is 
often set to NA as well.  My guess is that if you print out x it contains 
NA and that your function does operations on x that are included in the 
function result.


Rob

-Original Message- 
From: stella

Sent: Thursday, March 22, 2012 10:43 AM
To: r-help@r-project.org
Subject: [R] How to get the input of a function right?

Hi,

I wrote a function with three inputs fun(x,y,z).

x is a matrix of three vectors combined with cbind. e.g.

x1-(1,2,3,4)
x2-(2,3,4,5)
x3-(3,4,5,6)

x-cbind(x1,x2,x3)

y is a vector e.g
y-c(7,8,9)

z is a real number e.g.
z-2.5

If a give the function an input like this, I get 'NA' in return. If I give
the function a vector e.g c(1,2,3) instead of 'x' the function works just
fine.
Does anyone has an idea why the function would not except 'x' as an input?

Thank you very much!
Stella

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-get-the-input-of-a-function-right-tp4495879p4495879.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
800 W. Jefferson St.
Kirksville, MO 63501
660-626-2322
FAX 660-626-2965 


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ROC Analysis

2012-03-25 Thread Camille Leclerc
Hi everybody,

Pascal, your code works, but when I want to do the graph I have an error
message. 

here is my code :
x-rev(unlist(pred@cutoffs))
tpf-unlist(performance(pred, tpr)@y.values)
fpf-unlist(performance(pred,fpr)@y.values)
ll-length(x)
p-(tpf[1:(ll-1)]-tpf[2:ll])/(fpf[1:(ll-1)]-fpf[2:ll])
plot(x,p)

*Erreur dans xy.coords(x, y, xlabel, ylabel, log) : 
'x' and 'y' lengths differ*

So, when I look the lenghts of x and p, I have this :
*x : numeric[1735]
p : numeric[1734]*

On the other hand, it's normal since I have the slope between two points on
the ROC curve and so I have x points and x-1 slope values. How to get the
graph?!

All the best,
Camille

-
--
Camille Leclerc, Master student
Lab ESE, UMR CNRS 8079
Univ Paris-Sud
Bat 362
F-91405  Orsay Cedex FRANCE
--
View this message in context: 
http://r.789695.n4.nabble.com/ROC-Analysis-tp4469203p4503354.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] multiple hexbin plots with varying greatest densities

2012-03-25 Thread Wayne Gray
Greetings:

I have multiple hexbin plots with varying greatest densities. 

Right now, the data on each plot varies from 1-256 levels of density. The 
problem with that is that in Plot A the data are more scattered across a 2 x 2 
grid whereas in Plot B the data are more concentrated in fewer cells. Hence, 
Plot A has a few very dense grid cells and plot B has nothing as dense as plot 
A. 

(In Plot A the legend shows that the range of densities per cell is 1-700 
whereas for Plot B it is 1-453. Both plots use the full range of available 
colors with 256 gradients.)

I want people to be able to compare the two plots and being able to see the 
differences in variability of the densities is one of the things I want them to 
see. AT PRESENT, each plot uses the full range of densities; hence, this makes 
it seem as if each plot has the maximum highest and minimum lowest densities. 

How can I do this?

Thanks,

Wayne Gray

--
View this message in context: 
http://r.789695.n4.nabble.com/multiple-hexbin-plots-with-varying-greatest-densities-tp4503405p4503405.html
Sent from the R help mailing list archive at Nabble.com.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reading big files in chunks-ff package

2012-03-25 Thread Mav
Thank you Jan

My problem is the following:
For instance, I have 2 files with different number of rows (15 million and 8
million of rows each).
I would like to read the first one in chunks of 5 million each. However
between the first and second chunk, I would like to analyze those first 5
million of rows, write the analysis in a new csv and then proceed to read
and analyze the second chunk and so on until the third chunk. With the
second file, I would like to do the same...read the first chunk, analyze it
and continue to read the second and analyze it.

Basically my problem is that I manage to read the filesbut with so many
rows...I cannot do any analyses (even filtering the rows) because of the RAM
restrictions.

Sorry if is still not clear.

Thank you

--
View this message in context: 
http://r.789695.n4.nabble.com/Reading-big-files-in-chunks-ff-package-tp4502070p4503642.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Multivariate function from univariate functions

2012-03-25 Thread Petr Savicky
On Sat, Mar 24, 2012 at 08:04:49PM -0700, physicistintheory wrote:
 I'm relatively new to R and I'm stuck.
 
 I'm trying to construct a surface to optimize from a multivariate dataset. 
 The dataset contains the response of a system to various stimuli.  I am
 trying to optimize the mix of stimuli to maximize the response.  To do so,
 I've interpolated the various datasets of type response vs. stimuli and I
 now have an array of functions interps whose length is the length of the
 array of the names of the various stimuli.  I've also created a vector
 containing names for the stimuli, vars = x.1, x.2, x.3...
 
 Anyway, each of the functions in interps depends only on one variable
 (obviously).  I would like to construct a function, call it, surface which
 is essentially: surface(vars) = interps[[1]]vars[[1]] +
 interps[[2]]vars[[2]]+...
 
 I've tried constructing surface recursively in a for loop:  
 surface - function(vars){0}
 for(i in 1:length(vars)){
 surface - function(vars){surface(vars) + interps[[i]](vars[[i]])}
 }

Hi.

Is the following close to what you are asking for?

  f1 - function(x) 3*x^2 + x
  f2 - function(x) 2*x^2 + 2*x
  f3 - function(x) x^2 + 3*x
  lstf - list(f1, f2, f3)
  args - c(x2, x4, x5)

  # input as a matrix or data frame
  surface - function(dat)
  {
  y - rep(0, times=nrow(dat))
  for (i in seq.int(along=lstf)) {
  y - y + lstf[[i]](dat[, args[i]])
  }
  y
  }

  dat - matrix(1:35, ncol=5)
  colnames(dat) - paste(x, 1:5, sep=)

  surface(dat)

  [1] 2140 2346 2564 2794 3036 3290 3556

  # compare with an explicit formula
  f1(dat[, x2]) + f2(dat[, x4]) + f3(dat[, x5])

  [1] 2140 2346 2564 2794 3036 3290 3556

  # input as a named vector
  surface1 - function(x)
  {
  y - 0
  for (i in seq.int(along=lstf)) {
  y - y + lstf[[i]](x[args[i]])
  }
  unname(y)
  }

  vdat - 1:5
  names(vdat) - paste(x, 1:5, sep=)

  surface1(vdat) 

  [1] 94

  # compare again
  unname(f1(vdat[x2]) + f2(vdat[x4]) + f3(vdat[x5]))

  [1] 94

Hope this helps.

Petr Savicky.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reading big files in chunks-ff package

2012-03-25 Thread Jan van der Laan
The 'normal' way of doing that with ff is to first convert your csv  
file completely to a
ffdf object (which stores its data on disk so shouldn't give any  
memory problems). You
can then use the chunk routine (see ?chunk) to divide your data in the  
required chunks.


Untested so may contain errors:

ffdf - read.table.ffdf(...)

chnks - chunk(from=1, to=nrow(yourffdf), by=5E6, method='seq')

for (chnk in chnks) {
  # read data
  data - ffdf[chnk, ]
  # do your thing with the data
  # clean up
  rm(data)
  gc()
}


If you want to process your csv file directly in chunks, you could  
also have a look at
the LaF package. Especially the process_blocks routine which does  
exactly that. The
manual vignette  
(http://cran.r-project.org/web/packages/LaF/vignettes/LaF-manual.pdf)

contains some examples how to do that.

Jan



Quoting Mav mastorvar...@gmail.com:


Thank you Jan

My problem is the following:
For instance, I have 2 files with different number of rows (15 million and 8
million of rows each).
I would like to read the first one in chunks of 5 million each. However
between the first and second chunk, I would like to analyze those first 5
million of rows, write the analysis in a new csv and then proceed to read
and analyze the second chunk and so on until the third chunk. With the
second file, I would like to do the same...read the first chunk, analyze it
and continue to read the second and analyze it.

Basically my problem is that I manage to read the filesbut with so many
rows...I cannot do any analyses (even filtering the rows) because of the RAM
restrictions.

Sorry if is still not clear.

Thank you

--
View this message in context:   
http://r.789695.n4.nabble.com/Reading-big-files-in-chunks-ff-package-tp4502070p4503642.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Handling 8GB .txt file in R?

2012-03-25 Thread steven mosher
As the other poster noted, you can just skip lines.

Big matrix should work just fine, except I am not sure how the dates will
be handled

Here is some sample code from my stuff
txtName   is the file name of the file you are reading
Directory   is the path where you want to write the file.backed matrix
filenameis the file.backed big matrix
dname  is a filename for describing the data
sep   What's your separator, comma or space?  below I use tab,
because my file is tab delimited

replace my column names with yours

PERMNO  DATETICKERPERMCO PRC
VOLNUMTRDvwretdewretd

Your dates may be coerced in factors. Not sure how that will work.
You can also try ff

  options(bigmemory.allow.dimnames=TRUE)
 D- read.big.matrix(txtName, skip = 5,
 backingpath = Directory,
  backingfile = filename,
 descriptorfile = dname,
 sep = \t,
 type = double,
 col.names =
 c(Id,SeriesNo,Date,Temp,Unc,Obs,Tobs)
 )



On Sat, Mar 24, 2012 at 9:20 PM, iliketurtles isaacm...@gmail.com wrote:

 Thanks to all the suggestions. To the first individual that replied, I
 can't
 do any stuff with unix or perl. All I know is R.

 @KEN:
 I'm using Windows 7, 64 bit.

 @Steve:
 Here's the readLines output.. As we can see, lines 1-3 are empty and line 5
 is empty, and there's also empty elements after line 5!.

  [1]  
  [2] 
 
  [3]  
  [4]   PERMNO  DATETICKERPERMCO PRC
 VOLNUMTRDvwretdewretd
  [5] 
  [6]106/01/19867952  .
 . . -0.000138  0.001926
  [7]107/01/1986OMFGA   7952-2.56250
 1000 .  0.013809  0.011061
  [8]108/01/1986OMFGA   7952-2.5
 12800 . -0.020744 -0.005117
  [9]109/01/1986OMFGA   7952-2.5
 1400 . -0.011219 -0.011588
  [10]110/01/1986OMFGA   7952-2.5
 8500 .  0.83  0.003651
  [11]113/01/1986OMFGA   7952-2.62500
 5450 .  0.002749  0.002433

 -
 

 Isaac
 Research Assistant
 Quantitative Finance Faculty, UTS
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Handling-8GB-txt-file-in-R-tp4500971p4502706.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Nonparmetric statistics with weighted data set?

2012-03-25 Thread Barnet Wagman
I'm doing some work with a weighted data set (the CPS) and I'd like to 
use nonparametric techniques. Is there an R package that supports this?  
The np package doesn't appear to have provisions for weighted data. (Or 
have I missed something?)  I need to perform both density estimation and 
regressions.


thanks,

bw

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] multiple hexbin plots with varying greatest densities

2012-03-25 Thread Thomas Lumley
On Mon, Mar 26, 2012 at 4:59 AM, Wayne Gray wgray@gmail.com wrote:
 Greetings:

 I have multiple hexbin plots with varying greatest densities.

 Right now, the data on each plot varies from 1-256 levels of density. The 
 problem with that is that in Plot A the data are more scattered across a 2 x 
 2 grid whereas in Plot B the data are more concentrated in fewer cells. 
 Hence, Plot A has a few very dense grid cells and plot B has nothing as dense 
 as plot A.

 (In Plot A the legend shows that the range of densities per cell is 1-700 
 whereas for Plot B it is 1-453. Both plots use the full range of available 
 colors with 256 gradients.)

 I want people to be able to compare the two plots and being able to see the 
 differences in variability of the densities is one of the things I want them 
 to see. AT PRESENT, each plot uses the full range of densities; hence, this 
 makes it seem as if each plot has the maximum highest and minimum lowest 
 densities.

 How can I do this?

With the 'lattice' or 'centroids' styles you can use the maxarea=
argument to plot.hexbin or to grid.hexagons.

survey::svycoplot() is an example of this: you can either have the
same scale across panels or have each panel use the full range of
sizes.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Nonparmetric statistics with weighted data set?

2012-03-25 Thread Thomas Lumley
On Mon, Mar 26, 2012 at 8:45 AM, Barnet Wagman b...@norbl.com wrote:
 I'm doing some work with a weighted data set (the CPS) and I'd like to use
 nonparametric techniques. Is there an R package that supports this?  The np
 package doesn't appear to have provisions for weighted data. (Or have I
 missed something?)  I need to perform both density estimation and
 regressions.

You may need to be more precise about what you mean by 'nonparametric
techniques', but if you want smoothing you can probably do what you
want with the survey package.

svysmooth() does densities and both mean and quantile scatterplot smoothing.

For regression models with smooth terms you would have to use
regression splines in svyglm(), but they are almost indistinguishable
from smoothing splines in practice.

For other sorts of non-parametric techniques there are two-sample rank
tests in the survey package.

   -thomas

-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weird POSIXct behaviour

2012-03-25 Thread Worik R
My bad.  I should be clearer about the source of my confusion.

Given two identical string representations of POSIXct objects, can the two
objects represent different times?


  Where has it gone?
 
 It's hard to say, especially since you give no indication how you
 assigned the value to Time.  A reproducible example, as requested in
 the posting guide, would be helpful.  Also, what version of R, xts,
 and zoo are you using?


Browse[2] Time
[1] 2012-03-20 00:59:57 NZDT

Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid])
[1] 2012-03-20 00:59:57 NZDT

A reproducible example would be huge at this point.  WHat I need is an
answer to that simple question.  If the answer is  No then it is worth
doing the work for a reproducible example.  If Yes I need to learn why
and better ways of pasing the objects in and out of matrices and vectors.

Using R version  2.12.1

The zoo documentation (?zoo) does not include a version.  Where can I find
it?

Worik



  Looking closer
 
 
  Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid])
  [1] 2012-03-20 00:59:57 NZDT
  Browse[2]
 
  Browse[2] Time
  [1] 2012-03-20 00:59:57 NZDT
  Browse[2]
 
  Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid]) == Time
  [1] FALSE
  Browse[2]
 
  Browse[2] class(Time)
  [1] POSIXct POSIXt
 
  Browse[2] class(index(DATA.ba[[p]][2012-03-20 00:59:57,bid]))
  [1] POSIXct POSIXt
 
  So the variable 'Time' should be good to index DATA.ba[[p]].  I am
  hopelessly confused.
 
 The printed representation of the index value and class cannot be used
 to accurately determine equality.

  cheers
  Worik
 

 --
 Joshua Ulrich  |  FOSS Trading: www.fosstrading.com

 R/Finance 2012: Applied Finance with R
 www.RinFinance.com


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Weird POSIXct behaviour

2012-03-25 Thread Joshua Ulrich
On Sun, Mar 25, 2012 at 3:01 PM, Worik R wor...@gmail.com wrote:
 My bad.  I should be clearer about the source of my confusion.

 Given two identical string representations of POSIXct objects, can the two
 objects represent different times?

Yes.  Here's an example (from my Ubuntu machine) of one way:

 (t1 - Sys.time()); (t2 - Sys.time())+0.001; t1 == t2
[1] 2012-03-25 15:13:48 CDT
[1] 2012-03-25 15:13:48 CDT
[1] FALSE
 options(digits.secs=3)
 (t1 - Sys.time()); (t2 - Sys.time())+0.001; t1 == t2
[1] 2012-03-25 15:17:36.520 CDT
[1] 2012-03-25 15:17:36.523 CDT
[1] FALSE


  Where has it gone?
 
 It's hard to say, especially since you give no indication how you
 assigned the value to Time.  A reproducible example, as requested in
 the posting guide, would be helpful.  Also, what version of R, xts,
 and zoo are you using?


 Browse[2] Time
 [1] 2012-03-20 00:59:57 NZDT

 Browse[2] index(DATA.ba[[p]][2012-03-20 00:59:57,bid])
 [1] 2012-03-20 00:59:57 NZDT

 A reproducible example would be huge at this point.  WHat I need is an
 answer to that simple question.  If the answer is  No then it is worth
 doing the work for a reproducible example.  If Yes I need to learn why
 and better ways of pasing the objects in and out of matrices and vectors.

No need for it to be huge.  dput(DATA.ba[[p]][2012-03-20
00:59:57,bid]) would be a sufficient start.

 Using R version  2.12.1

 The zoo documentation (?zoo) does not include a version.  Where can I find
 it?

From the output of sessionInfo(), or packageDescription(zoo).  The
output from sessionInfo() would be more helpful because it provides
more information about your installation.

 Worik



--
Joshua Ulrich  |  FOSS Trading: www.fosstrading.com

R/Finance 2012: Applied Finance with R
www.RinFinance.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] avoiding for loops

2012-03-25 Thread Ed Siefker
I have data that looks like this:

 df1
  group id
1   red  A
2   red  B
3   red  C
4  blue  D
5  blue  E
6  blue  F


I want a list of the groups containing vectors with the ids.I am
avoiding subset(), as it is
only recommended for interactive use.  Here's what I have so far:

df1 - data.frame(group=c(red, red, red, blue, blue,
blue), id=c(A, B, C, D, E, F))

groups - levels(df1$group)
byid - lapply(groups, ==, df1$group)
groupIDX - lapply(byid, which)

 groupIDX
[[1]]
[1] 4 5 6

[[2]]
[1] 1 2 3



This gives me a list of the indices for each group.  I want to subset
df1 based on this list.
If I want just one group I can do this:

 df1[groupIDX[[1]],]$id
[1] D E F


What sort of statement should I use if I want a result like:
[[1]]
[1] D E F
Levels: A B C D E F

[[2]]
[1] A B C
Levels: A B C D E F


So far, I've used a for loop.  Can I express this with apply statements?

groupIDs - list(1:length(groupIDX))
groupData-
for (i in 1:length(groupIDX)) {
groupIDs[[i]] - df1[groupIDX[[i]],]$id
}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] avoiding for loops

2012-03-25 Thread Richard M. Heiberger
 df1 - data.frame(group=c(red, red, red, blue, blue,
blue), id=c(A, B, C, D, E, F))
df1 - data.frame(group=c(red, red, red, blue, blue,
+ blue), id=c(A, B, C, D, E, F))
 df1
  group id
1   red  A
2   red  B
3   red  C
4  blue  D
5  blue  E
6  blue  F
 with(df1, split(id, group))
$blue
[1] D E F
Levels: A B C D E F

$red
[1] A B C
Levels: A B C D E F




On Sun, Mar 25, 2012 at 4:46 PM, Ed Siefker ebs15...@gmail.com wrote:

 I have data that looks like this:

  df1
  group id
 1   red  A
 2   red  B
 3   red  C
 4  blue  D
 5  blue  E
 6  blue  F


 I want a list of the groups containing vectors with the ids.I am
 avoiding subset(), as it is
 only recommended for interactive use.  Here's what I have so far:

 df1 - data.frame(group=c(red, red, red, blue, blue,
 blue), id=c(A, B, C, D, E, F))

 groups - levels(df1$group)
 byid - lapply(groups, ==, df1$group)
 groupIDX - lapply(byid, which)

  groupIDX
 [[1]]
 [1] 4 5 6

 [[2]]
 [1] 1 2 3



 This gives me a list of the indices for each group.  I want to subset
 df1 based on this list.
 If I want just one group I can do this:

  df1[groupIDX[[1]],]$id
 [1] D E F


 What sort of statement should I use if I want a result like:
 [[1]]
 [1] D E F
 Levels: A B C D E F

 [[2]]
 [1] A B C
 Levels: A B C D E F


 So far, I've used a for loop.  Can I express this with apply statements?

 groupIDs - list(1:length(groupIDX))
 groupData-
 for (i in 1:length(groupIDX)) {
groupIDs[[i]] - df1[groupIDX[[i]],]$id
}

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.htmlhttp://www.r-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] limma design matrix

2012-03-25 Thread statfan
I am trying to construct my design matrix needed in the {limma} function
lmfit but am having trouble with the formula I am to specify in the
function model.matrix.  Namely when to I use ~0 + factors (ex 1) vs ~-1 +
factors (ex 2).  Any clarification on this would be greatly appreciated. 
Thanks in advanced.  

Ex 1:
  f - factor(targets$Target, levels=c(RNA1,RNA2,RNA3))
design - model.matrix(~0+f)
[page 409, of Bioinformatics and computational biology solutions using R and
Bioconductor]

and 

Ex 2:
library(ALL)
pdat - pData(ALL)
design - model.matrix(~-1 + factor(pdat$type) 
[page 237, of Bioinformatics and computational biology solutions using R and
Bioconductor]



--
View this message in context: 
http://r.789695.n4.nabble.com/limma-design-matrix-tp4503651p4503651.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] 'names' attribute must be the same length as the vector

2012-03-25 Thread jiefan
I have met into this problem when I tried to run panel regression by plm.
My code:

library(plm)
indus - read.csv(file=full.csv,header=TRUE)
industry-as.data.frame(indus)
reg-lm(LnTSO2 ~ LnPGDP + LnPGDP2 + LnSOES + LnCOES + LnLIMD + 
LnSHOLD + LnPRIV + LnFIEs + LnEXP + LnIMP + LnLEXRE + LnVALTAX + 
LnIND1 + LnIND2 + LnIND3 + LnIND4 + LnIND5 + LnIND6 + LnIND7 + 
LnIND8 + LnIND9, data=industry, model = pooling)
*Error in names(y) - namesy : 
  'names' attribute [1144] must be the same length as the vector [0]*

My data is attached showing as a screenshot.
http://r.789695.n4.nabble.com/file/n4503946/%E6%9C%AA%E5%91%BD%E5%90%8D.jpg 

Does anyone know why this is happening and how to fix?\
Thanks very much.



--
View this message in context: 
http://r.789695.n4.nabble.com/names-attribute-must-be-the-same-length-as-the-vector-tp4503946p4503946.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R numerical integration

2012-03-25 Thread casperyc
The quadinf command in library  pracma still fails when mu=-2.986731 with
sigma=53415.18.
While Maple gives me an estimate of 0.5001701024.

Maple: (for those who are interested)
myf:=(mu,sigma)-
evalf(Int(exp(-(x-mu)^2/2/sigma^2)/sigma/sqrt(2*Pi)/(1+exp(-x)),
x=-infinity..infinity));
myf(-2.986731, 53415.18 );
0.5001701024


These 'mu's and 'sigma's are now random starting points I generated for an
optimization problem like I have mentioned.

I should really investigate the behavior of this function before I ask R
doing the integration. As I have mentioned that I had already realized the
integral is between 0 and 1. And I have had a look at the contour plots of
different mu and sigma. I am going to 'restrict' mu and sigma to certain
(small) values, and still get the integral to produce a value between 0 and
1.

All of your help is much appreciated.

casper

-
###
PhD candidate in Statistics
School of Mathematics, Statistics and Actuarial Science, University of Kent
###

--
View this message in context: 
http://r.789695.n4.nabble.com/R-numerical-integration-tp4500095p4503766.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R Error : DATA to MATRIX

2012-03-25 Thread ritwik_r
Thanks David, your suggestion works fine.btw I have another
question..If I set (n,m) little bit large, say (n=20,m=10), R cannot
handle the large data frame generated through expand.grid.Is there
any way to increase R-memory so that I can tackle large data.frame in R
?

regards
ritwik


 On Mar 23, 2012, at 2:53 AM, ritwi...@isical.ac.in wrote:

 Dear Sir/Madam,

 I'm getting a problem with a R-code which converts a data frame to a
 matrix.

 It first generate a (m^(n-m) * m) matrix A and then regenerate another
 matrix B having less dimension than A which satisfy some condition.
 Now I
 wish to assign each row of B to a vector as individual.

 My problem is when I set any choice of (n,m) except m=1 it works
 fine but
 setting m=1 I got the error : Error in B[i, ] : incorrect number of
 dimensions.

 Moreover if (n,m) is large (say, (20,8)) I got the error : Error:
 cannot
 allocate vector of size 3.0 Gb. I know this is due to large
 dimension of
 matrix A. How to solve this problem.

 My code is given below:

 **

 n=5
 m=3
 R=numeric(0)
 # Generate all possible m-tuple ( variables having range 0 to n  )
 in a (
 m^(n-m) * m ) matrix

 r = expand.grid(rep(list(0:(n-m)), m))

 write.table(r,file=test.txt,row.names=FALSE,col.names=FALSE)

 a= read.table(file=test.txt,sep=,header=FALSE)

 A= data.matrix(a)

 #.

 # Generate matrix whose rowsum = n-m

 meet.crit = apply(A, 1, function(.row) any((sum(.row)) == n-m))  #
 criteron for being rowsum = n

 cbind(A, meet.crit)  #
 Checking rowsum = n for each row
 -m
 B=A[meet.crit,]

 At this point the default behavior of the [ function is to return a
 vector rather than a matrix. You need to add drop=FALSE as an
 additional argument. Read the help page for ?[.

   #
 Generate matrix

 #.


 for(i in 1:choose(n-1,m-1)){
 R=B[i,]
 }

 ***

 Can you please help me how to get rid of these errors. Thanking you in
 advance.

 Regards

 Ritwik Bhattacharya


 Senior Research Fellow
 SQC  OR UNIT, KOLKATA
 INDIAN STATISTICAL INSTITUTE

 --

 David Winsemius, MD
 West Hartford, CT

 This mail is scanned by Ironport




This mail is scanned by Ironport

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] string substitution for argument in function

2012-03-25 Thread Pedro Martinez
hello,
I want to iterate through a list of names and use each element as an
argument in a function. For instance:

 a = c('one','two','three')
 data= c()
 for(elem in a){data=cbind(elem = 2,data)}
 data
 elem elem elem
[1,]222

instead I want 'elem' to be substituted by the string in the list. Doing
it by hand would be:
 data = cbind('one'=2,data)
 data = cbind('two'=2,data)
 data = cbind('three'=2,data)
 data
 'one' 'two' 'three'
[1,]222

I guess that the clue would be in sub(),gsub(), paste() or similar but I
didnt get it to work.

I am comming from python were we woudl do something like:
 a = ['one','two','three']
 data = {}

 for elem in a:
   data[elem] = 2
 
 data
{'three': 2, 'two': 2, 'one': 2}

Thanks, Pedro


-
Prof. Dr. P. Martinez Arbizu
DZMB-Forschungsinstitut Senckenberg

Suedstrand 44
D-26382 Wilhelmshaven
Germany

Tel: +49 (0)4421 9475-100
Fax: +49 (0)4421 9475-111

Email: pmarti...@senckenberg.de

Senckenberg Gesellschaft für Naturforschung
Rechtsfähiger Verein gemäß § 22 BGB
Senckenberganlage 25
60325 Frankfurt
Direktorium: Prof. Dr. Dr. h.c. Volker Mosbrugger, Prof. Dr. Michael
Türkay, Dr. Johannes Heilmann, Prof. Dr. Pedro Martinez Arbizu, Prof.
Dr. Georg Zizka, Prof. Dr. Uwe Fritz
Vorsitzender des Präsidiums: Dietmar Schmid
Aufsichtsbehörde: Magistrat der Stadt Frankfurt am Main (Ordnungsamt)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Accessing more than two coefficients in a plot

2012-03-25 Thread FJ M

I've successfully plotted (in the plot and abline code below) a simple 
regression of Lambda1_2 on VV1_2. I then successfully regressed Lambda1_2 on 
VV1_2, VV1_22 and VV1_212 producing lm2.l. When I go to plot lm2.l using abline 
I get the warning:

1: In abline(lm2.l, col = brown, lty = dotted, lwd = 2) : only using the 
first two of 4 regression coefficients

Is there another function like abline that will produce a line using the 
constant and three coefficients from the lm2.l regression?


lm.l - lm(Lambda1_2 ~ VV1_2, method = qr, model = TRUE, x = FALSE, y = 
FALSE, qr = TRUE) # unweighted regression

lm2.l - lm(Lambda1_2 ~ VV1_2 + VV1_22 + VV1_212, method = qr, model = TRUE, 
x = FALSE, y = FALSE, qr = TRUE) # unweighted regression

plot(VV1_2, Lambda1_2, ylim=yrange, tck=1, main=V(1) Parameters (V, V^2  
V^0.5), xlab=VV1_2, ylab=Lambda  Beta1_2,pch=19,col=red) 
{abline(lm2.l, col=brown, lty=dotted, lwd=2)
abline(wlm2.l, col=gold,lty=longdash, lwd=2)
points(VV1_2, Beta1_2, pch=19, col=blue)
abline(lm2.b, col=black,lty=dotted, lwd=2)
abline(wlm2.b, col=blue, lty=longdash, lwd=2)
legend(topright, inset=.05, title=Parameters,
labels, lwd=2, lty=c(1, 1, 1, 1, 2), col=colors)
} 
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Format wanted...

2012-03-25 Thread Duncan Murdoch

On 12-03-25 10:45 AM, Marc Schwartz wrote:


On Mar 25, 2012, at 7:14 AM, Duncan Murdoch wrote:


On 12-03-24 10:47 PM, J Toll wrote:

On Sat, Mar 24, 2012 at 7:30 PM, Duncan Murdoch
murdoch.dun...@gmail.com   wrote:

Do we have a format that always includes a decimal point and a given number
of significant digits, but otherwise drops unnecessary characters?  For
example, if I wanted 5 digits, I'd want the following:

Round to 5 digits:
1.234567  -   1.2346

Drop unnecessary zeros:
1.23  -   1.23

Force inclusion of a decimal point:
1 -   1.



Duncan,

Maybe sprintf() will work for you.  As it's a wrapper for C sprintf,
it should have its functionality.


Maybe, but with which format string?

Duncan Murdoch



I don't believe (though could be wrong), that you can do it all with one format string, 
but can do it conditionally based upon the input. According to the C printf 
documentation, the use of # forces a decimal point to be present, even if 
there are no trailing digits. Thus:


sprintf(%#.f, 1)

[1] 1.

The other two values seem to be handled by signif() when applied to each value 
individually:


signif(1.234567, 5)

[1] 1.2346


signif(1.23, 5)

[1] 1.23

But, not when a vector:


signif(c(1.234567, 1.23), 5)

[1] 1.2346 1.2300


So, wrapping that inside a function, using ifelse() to test for an integer 
value:

signif.d- function(x, digits)
{
   ifelse(x == round(x),
  sprintf(%.#f, x),
  signif(x, digits))
}


x- c(1.234567, 1.23, 1)


signif.d(x, 5)

[1] 1.2346 1.23   1.


signif.d(x, 6)

[1] 1.23457 1.231.


signif.d(x, 7)

[1] 1.234567 1.23 1.


Not extensively tested of course, but hopefully that might work for your needs 
Duncan.


Thanks.  I had put together a different conditional (just do the 
conversion, then add a decimal point at the end if none is seen), but I 
was surprised that there was no standard format for this.


In case anyone is interested, I want to output code in a language (GLSL) 
that sees 1 and 1. as different types.  I want a floating point value, 
so I need the decimal point.


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Accessing more than two coefficients in a plot

2012-03-25 Thread Bert Gunter
Well, as a line in the plane is determined by 2 coefficients only, I'd
guess that trying to find an R function that plots a line defined by 4
coefficients has about the same chance of success as finding a unicorn
with 3 horns.

You do understand that your linear model defines a hyperplane in your
three covariates, do you not? Or do I misunderstand what you have
requested?

Cheers,
Bert

On Sun, Mar 25, 2012 at 2:32 PM, FJ M chicagobrownb...@hotmail.com wrote:

 I've successfully plotted (in the plot and abline code below) a simple 
 regression of Lambda1_2 on VV1_2. I then successfully regressed Lambda1_2 on 
 VV1_2, VV1_22 and VV1_212 producing lm2.l. When I go to plot lm2.l using 
 abline I get the warning:

 1: In abline(lm2.l, col = brown, lty = dotted, lwd = 2) : only using the 
 first two of 4 regression coefficients

 Is there another function like abline that will produce a line using the 
 constant and three coefficients from the lm2.l regression?


 lm.l - lm(Lambda1_2 ~ VV1_2, method = qr, model = TRUE, x = FALSE, y = 
 FALSE, qr = TRUE) # unweighted regression

 lm2.l - lm(Lambda1_2 ~ VV1_2 + VV1_22 + VV1_212, method = qr, model = 
 TRUE, x = FALSE, y = FALSE, qr = TRUE) # unweighted regression

 plot(VV1_2, Lambda1_2, ylim=yrange, tck=1, main=V(1) Parameters (V, V^2  
 V^0.5), xlab=VV1_2, ylab=Lambda  Beta1_2,pch=19,col=red)
 {abline(lm2.l, col=brown, lty=dotted, lwd=2)
 abline(wlm2.l, col=gold,lty=longdash, lwd=2)
 points(VV1_2, Beta1_2, pch=19, col=blue)
 abline(lm2.b, col=black,lty=dotted, lwd=2)
 abline(wlm2.b, col=blue, lty=longdash, lwd=2)
 legend(topright, inset=.05, title=Parameters,
 labels, lwd=2, lty=c(1, 1, 1, 1, 2), col=colors)
 }
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Format wanted...

2012-03-25 Thread Hasan Diwan
Duncan,

On 25 March 2012 15:28, Duncan Murdoch murdoch.dun...@gmail.com wrote:
 In case anyone is interested, I want to output code in a language (GLSL)
 that sees 1 and 1. as different types.  I want a floating point value, so I
 need the decimal point.

GLSL, assuming it's the one that I'm looking at[1], supports implicit
conversion from integer to float by appending .0 to the end.

-- 
Sent from my mobile device
Envoyait de mon portable
1. http://www.opengl.org/wiki/GLSL_Types#Implicit_conversion

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] svycoxph and test statistics

2012-03-25 Thread Terry Therneau

On 03/24/2012 06:00 AM, r-help-requ...@r-project.org wrote:

I have been using the function 'svycoxph' in the Dr. Lumley's survey package 
(version 3.26) to compute coefficient estimates for Cox regression.

I have noticed the p-values output are based on normal distribution (like in 
coxph); however in svyglm (and in other software, such as Stata or SAS) the 
p-values are computed via the t distribution with degrees of freedom equal to 
the number of PSUs minus number of strata.

I am wondering why there is a difference here?
I'm not aware of any theory papers that back up the use of a 
t-distribution.  This is a Cox model, and do what my Gaussian package 
does is not usually the best approach.  I'm far from an expert in 
survey work though, so I'll yeild to Thomas L for a definitive answer.
  In the case of mixed effects models I see the exact same leaning 
towards (approximate) REML vs ML; this is an area that I do know deeply 
and and the REML better than ML arguments from linear mixed effects 
models to NOT transfer over.


Terry Therneau

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] string substitution for argument in function

2012-03-25 Thread Weidong Gu
Hi,

This may help

 a = c('one','two','three')
data.frame(eval(substitute(rbind(var,2),list(var=a

?substitute
?eval

Weidong Gu


On Sun, Mar 25, 2012 at 5:22 PM, Pedro Martinez
pedro.marti...@senckenberg.de wrote:
 hello,
 I want to iterate through a list of names and use each element as an
 argument in a function. For instance:

 a = c('one','two','three')
 data= c()
 for(elem in a){data=cbind(elem = 2,data)}
 data
     elem elem elem
 [1,]    2    2    2

 instead I want 'elem' to be substituted by the string in the list. Doing
 it by hand would be:
 data = cbind('one'=2,data)
 data = cbind('two'=2,data)
 data = cbind('three'=2,data)
 data
     'one' 'two' 'three'
 [1,]    2    2    2

 I guess that the clue would be in sub(),gsub(), paste() or similar but I
 didnt get it to work.

 I am comming from python were we woudl do something like:
 a = ['one','two','three']
 data = {}

 for elem in a:
       data[elem] = 2

 data
 {'three': 2, 'two': 2, 'one': 2}

 Thanks, Pedro


 -
 Prof. Dr. P. Martinez Arbizu
 DZMB-Forschungsinstitut Senckenberg

 Suedstrand 44
 D-26382 Wilhelmshaven
 Germany

 Tel: +49 (0)4421 9475-100
 Fax: +49 (0)4421 9475-111

 Email: pmarti...@senckenberg.de

 Senckenberg Gesellschaft für Naturforschung
 Rechtsfähiger Verein gemäß § 22 BGB
 Senckenberganlage 25
 60325 Frankfurt
 Direktorium: Prof. Dr. Dr. h.c. Volker Mosbrugger, Prof. Dr. Michael
 Türkay, Dr. Johannes Heilmann, Prof. Dr. Pedro Martinez Arbizu, Prof.
 Dr. Georg Zizka, Prof. Dr. Uwe Fritz
 Vorsitzender des Präsidiums: Dietmar Schmid
 Aufsichtsbehörde: Magistrat der Stadt Frankfurt am Main (Ordnungsamt)

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] string substitution for argument in function

2012-03-25 Thread Henrique Dallazuanna
try this:

`names-`(rep(2, 3), a)


On Sun, Mar 25, 2012 at 6:22 PM, Pedro Martinez
pedro.marti...@senckenberg.de wrote:

 hello,
 I want to iterate through a list of names and use each element as an
 argument in a function. For instance:

  a = c('one','two','three')
  data= c()
  for(elem in a){data=cbind(elem = 2,data)}
  data
     elem elem elem
 [1,]    2    2    2

 instead I want 'elem' to be substituted by the string in the list. Doing
 it by hand would be:
  data = cbind('one'=2,data)
  data = cbind('two'=2,data)
  data = cbind('three'=2,data)
  data
     'one' 'two' 'three'
 [1,]    2    2    2

 I guess that the clue would be in sub(),gsub(), paste() or similar but I
 didnt get it to work.

 I am comming from python were we woudl do something like:
  a = ['one','two','three']
  data = {}

  for elem in a:
        data[elem] = 2

  data
 {'three': 2, 'two': 2, 'one': 2}

 Thanks, Pedro


 -
 Prof. Dr. P. Martinez Arbizu
 DZMB-Forschungsinstitut Senckenberg

 Suedstrand 44
 D-26382 Wilhelmshaven
 Germany

 Tel: +49 (0)4421 9475-100
 Fax: +49 (0)4421 9475-111

 Email: pmarti...@senckenberg.de

 Senckenberg Gesellschaft für Naturforschung
 Rechtsfähiger Verein gemäß § 22 BGB
 Senckenberganlage 25
 60325 Frankfurt
 Direktorium: Prof. Dr. Dr. h.c. Volker Mosbrugger, Prof. Dr. Michael
 Türkay, Dr. Johannes Heilmann, Prof. Dr. Pedro Martinez Arbizu, Prof.
 Dr. Georg Zizka, Prof. Dr. Uwe Fritz
 Vorsitzender des Präsidiums: Dietmar Schmid
 Aufsichtsbehörde: Magistrat der Stadt Frankfurt am Main (Ordnungsamt)

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2012-03-25 Thread Rolf Turner

On 26/03/12 00:18, Anjana Thampi wrote:

How do you decompose inequality in R, say by gender?



This has to be one of the most meaningless and ill-expressed
questions I've ever seen on this list.  And that's a high hurdle
to clear.

cheers,

Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Seeming failure of options(width=60)

2012-03-25 Thread John C Nash
The following example gives output with a line length of 103 on my system. It 
is causing a
nuisance in creating a vignette. Is there something other than e.g., 
options(width=60) I
need to set? The Sweave FAQ suggests this should work.

options(width=60)
pastured - data.frame(
time=c(9, 14, 21, 28, 42, 57, 63, 70, 79),
yield= c(8.93, 10.8, 18.59, 22.33, 39.35,
 56.11, 61.73, 64.62, 67.08))
regmod-yield ~ t1 - t2*exp(-exp(t3+t4*log(time)))
huetstart-c(t1=70, t2=60, t3=0, t4=1)
anlsx-try(nls(regmod, start=huetstart, trace=TRUE, data=pastured))
print(anlsx)


John Nash

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Struggling with zoo and aggregate

2012-03-25 Thread Thomas Adams
Gabor,

Thank you for your help -- it did help me a lot. However, with my data:

   lead_time cycler_squared  fcst_date
1  6 0 5.405095e-02 07/31/2010
2 12 0 5.521620e-06 07/31/2010
3 18 0 1.565910e-04 07/31/2010
4 24 0 8.646822e-02 07/31/2010
5 30 0 1.719604e-02 07/31/2010
6 36 0 5.768113e-04 07/31/2010
7 42 0 2.501269e-06 07/31/2010
8 48 0 6.451727e-02 07/31/2010
9  612 2.857931e-01 07/31/2010
101212 1.138635e-01 07/31/2010
111812 2.225503e-02 07/31/2010
122412 1.182031e-03 07/31/2010
133012 8.841142e-04 07/31/2010
143612 1.082490e-01 07/31/2010
154212 1.502887e-05 07/31/2010
17 6 0 8.689588e-02 08/01/2010
1812 0 5.884336e-04 08/01/2010
1918 0 2.219316e-07 08/01/2010
2024 0 3.960752e-02 08/01/2010
2130 0 1.087413e-04 08/01/2010
2342 0 3.583030e-07 08/01/2010
2448 0 2.907109e-05 08/01/2010
25 612 8.693451e-02 08/01/2010
261212 3.208215e-02 08/01/2010
271812 0.00e+00 08/01/2010
28 6 0 2.962669e-02 08/02/2010
29 612 2.363506e-05 08/02/2010
301212 9.050178e-03 08/02/2010

from:

 z - read.zoo(q,index = 4, FUN = as.yearmon, format =
%m/%d/%Y,aggregate = mean)

I get:
 z
 lead_timecycle  r_squared
Jul 2010  25.6 5.60 0.05034771
Aug 2010  18.46154 4.615385 0.02191903

what I need is to NOT have the lead_time and cycle averaged, but only have
the r_squared values averaged by lead_time and cycle. I can not seem to
figure out the correct syntax to do this. I assume I use something like:

q_agg-aggregate(q,by=list(q$lead_time,q$cycle),index = 4, FUN =
as.yearmon, format = %m/%d/%Y)

but I get errors or nonsense when I follow with...

z - read.zoo(q_agg,index = 4, FUN = as.yearmon, format =
%m/%d/%Y,aggregate = mean)

or some variation of this.

Regards,
Tom


On Sat, Mar 24, 2012 at 10:58 PM, Gabor Grothendieck 
ggrothendi...@gmail.com wrote:

 On Sat, Mar 24, 2012 at 10:44 PM, Thomas Adams thomas.ad...@noaa.gov
 wrote:
  All:
 
  I have a SQlite database where I have stored some verification data by
 date
   time (cycle Z/UTC), lead_time as well as type, duration, etc. I would
  like to analyze  plot the data as monthly averages. I have looked at a
  bunch of examples which use some combination of zoo and aggregate, but I
  have not been able to successfully apply bits and pieces from the
 examples
  I have found. Any help is appreciated. BTW, I calculate mae (mean
 absolute
  error), mse (mean squared error), me (mean error), and other measures
  obtained by using the R verification package.
 
  The example below is limited to 20 records and shows lead_time,
 r_squared,
  (forecast) cycle, fcst_date (forecast date) -- the full data set is just
  over 2 years of daily data with 3 forecast cycles (00Z, 12Z, and 18Z)
 daily.
 
  From my query, below) how do I construct an appropriate data structure
 to
  analyze  plot the data as monthly averages?
 
  Regards,
  Tom
 
  q-dbGetQuery(con,select lead_time,r_squared,cycle,fcst_date from
  verify_table where duration=6 limit 20)
  q
lead_timer_squared cycle  fcst_date
  1  6 5.405095e-0200 07/31/2010
  2 12 5.521620e-0600 07/31/2010
  3 18 1.565910e-0400 07/31/2010
  4 24 8.646822e-0200 07/31/2010
  5 30 1.719604e-0200 07/31/2010
  6 36 5.768113e-0400 07/31/2010
  7 42 2.501269e-0600 07/31/2010
  8 48 6.451727e-0200 07/31/2010
  9  6 2.857931e-0112 07/31/2010
  1012 1.138635e-0112 07/31/2010
  1118 2.225503e-0212 07/31/2010
  1224 1.182031e-0312 07/31/2010
  1330 8.841142e-0412 07/31/2010
  1436 1.082490e-0112 07/31/2010
  1542 1.502887e-0512 07/31/2010
  1648   NA12 07/31/2010
  17 6 8.689588e-0200 08/01/2010
  1812 5.884336e-0400 08/01/2010
  1918 2.219316e-0700 08/01/2010
  2024 3.960752e-0200 08/01/2010
 

 Try this:

 Lines - lead_timer_squared cycle  fcst_date
 1  6 5.405095e-0200 07/31/2010
 2 12 5.521620e-0600 07/31/2010
 3 18 1.565910e-0400 07/31/2010
 4 24 8.646822e-0200 07/31/2010
 5 30 1.719604e-0200 07/31/2010
 6 36 5.768113e-0400 07/31/2010
 7 42 2.501269e-0600 07/31/2010
 8 48 6.451727e-0200 07/31/2010
 9  6 2.857931e-0112 07/31/2010
 1012 1.138635e-0112 07/31/2010
 1118 2.225503e-0212 07/31/2010
 1224 1.182031e-0312 07/31/2010
 1330 8.841142e-0412 07/31/2010
 1436 1.082490e-0112 07/31/2010

 library(zoo)
 q - read.table(text = Lines)

 z - read.zoo(q, index = 4, FUN = 

Re: [R] Seeming failure of options(width=60)

2012-03-25 Thread Yihui Xie
I got an error message:

 options(width=60)
 pastured - data.frame(
+ time=c(9, 14, 21, 28, 42, 57, 63, 70, 79),
+ yield= c(8.93, 10.8, 18.59, 22.33, 39.35,
+ 56.11, 61.73, 64.62, 67.08))
 regmod-yield ~ t1 - t2*exp(-exp(t3+t4*log(time)))
 huetstart-c(t1=70, t2=60, t3=0, t4=1)
 anlsx-try(nls(regmod, start=huetstart, trace=TRUE, data=pastured))
13386.91 :  70 60  0  1
Error in nls(regmod, start = huetstart, trace = TRUE, data = pastured) :
  singular gradient
 print(anlsx)
[1] Error in nls(regmod, start = huetstart, trace = TRUE, data =
pastured) : \n  singular gradient\n
attr(,class)
[1] try-error
attr(,condition)
simpleError in nls(regmod, start = huetstart, trace = TRUE, data =
pastured): singular gradient


and I see the number of characters of the message string is 94:

 nchar(anlsx)
[1] 94

options(width) is not supposed to _wrap_ your character scalars; it
cannot break a single string into several lines (use strwrap() instead
in that case). See ?options for the exact meaning of the width option.

# this may help you understand 'width'
options(width = 50)
rep(1, 100)

options(width = 90)
rep(1, 100)


Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA



On Sun, Mar 25, 2012 at 9:17 PM, John C Nash nas...@uottawa.ca wrote:
 The following example gives output with a line length of 103 on my system. It 
 is causing a
 nuisance in creating a vignette. Is there something other than e.g., 
 options(width=60) I
 need to set? The Sweave FAQ suggests this should work.

 options(width=60)
 pastured - data.frame(
 time=c(9, 14, 21, 28, 42, 57, 63, 70, 79),
 yield= c(8.93, 10.8, 18.59, 22.33, 39.35,
         56.11, 61.73, 64.62, 67.08))
 regmod-yield ~ t1 - t2*exp(-exp(t3+t4*log(time)))
 huetstart-c(t1=70, t2=60, t3=0, t4=1)
 anlsx-try(nls(regmod, start=huetstart, trace=TRUE, data=pastured))
 print(anlsx)


 John Nash

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] trellis plot

2012-03-25 Thread Sebastián Daza
Hi everyone,
I am just trying to figure out how to do a xyplot where in addition to
 dots and lines I can change dots' colors according to an individual
variable (e.g., marital disruption across time, a dummy 0/1). When I
use groups specification (see below), I get two different lines for
each individual based on groups, and what I want is to get one line
connecting dots, and different dots' colors according to marital
disruption.

Any ideas about how to do that?

person  - rep(1:2, each=4)
income  - c(100, 120, 150, 200, 90, 100,120, 150)
disruption  - rep(c(0,1), 4)
time  - rep(c(1:4),2)
dat  - as.data.frame(cbind(person,time, income, disruption))


library(lattice)
xyplot(income~time|as.factor(person),data=dat,
   type=c(p,g,o), col.line=black,
   xlab=Time,
   ylab=Familiar Income)

# I just want to change dots' colors according to disruption, not to
get two different lines:

xyplot(income~time|as.factor(person),data=dat,
   type=c(p,g,o), col.line=black, groups=disruption,
   xlab=Time,
   ylab=Familiar Income)



Thank you in advance!

-- 
Sebastián Daza

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.