date:20070201

[R] Re : Combining two datasets

2007-02-01 Thread justin bem


data3 <- sort(c(data1, data2))

See ?cbind, ?rbind, ?append and ?merge for combining data.

 
Justin BEM
Student Engineer, Statistics and Economics
BP 294 Yaoundé.
Tél (00237)9597295.








__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Sleep Function

2007-02-01 Thread Shubha Vishwanath Karanth

Hi R,

 

I am fetching Bloomberg data from R. The problem is that I get a
download error the first time the code is pasted in, but the same code
run again downloads the data, so I am confident the R code itself is not
at fault. I think the download fails because of some system capacity
limit. Could I use R's sleep function here?

 

Thanks,

Shubha
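
A minimal sketch of how Sys.sleep() could be used for this, assuming the failure is transient and that retrying after a pause is acceptable. The helper name fetch_with_retry and its arguments are hypothetical, and fetch stands in for whatever function performs the actual Bloomberg request.

```r
# Hypothetical helper: call `fetch` until it succeeds, pausing between tries.
fetch_with_retry <- function(fetch, max_tries = 5, wait_seconds = 2) {
  for (attempt in seq_len(max_tries)) {
    # Capture an error as a value instead of aborting
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    # Pause before the next attempt (no pause after the last one)
    if (attempt < max_tries) {
      Sys.sleep(wait_seconds)
    }
  }
  stop("download failed after ", max_tries, " attempts: ",
       conditionMessage(result))
}
```

It would be called as fetch_with_retry(function() my_download()), where my_download is a placeholder for the real download call.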



[R] Calling C code from R

2007-02-01 Thread Deb Midya

Hi!
   
  Thanks in advance.
   
  I am using R-2.4.0 on Windows XP. I am trying to create a DLL file.

  My C code:

  /* useC1.c */
  void useC(int *i) {
      i[6] = 100;
  }

  I have tried to create useC1.dll:

  C:\R-2.4.0\bin> R CMD SHLIB useC1.c

  'perl' is not recognized as an internal or external command, operable program
  or batch file.

  Then I tried:

  C:\R-2.4.0\bin> Rcmd SHLIB useC1.c

  'perl' is not recognized as an internal or external command, operable program
  or batch file.

  I am looking forward to your reply.
   
  Regards,
   
  Deb
   
  Statistician
  NSW Department of Commerce
  Sydney
  Australia.
   



[R] Extracting part of date variable

2007-02-01 Thread stat stat

Dear all,
   
  Suppose I have a date variable:

  c = "99/05/12"

  I want to extract the parts of this date, such as the month number, year and
  day. I can do it in SPSS. Is it possible to do this in R as well?
   
  Rgd,



[R] Estimation of discrete unimodal density

2007-02-01 Thread wvanwie

Dear All,


A method for the estimation of univariate unimodal densities (with unknown
mode) is described in Statistical Inference under Order Restrictions by
Barlow et al. Would anyone know whether there is an R implementation
(preferably with a reference) for the estimation of univariate discrete unimodal
densities (with unknown mode)? Thanks in advance for your help.

Kind regards,


Wessel van Wieringen


[R] R for bioinformatics

2007-02-01 Thread Benoit Ballester

Hi,

I was wondering if someone could tell me more about this book (whether it's
a good or bad one).
I can't find it, as it seems that O'Reilly doesn't publish it any more.

Thanks,

Ben


-- 
Benoit Ballester


Re: [R] Extracting part of date variable

2007-02-01 Thread Henrique Dallazuanna

d <- as.Date(c, format = "%y/%m/%d")
format(d, "%d")
format(d, "%m")
format(d, "%y")
format(d, "%Y")

-- 
Henrique Dallazuanna


[R] Wavelet filter using Morlet mother wavelet

2007-02-01 Thread anil kumar rohilla

Hi, List,

I am searching for an R package that can do wavelet filtering with the
Morlet mother wavelet. Does anybody have a script for this? I am new to
R wavelet analysis. Thanks in advance.

Anil Kumar
Meteorologist, LRF Section
National Climate Center, ADGM (Research)
India Meteorological Department
Shiviji Nagar, Pune-411005, India
Mobile +919422023277
[EMAIL PROTECTED]


Re: [R] Calling C code from R

2007-02-01 Thread Vladimir Eremeev


You need to install Perl and MinGW, at least.
If you have them installed, then you need to set the PATH environment
variable properly and, probably, restart your command-line session.

See chapter 5 of the manual "Writing R Extensions" (installed in
R_HOME/doc/manual)
and these two links

http://www.murdoch-sutherland.com/Rtools/
http://www.stats.uwo.ca/faculty/murdoch/software/debuggingR/

Also, it would be great to upgrade R to 2.4.1


Deb Midya wrote:
>
>   I am using R-2.4.0 on Windows XP. I am trying to create dll file.
>   My C code:
>   /* useC1.c */
>   void useC(int *i) {
>       i[6] = 100;
>   }
>
>   I have tried to create useC1.dll.
>   C:\R-2.4.0\bin> R CMD SHLIB useC1.c
>   'perl' is not recognized as an internal or external command, operable
>   program or batch file.
>
>   Then I have tried:
>   C:\R-2.4.0\bin> Rcmd SHLIB useC1.c
>   'perl' is not recognized as an internal or external command, operable
>   program or batch file.
 


[R] matrix of matrices

2007-02-01 Thread Federico Abascal

Dear all,

it is likely a stupid question but I cannot solve it.

I want to have a matrix of 100 elements.
Each element must be a vector of 500 elements.

If I do:
imp <- array(dim=100)
imp[1] <- vector(length=500)
it does not work. Warning message: number of items to replace is not a
multiple of replacement length

If I do:
imp <- array(dim=c(100,500))
and then fill imp:
for(i in c(1:500)) {
    imp[i,] <- im[1:500,]
    # im[1:500,] is a vector of length 500, of class numeric. IT CONTAINS NAMES!
}

Now it works, but I lose the labels (names) associated with the original
im variable.
If I just do:
j <- im[1:500,]
I do not lose the labels.

names(j) = list of labels
names(imp[1,]) = NULL

Any clue?

Thanks in advance!
Federico


[R] indexing

2007-02-01 Thread javier garcia-pintado

Hello,
In a nutshell, I've got a data.frame like this:

> assignation <- data.frame(value=c(6.5,7.5,8.5,12.0),class=c(1,3,5,2))
> assignation
  value class
1   6.5     1
2   7.5     3
3   8.5     5
4  12.0     2

and a long vector of classes like this:

> x <- c(1,1,2,7,6,5,4,3,2,2,2...)

I would like to obtain a vector of length = length(x), with the
corresponding values extracted from the assignation table, like this:
> x.value
 [1]  6.5  6.5 12.0   NA   NA  8.5   NA  7.5 12.0 12.0 12.0

Could you help me with an elegant way to do this?
(I can only do it by looping over each class in the assignation table,
which I think is not ideal in R's sense.)

Wishes,
Javier
-- 

Javier García-Pintado
Institute of Earth Sciences Jaume Almera (CSIC)
Lluis Sole Sabaris s/n, 08028 Barcelona
Phone: +34 934095410
Fax:   +34 934110012
e-mail:[EMAIL PROTECTED] 


Re: [R] Extracting part of date variable

2007-02-01 Thread Gabor Grothendieck

Read the help desk article in R News 4/1 about dates and note the table at the
end of it, in particular.


Re: [R] Calling C code from R

2007-02-01 Thread Peter Dalgaard

Deb Midya wrote:
> Hi!
>
>   Thanks in advance.
>
>   I am using R-2.4.0 on Windows XP. I am trying to create dll file.
>
>   My C code:
>
>   /* useC1.c */
>   void useC(int *i) {
>       i[6] = 100;
>   }
>
>   I have tried to create useC1.dll.
>
>   C:\R-2.4.0\bin> R CMD SHLIB useC1.c
>
>   'perl' is not recognized as an internal or external command, operable
>   program or batch file.
>
>   Then I have tried:
>
>   C:\R-2.4.0\bin> Rcmd SHLIB useC1.c
>
>   'perl' is not recognized as an internal or external command, operable
>   program or batch file.
>
>   I am looking forward for your reply.

   
Did you install Perl? and did you read
http://cran.r-project.org/doc/manuals/R-admin.html#The-Windows-toolset
and http://cran.r-project.org/doc/manuals/R-exts.html#Creating-R-packages?

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [R] Extracting part of date variable

2007-02-01 Thread Peter Dalgaard

stat stat wrote:
> Dear all,
>
>   Suppose I have a date variable:
>
>   c = "99/05/12"
>
>   I want to extract the parts of this date like month number, year and day. I
>   can do it in SPSS. Is it possible to do this in R as well?
>
>   Rgd,

   
Yes. One way is to use substr(), e.g.:

> substr(c,1,2)
[1] "99"
> as.numeric(substr(c,1,2))
[1] 99

This also nicely sidesteps the ambiguity issue: 1999 or 1899? May or
December? On the other hand, you'll get in trouble if leading zeros are
sometimes absent (strsplit() or gsub() if you want to pursue that route
further).

For a more principled approach, use the time and date handling tools.
Assuming that you can live with the system defaults for 2-digit years,

> strptime(c,format="%y/%m/%d")
[1] "1999-05-12"
> strptime(c,format="%y/%m/%d")$year
[1] 99
> strptime(c,format="%y/%m/%d")$mon
[1] 4
> strptime(c,format="%y/%m/%d")$mday
[1] 12

Beware the peculiarities of the entries defined by POSIX standard, see
?DateTimeClasses, and also:

 '%y' Year without century (00-99). If you use this on input, which
  century you get is system-specific.  So don't!  Often values
  up to 69 (or 68) are prefixed by 20 and 70(or 69) to 99 by
  19.

(I'm at a bit of a loss as to fixing up two digit years once the damage
has been done. Presumably, you can just diddle the year field, but I'm a
bit uneasy about the fact that  2000 was a leap year and 1900 was not.)
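
A minimal sketch of one way around the two-digit-year ambiguity: split the string, attach the century explicitly, and only then build a Date. The rule that every two-digit year belongs to the 1900s is an assumption made for illustration; the right rule depends on the data.

```r
x <- "99/05/12"

# Split into year, month and day without going through %y
parts <- as.integer(strsplit(x, "/", fixed = TRUE)[[1]])

year  <- 1900L + parts[1]   # assumption: all two-digit years are 19xx
month <- parts[2]
day   <- parts[3]

# Build a Date from the now unambiguous four-digit year
d <- as.Date(sprintf("%04d-%02d-%02d", year, month, day))

format(d, "%Y")   # four-digit year as a string
format(d, "%m")   # month number as a string
format(d, "%d")   # day of month as a string
```

Because the Date is built from a full four-digit year, leap-year handling for 1900 versus 2000 follows from the century that was chosen rather than from a system default.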



-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [R] indexing

2007-02-01 Thread Dimitris Rizopoulos

one way is the following:

assignation$value[match(x, assignation$class)]
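
A self-contained version of the same lookup, using the data from the question with the vector x cut off at the eleven values shown there. match() returns, for each element of x, the row position of the first matching class (or NA when the class is absent), and those positions then index the value column.

```r
assignation <- data.frame(value = c(6.5, 7.5, 8.5, 12.0),
                          class = c(1, 3, 5, 2))
x <- c(1, 1, 2, 7, 6, 5, 4, 3, 2, 2, 2)

# Row position in the table for each class in x; NA where no row matches
pos <- match(x, assignation$class)

# Look up the values; classes 7, 6 and 4 are not in the table and give NA
x.value <- assignation$value[pos]
x.value
```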


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm



Re: [R] indexing

2007-02-01 Thread jim holtman

> assignation$value[match(x,assignation$class)]
 [1]  6.5  6.5 12.0   NA   NA  8.5   NA  7.5 12.0 12.0 12.0






-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


[R] read.spss and encodings

2007-02-01 Thread Thomas Friedrichsmeier

Hi!

I'm having trouble with importing spss files containing non-ascii characters 
(R 2.4.1, debian linux, i386). To reproduce:

Download the following file: 
http://statmath.wu-wien.ac.at/data/spss/de/comphomeneu.sav

require(foreign)
Sys.setlocale(locale="C")
read.spss("comphomeneu.sav")$ARBEIT[1]
# prints:
# [1] im B\374ro
# Levels: im B\374ro zuhause

\374 of course is actually a u-umlaut. However, I guess in the C locale it's
not expected to print as such. But now try this (use any UTF-8 locale you may
have installed):

Sys.setlocale(locale="de_DE.UTF-8")
read.spss("comphomeneu.sav")$ARBEIT[1]
# prints:
# [1]Error in print.default(xx, quote = quote, ...) :
#    invalid multibyte string

To me it looks like read.spss() would probably need an encoding parameter
and/or some iconv() magic. Now, locale conversion always makes my head
spin, so I thought I'd better post here before calling this a bug in
R. Two questions:

1) Is there some way to work around this, i.e. make sure it is converted to 
proper UTF-8 while importing? Am I missing something obvious?
2) Should I submit this as a bug report?

Thanks!
Thomas Friedrichsmeier



[R] How can I calculate conditional mean in a large dataset including date data

2007-02-01 Thread Majid Iravani

Dear R users,

I have a data frame with two columns: the first column holds dates (stored
as character strings such as 1/1/2000; daily data from 1/1/1970 to
31/12/2003) and the second column holds temperature values. I would like to
calculate the mean for each month of each year (e.g. May 2001, June 1997)
and the mean for each calendar month across all years. As the number of days
differs between months, I could not write an appropriate command for this.
I would greatly appreciate it if somebody could help me with this.

Thank you
Majid

  Majid Iravani
  PhD Student
  Swiss Federal Research Institute WSL
  Research Group of Vegetation Ecology
  Zürcherstrasse 111  CH-8903 Birmensdorf  Switzerland
  Phone: +41-1-739-2693
  Fax: +41-1-739-2215
  Email: [EMAIL PROTECTED]
http://www.wsl.ch/staff/majid.iravani/
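
A minimal sketch of one possible approach to the question above, assuming the dates are day/month/year strings as described. The data frame temps and its column names are made up for illustration. Grouping on a formatted date means the differing month lengths need no special handling.

```r
# Hypothetical stand-in for the real data: character dates (d/m/Y) and values
temps <- data.frame(date = c("1/1/2000", "2/1/2000", "1/2/2000", "1/1/2001"),
                    temperature = c(2, 4, 6, 10),
                    stringsAsFactors = FALSE)

# Convert the character column to Date using the day/month/year layout
d <- as.Date(temps$date, format = "%d/%m/%Y")

# Mean for each month within each year (keys such as "2000-01")
by_month_year <- tapply(temps$temperature, format(d, "%Y-%m"), mean)

# Mean for each calendar month pooled over all years (keys such as "01")
by_month <- tapply(temps$temperature, format(d, "%m"), mean)

by_month_year
by_month
```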


Re: [R] read.spss and encodings

2007-02-01 Thread Peter Dalgaard

Thomas Friedrichsmeier wrote:
> Hi!
>
> I'm having trouble with importing spss files containing non-ascii characters
> (R 2.4.1, debian linux, i386). To reproduce:
>
> Download the following file:
> http://statmath.wu-wien.ac.at/data/spss/de/comphomeneu.sav
>
> require (foreign)
> Sys.setlocale (locale="C")
> read.spss("comphomeneu.sav")$ARBEIT[1]
> # prints:
> # [1] im B\374ro
> # Levels: im B\374ro zuhause
>
> \374 of course is actually a u-umlaut. However, I guess in the C locale it's
> not expected to print as such. But now try this (use any UTF-8 locale you may
> have installed):
>
> Sys.setlocale (locale="de_DE.UTF-8")
> read.spss("comphomeneu.sav")$ARBEIT[1]
> # prints:
> # [1]Error in print.default(xx, quote = quote, ...) :
> #    invalid multibyte string
>
> To me it looks, like read.spss () would probably need an encoding parameter,
> and / or some iconv () magic. Now, locale conversion always makes my head
> spin, so I thought I'd better post here, before calling this to be a bug in
> R. Two questions:
>
> 1) Is there some way to work around this, i.e. make sure it is converted to
> proper UTF-8 while importing? Am I missing something obvious
>
> 2) Should I submit this as a bug report?
   
1) Yes, 2) No

This is really not in read.spss, but in R itself. The short version is
that in released versions, we have

> "Im B\374ro"
[1]Error: invalid multibyte string

which is indeed a buglet, since it is not good if you cannot output what
you can input (notice that there is no problem until you try to print).
In r-devel, this has become

> "Im B\374ro"
[1] "Im B\xfcro"

so that invalid multibytes at least do not cause error. However, the
real issue is that the string  is in the wrong encoding for your locale,
so you should convert it:

> iconv("Im B\xfcro", from="latin1", to="UTF-8")
[1] "Im Büro"
> iconv("Im B\374ro", from="latin1", to="UTF-8")
[1] "Im Büro"
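
The same conversion can be applied to a whole imported data set. The following is only a sketch: the helper name to_utf8 is hypothetical, and it assumes the file's strings are Latin-1, which matches the \374 byte seen above but is not stated by the file itself.

```r
# Hypothetical helper: re-encode character data and factor levels to UTF-8.
to_utf8 <- function(x, from = "latin1") {
  if (is.factor(x)) {
    # Factors keep their text in the levels, so convert those
    levels(x) <- iconv(levels(x), from = from, to = "UTF-8")
    x
  } else if (is.character(x)) {
    iconv(x, from = from, to = "UTF-8")
  } else {
    x   # numeric and other columns are returned unchanged
  }
}

raw_label <- "im B\xfcro"     # Latin-1 byte for u-umlaut, as in the post
fixed_label <- to_utf8(raw_label)

# For a data frame `dat` returned by read.spss(..., to.data.frame = TRUE),
# every column could be converted with:
#   dat[] <- lapply(dat, to_utf8)
```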


-p

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907


Re: [R] memory-efficient column aggregation of a sparse matrix

2007-02-01 Thread Douglas Bates

On 1/31/07, Jon Stearley [EMAIL PROTECTED] wrote:
> I need to sum the columns of a sparse matrix according to a factor -
> ie given a sparse matrix X and a factor fac of length ncol(X), sum
> the elements by column factors and return the sparse matrix Y of size
> nrow(X) by nlevels(f).  The appended code does the job, but is
> unacceptably memory-bound because tapply() uses a non-sparse
> representation.  Can anyone suggest a more memory and cpu efficient
> approach?  Eg, a sparse matrix tapply method?  Thanks.

This is the sort of operation that is much more easily performed in
the triplet representation of a sparse matrix where each nonzero
element is represented by its row index, column index and value.
Using that representation you could map the column indices according
to the factor then convert back to one of the other representations.
The only question would be what to do about nonzeros in different
columns of the original matrix that get mapped to the same element in
the result.  It turns out that in the sparse matrix code used by the
Matrix package the triplet representation allows for duplicate index
positions with the convention that the resulting value at a position
is the sum of the values of any triplets with that index pair.

If you decide to use this approach please be aware that the indices
for the triplet representation in the Matrix package are 0-based (as
in C code) not 1-based (as in R code).  (I imagine that Martin is
thinking we really should change that as he reads this part.)
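
A minimal sketch of the triplet approach described above, using the Matrix package. The toy matrix X, the factor fac and the variable names are assumptions for illustration. The slots of the triplet form are 0-based, whereas sparseMatrix() takes 1-based indices by default and sums the values of repeated (i, j) pairs.

```r
library(Matrix)

# Toy 3 x 4 sparse matrix and a factor with one entry per column
X <- sparseMatrix(i = c(1, 2, 2, 3), j = c(1, 2, 3, 4),
                  x = c(1, 2, 3, 4), dims = c(3, 4))
fac <- factor(c("a", "b", "b", "a"))

# Triplet form: slots i and j hold 0-based row and column indices
trip <- as(X, "TsparseMatrix")

# Map each nonzero's column to its factor level (shift to 1-based first)
new_col <- as.integer(fac)[trip@j + 1L]

# Rebuild; triplets that land on the same (row, level) cell are summed
Y <- sparseMatrix(i = trip@i + 1L, j = new_col, x = trip@x,
                  dims = c(nrow(X), nlevels(fac)))
colnames(Y) <- levels(fac)
Y
```

No dense intermediate is created, so memory use stays proportional to the number of nonzeros.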


> --
> +--+
> | Jon Stearley  (505) 845-7571  (FAX 844-9297) |
> | Sandia National Laboratories  Scalable Systems Integration   |
> +--+
>
>
> # x and y are of SparseM class matrix.csr
> aggregate.csr <-
> function(x, fac) {
>   # make a vector indicating the row of each nonzero
>   rows <- integer(length=length([EMAIL PROTECTED]))
>   [EMAIL PROTECTED]:nrow(x)]] <- 1 # put a 1 at start of each row
>   rows <- as.integer(cumsum(rows)) # and finish with a cumsum
>
>   # make a vector indicating the column factor of each nonzero
>   f <- [EMAIL PROTECTED]
>
>   # aggregate by row,f
>   y <- tapply([EMAIL PROTECTED], list(rows,f), sum)
>
>   # sparsify it
>   y[is.na(y)] <- 0  # change tapply NAs to as.matrix.csr 0s
>   y <- as.matrix.csr(y)
>
>   y
> }



Re: [R] matrix of matrices

2007-02-01 Thread Federico Abascal

In case someone is interested, here is the solution
somebody suggested to me: use a list.

imp <- vector("list", 100)
imp[[1]] <- im[1:500,]
names(imp[[1]]) = the list of labels of im[1:500,]
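
A small self-contained illustration of why the list keeps the labels while the array did not. The named vector im_row is a made-up stand-in for one slice of im. Assigning into a row of a matrix copies only the values, whereas a list element stores the whole vector, names included.

```r
# Hypothetical stand-in for one named numeric slice of `im`
im_row <- c(geneA = 0.1, geneB = 0.7, geneC = 0.2)

# A list stores each vector as-is, so the names survive
imp <- vector("list", 100)
imp[[1]] <- im_row
names(imp[[1]])

# A matrix row only receives the values, so the names are dropped
m <- matrix(NA_real_, nrow = 100, ncol = length(im_row))
m[1, ] <- im_row
lost <- names(m[1, ])

# If every row shares the same labels, column names restore them
colnames(m) <- names(im_row)
kept <- names(m[1, ])
```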

Thanks!
Federico




Re: [R] features of save and save.image (unexpected file sizes)

2007-02-01 Thread Vaidotas Zemlys

Hi,


On 2/1/07, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 On Thu, 1 Feb 2007, Vaidotas Zemlys wrote:

  Hi,
 
  On 1/31/07, Professor Brian Ripley [EMAIL PROTECTED] wrote:
  Two comments:
 
  1) ls() does not list all the objects: it has all.names argument.
 
  Yes, I tried it with all.names, but the effect was the same, I forgot
  to mention it in a letter.
 
  2) save.image() does not just save the objects in the workspace, it also
  saves any environments they may have.  Having a function with a
  large environment is the usual cause of a large saved image.
 
  I have little experience dealing with enivronments, so is there a
  quick way to discard the environments of the functions? When saving
  the session I really do not need them.

 Change, not discard.  E.g. environment(f) <- .GlobalEnv.  If environments
 are not mentioned by anything saved, they will not be saved.


I found the culprit. I was parsing formulas in my code, and I saved
them in that large object. So the environment came with the saved
formulas. Is there a nice way to tell R: please do not save the
environments with the formulas, I do not need them?

This is what I was doing (I am discarding irrelevant code):

testf <- function(formula) {
    mainform <- formula
    if (deparse(mainform[[3]][[1]]) != "|")
        pandterm("invalid conditioning for main regression")
    mmodel <- substitute(y~x, list(y=mainform[[2]], x=mainform[[3]][[2]]))
    mmodel <- as.formula(mmodel)
    list(formula=list(main=mmodel))
}

when called
bu <- testf(lnp~I(CE/12000)+hhs|Country)

I get

> ls(env=environment(bu$formula$main))
[1] "formula"  "mainform" "mmodel"

or, in the actual case, a lot more objects, which I do not need but
which take up a lot of space. For the moment I solved the problem with

environment(mmodel) <- NULL

but is this the correct R way?
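
A minimal sketch of the "change, not discard" advice quoted above: point the formula at the global environment before returning it, so the function's frame (and any large temporaries in it) is no longer referenced. The function name make_formula and the object big are hypothetical.

```r
make_formula <- function(formula) {
  big <- numeric(1e6)   # large temporary that lives in this function's frame

  # Build y ~ x from the left-hand side and the part before the "|"
  mmodel <- as.formula(substitute(y ~ x,
                                  list(y = formula[[2]],
                                       x = formula[[3]][[2]])))

  # Without this line, mmodel would carry this frame, including `big`
  environment(mmodel) <- globalenv()
  mmodel
}

f <- make_formula(lnp ~ I(CE/12000) + hhs | Country)
f
environment(f)
```

Keeping a real environment, rather than NULL, means functions that later evaluate the formula still have somewhere to look up variables.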

Vaidotas Zemlys
--
Doctorate student, http://www.mif.vu.lt/katedros/eka/katedra/zemlys.php
Vilnius University


Re: [R] memory-efficient column aggregation of a sparse matrix

2007-02-01 Thread roger koenker

Doug is right, I think, that this would be easier with full indexing
using the matrix.coo class, if you want to use SparseM.  But
then the tapply seems to be the way to go.

url:    www.econ.uiuc.edu/~roger        Roger Koenker
email   [EMAIL PROTECTED]               Department of Economics
vox:    217-333-4558                    University of Illinois
fax:    217-244-6678                    Champaign, IL 61820



[R] Index mapping on arrays

2007-02-01 Thread Demi Anderson

Dear R-community,

I have some trouble with index mappings for arrays. If, for example,
I have the array

R> A <- array(1:9, c(3,3,2))

and two index mappings, both of the same size

R> x <- c(2, 3)
R> y <- c(1, 2)

Now I want to access the elements (A[1, x[i], y[i]])_i of A, i.e.
A[1, x[1], y[1]] = A[1, 2, 1] and A[1, x[2], y[2]] = A[1, 3, 2].

If I use

R> A[1, x, y]

I get every combination of the elements of x and y,
i.e. A[1, x[1], y[1]], A[1, x[1], y[2]], A[1, x[2], y[1]] and A[1,
x[2], y[2]]. But how can I access just the elements (A[1, x[i], y[i]])_i?
My array's dimensions are actually large in the second
component (for example the dimension might be 10*1*10) so I'm
looking for a method avoiding loops.

The question is probably trivial for you, but I just could not
figure it out. So sorry for bugging you and many thanks in advance
for any help.

Best wishes, Demi Anderson.

Re: [R] Wiki for Graphics tips for MacOS X

2007-02-01 Thread Ben Bolker

Gabor Grothendieck ggrothendieck at gmail.com writes:

 
 To get the best results you need to transfer it using vector
 graphics rather than bitmapped graphics:
 
 http://www.stc-saz.org/resources/0203_graphics.pdf
 
 There are a number of variations described here (see
 entire thread).  It's for UNIX and Windows but I think
 it would likely work similarly on Mac and Windows:
 
 http://finzi.psych.upenn.edu/R/Rhelp02a/archive/32297.html
 
   
  yes, but:

  the whole point of this discussion was that there _is_ no
vector format that anyone knows of that (1) can be reliably
created on MacOS using R/open source tools and (2) can be
reliably imported into MS Word (with working preview etc.).
The thread you reference assumes that one has a Windows machine
handy (with or without R installed) for creating WMF
graphics.
  Hence the advice to create a high-resolution PNG, which
seems to work well enough even if it is not optimal.

  cheers
Ben Bolker


Re: [R] any implementations for adaptive modeling of time series?

2007-02-01 Thread AA

Hi Peter,

Generally speaking, wavelets are known to be good at extracting signal from
noisy data and are adaptive, but I am not familiar with any R implementation
of wavelets. A simple way of looking at changes would be to use CUSUM
(strucchange package).
I hope this helps.
Ansel.

On 1/30/07, Peter Nimda [EMAIL PROTECTED] wrote:

 Hallo,

 my noisy time series represent a fading signal made up of fairly long
 segments, each with a simple trend inside it.
 The transition from one segment to the next is always non-smooth
 and very sharp. In other words, I have a piecewise
 polynomial noisy curve converging asymptotically to a
 biased constant, and it is non-differentiable at the points between pieces.

 I am looking for implementations of models adequate for such
 data. Are there any possibilities to adapt ARIMA or
 MCMC?

 Many thanks in advance for any help/URLs


Re: [R] Estimation of discrete unimodal density

2007-02-01 Thread rolf

Wessel van Wieringen wrote:

 A method for the estimation of univariate unimodal densities (with
 unknown mode) is described in "Statistical Inference under Order
 Restrictions" by Barlow et al. Would anyone know whether there is an
 R implementation (preferably with reference) for the estimation of
 univariate discrete unimodal densities (with unknown mode)? Thanks in
 advance for your help.

You could have a look at my ``isotonic'' package.  Go to:

http://www.math.unb.ca/~rolf/Research/Packages/

Click on ``gzipped tar file for R'' under ``isotonic''.

cheers,

Rolf Turner
[EMAIL PROTECTED]


Re: [R] Index mapping on arrays

2007-02-01 Thread Dimitris Rizopoulos

probably you want something like the following:

A[cbind(rep(1, length(x)), x, y)]
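
To make the idea concrete, here is a small self-contained sketch of matrix
indexing. It uses an array of 18 distinct values (an assumption made only so
the result is easy to check; the original post recycles 1:9), and the names
idx, picked and by_hand are illustrative.

```r
A <- array(1:18, c(3, 3, 2))   # 18 distinct values, so results are easy to check
x <- c(2, 3)
y <- c(1, 2)

# Each row of the index matrix is one (i, j, k) triple: (1, 2, 1) and (1, 3, 2).
# cbind() recycles the 1, so this is equivalent to using rep(1, length(x)).
idx <- cbind(1, x, y)
picked <- A[idx]

# The same two elements, written out one by one.
by_hand <- c(A[1, 2, 1], A[1, 3, 2])
```

Because the index matrix has one column per array dimension, R picks one
element per row instead of forming all combinations, and no loop is needed.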


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm



Re: [R] prop.test() references

2007-02-01 Thread Marc Schwartz

On Thu, 2007-02-01 at 07:22 +0100, Jean lobry wrote:
 Dear R-help,
 
 I'm using prop.test() to compute a confidence interval for a proportion
 under R version 2.4.1, as in:
 
 prop.test(x = 340, n = 400)$conf
 [1] 0.8103309 0.8827749
 
 I have two questions:
 
 1) from the source code my understanding is that the confidence
 interval is computed according to Wilson, E.B. (1927) Probable
 inference, the law of succession, and statistical inference.
 J. Am. Stat. Assoc., 22:209-212.
 Is it correct?

Yes.

 2) The doc says "Continuity correction is used only if it does not exceed
 the difference between sample and null proportions in absolute value."
 Does someone have a reference in which this point is discussed?

I believe that this is a modification by Newcombe. See:

Newcombe RG: Two-Sided Confidence Intervals for the Single Proportion:
Comparison of Seven Methods. Statistics in Medicine 1998;17:857-872.

Newcombe RG: Interval Estimation for the Difference Between Independent
Proportions: Comparison of Eleven Methods. Statistics in Medicine
1998;17:873-890.
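
For readers who want to see the arithmetic, the following is a sketch of the
Wilson score interval with continuity correction for the example above. It is
written from my reading of the prop.test() source rather than copied from it,
so treat the variable names (yates, z22n, pc) and the exact form as
assumptions; the last lines compare the result with what prop.test() returns.

```r
x <- 340; n <- 400
conf.level <- 0.95
p0 <- 0.5                        # null proportion prop.test() assumes by default

phat  <- x / n
z     <- qnorm((1 + conf.level) / 2)
# The correction is 0.5 unless that would exceed |x - n * p0|.
yates <- min(0.5, abs(x - n * p0))
z22n  <- z^2 / (2 * n)

# Lower limit: shift the estimate down by the correction, then apply Wilson.
pc    <- phat - yates / n
lower <- (pc + z22n - z * sqrt(pc * (1 - pc) / n + z22n / (2 * n))) / (1 + 2 * z22n)

# Upper limit: shift the estimate up by the correction.
pc    <- phat + yates / n
upper <- (pc + z22n + z * sqrt(pc * (1 - pc) / n + z22n / (2 * n))) / (1 + 2 * z22n)

manual  <- c(lower, upper)
builtin <- as.numeric(prop.test(x, n)$conf.int)
```

With x = 340 and n = 400 the cap does not bite (0.5 is far smaller than
|340 - 200|), so the full correction of 0.5 is applied.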


HTH,

Marc Schwartz


Re: [R] Wiki for Graphics tips for MacOS X

2007-02-01 Thread Gabor Grothendieck

On 2/1/07, Ben Bolker [EMAIL PROTECTED] wrote:
 Gabor Grothendieck ggrothendieck at gmail.com writes:

 
  To get the best results you need to transfer it using vector
  graphics rather than bitmapped graphics:
 
  http://www.stc-saz.org/resources/0203_graphics.pdf
 
  There are a number of variations described here (see
  entire thread).  It's for UNIX and Windows but I think
  it would likely work similarly on Mac and Windows:
 
  http://finzi.psych.upenn.edu/R/Rhelp02a/archive/32297.html
 

  yes, but:

  the whole point of this discussion was that there _is_ no
 vector format that anyone knows of that (1) can be reliably
 created on MacOS using R/open source tools and (2) can be
 reliably imported into MS Word (with working preview etc.).
 The thread you reference assumes that one has a Windows machine
 handy (with or without R installed) for creating WMF
 graphics.
  Hence the advice to create a high-resolution PNG, which
 seems to work well enough even if it is not optimal.

AFAIK there do exist tools for the Mac for fig graphics and that was one of the
several solutions proposed there.


Re: [R] what is the purpose of an error message in uniroot?

2007-02-01 Thread Chris Andrews


Matt,

Some time back I didn't like the uniroot restriction either so I wrote a short 
function manyroots that breaks an interval into many shorter intervals and 
looks for a single root in each of them.  This function is NOT guaranteed to 
find all roots in an interval even if you specify many subintervals.  Graphing 
is always a good idea.  There is always a chance the function dips to or below 
the axis and back up in an arbitrarily short interval.  For example, f(x) = 
x^2.  If one of your subintervals doesn't happen to end at 0, you're out of 
luck.  (This is in contrast to uniroot working on a continuous function that is 
positive at one end of an interval and negative at the other end:  at least one 
root is guaranteed and uniroot will find it given enough iterations.)  
Furthermore, if you are working with a polynomial, just use polyroot.
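
The subinterval idea described above can also be sketched with the public
uniroot() interface only. This is an illustration under stated assumptions,
not the manyroots function itself: the name find_roots and its arguments are
made up here, and like manyroots it can miss roots where the function only
touches the axis.

```r
find_roots <- function(f, lower, upper, n = 100, ...) {
  # Split [lower, upper] into n equal subintervals and evaluate f at the ends.
  ends  <- seq(lower, upper, length.out = n + 1)
  fends <- vapply(ends, function(z) f(z, ...), numeric(1))

  # Grid points that happen to be exact roots.
  roots <- ends[fends == 0]

  for (i in seq_len(n)) {
    # Only a strict sign change guarantees a root for a continuous f.
    if (fends[i] * fends[i + 1] < 0) {
      r <- uniroot(f, lower = ends[i], upper = ends[i + 1], ...,
                   tol = 1e-10)$root
      roots <- c(roots, r)
    }
  }
  sort(roots)
}

# Three simple roots at -1, 0 and 1.
r3 <- find_roots(function(x) x * (x - 1) * (x + 1), -4, 4, n = 13)

# x^2 touches the axis at 0 without changing sign, so nothing is found
# unless 0 happens to be a grid point.
r0 <- find_roots(function(x) x^2, -1, 2, n = 4)
```

The second call shows the limitation mentioned above: the root of x^2 at 0
is missed because none of the subintervals ends there.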

Chris

(Sorry for the dearth of code comments in the following)

manyroots <- function(f, interval, ints=1, maxlen=NULL,
  lower = min(interval), upper = max(interval),
  tol = .Machine$double.eps^0.25, maxiter = 1000, ...)
{
    if (!is.numeric(lower) || !is.numeric(upper) || lower >= upper)
        stop("lower < upper  is not fulfilled")
    if (is.infinite(lower) || is.infinite(upper))
        stop("Interval must have finite length")
    # maxlen, if given, overrides ints: use subintervals no longer than maxlen
    if (!is.null(maxlen))
        ints <- ceiling((upper-lower)/maxlen)
    if (!is.numeric(ints) || length(ints)>1 || floor(ints)!=ints || ints<1)
        stop("ints must be positive integer")

    # end points of the subintervals and the function values there
    ends <- seq(lower, upper, length=ints+1)
    fends <- numeric(length(ends))
    for (i in seq(along=ends)) fends[i] <- f(ends[i], ...)

    zeros <- iters <- prec <- rep(NA, ints)

    for (i in seq(ints)) {
        cat(i, ends[i], ends[i+1], fends[i], fends[i+1], "\n")
        if (fends[i] * fends[i+1] > 0) {
            #cat("f() values at end points not of opposite sign\n")
            next;
        }
        if (fends[i] == 0 & i>1) {
            #cat("this was found in previous iteration\n")
            next;
        }

        # NOTE: .Internal(zeroin()) is what uniroot() called in R 2.x;
        # as far as I know it is not available in current versions of R.
        val <- .Internal(zeroin(function(arg) f(arg, ...), ends[i],
            ends[i+1], tol, as.integer(maxiter)))
        if (as.integer(val[2]) == maxiter) {
            warning("Iteration limit (", maxiter, ") reached in interval (",
              ends[i], ",", ends[i+1], ").")
        }
        zeros[i] <- val[1]
        iters[i] <- val[2]
        prec[i] <- val[3]
    }
    # drop the subintervals in which no root was looked for
    zeros <- as.vector(na.omit(zeros))
    fzeros <- numeric(length(zeros))
    for (i in seq(along=zeros)) fzeros[i] <- f(zeros[i], ...)

    list(root = zeros, f.root = fzeros,
        iter = as.vector(na.omit(iters)),
        estim.prec = as.vector(na.omit(prec)))
}

gg <- function(x) x*(x-1)*(x+1)
manyroots(gg, c(-4,4), 13, maxiter=200, tol=10^-10)

hh <- function(x,x2) x^2-x2
manyroots(hh, c(-10, 10), maxlen=.178, x2=9)

manyroots(sin, c(-4,20), maxlen=.01)
#but
ss <- function(x) sin(x)^2
manyroots(ss, c(-4,20), maxlen=.01)
plot(ss, -4,20)
abline(h=0)



From: [EMAIL PROTECTED]

[EMAIL PROTECTED] wrote:

  This is probably a blindingly obvious question:

Yes, it is.

  Why does it matter in the uniroot function whether the f() values at
  the end points that you supply are of the same sign?

Plot some graphs.

Think about the *name* of the function --- *uni*root.

Does that ring any bells?

And how do you know there *is* a root in the interval
in question?  Try your ``uniroot2'' on f(x) = 1+x^2
and the interval [-5,5].

To belabour the point --- if the f() values are of the
same sign, then there are 0, or 2, or 4, or 
roots in the interval in question.

Rolf,
Only if f is continuous (of course finding roots of discontinuous functions 
is a greater challenge)

The ***only chance*** you have of there being a unique
root is if the f() values are of opposite sign.

The algorithm used and the precision estimates returned
presumably depend on the change of sign.  You can get
answers --- sometimes --- if the change of sign is not
present, but the results could be seriously misleading.

Without the opposite sign requirement the user will often
wind up trying to do something impossible or getting
results about which he/she is deluded.

cheers,

Rolf Turner
[EMAIL PROTECTED]

P. S.  If the f() values are of the same sign, uniroot() DOES
NOT give a warning!  It gives an error.

R. T.




-- 
Christopher Andrews, PhD
SUNY Buffalo, Department of Biostatistics
242 Farber Hall, [EMAIL PROTECTED], 716 829 2756


Re: [R] How can I calculate conditional mean in a large dataset including date data

2007-02-01 Thread Vladimir Eremeev



> dfr<-data.frame(day=c("1/1/1970","5/1/1970","5/12/2003","31/12/2003"),temperature=c(1,-1,2,0.5))
> dfr
         day temperature
1   1/1/1970         1.0
2   5/1/1970        -1.0
3  5/12/2003         2.0
4 31/12/2003         0.5

> aggregate(dfr["temperature"],by=list(format(as.Date(dfr$day,format="%d/%m/%Y"),"%m-%Y")),mean,na.rm=TRUE)
  Group.1 temperature
1 01-1970        0.00
2 12-2003        1.25

> aggregate(dfr["temperature"],by=list(format(as.Date(dfr$day,format="%d/%m/%Y"),"%m")),mean,na.rm=TRUE)
  Group.1 temperature
1      01        0.00
2      12        1.25


Majid Iravani wrote:
 
 Dear R users,
 
 I have a dataframe with two columns: the first column is date data (e.g.
 1/1/2000 in character format: daily data from 1/1/1970 till 31/12/2003)
 and the second column is the temperature value. Now I'd like to calculate
 the mean for each month in a year (e.g. May 2001, June 1997) and the mean
 for each month across all years. As the number of days differs between
 months, I could not write an appropriate command for this. Therefore I
 would greatly appreciate it if somebody could help me with this.
 
 Thank you
 Majid
 
 


Re: [R] read.spss and encodings

2007-02-01 Thread Thomas Friedrichsmeier

On Thursday 01 February 2007 14:18, Peter Dalgaard wrote:
 so you should convert it:
  > iconv("Im B\xfcro", from="latin1", to="UTF-8")

 [1] "Im Büro"

  > iconv("Im B\374ro",from="latin1", to="UTF-8")

 [1] "Im Büro"

I see. Thanks!

Any chance of adding something like this to read.spss()?
read.spss <- function([...], encoding=NULL) {
    [...]
    if (!is.null(encoding)) {
        iconv.recursive <- function(x, from) {
            attribs <- attributes(x);
            if (is.character(x)) {
                x <- iconv(x, from=from, to="", sub="")
            } else if (is.list(x)) {
                x <- lapply(x, function(sub)
                    iconv.recursive(sub, from))
            }
            # convert factor levels and all other attributes
            attributes(x) <- lapply(attribs, function(sub)
                iconv.recursive(sub, from))
            x
        }

        iconv.recursive(rval, from=encoding)
    } else {
        rval
    }
}

Now that I've written this iconv.recursive() function once, I'm fine. But I 
guess something like this might be useful to others as well.
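
For anyone who wants to try the idea outside read.spss(), here is a
standalone sketch along the same lines. The function name to_utf8, the
fixed UTF-8 target and the small test object are assumptions made for
illustration; the real SPSS import is not involved.

```r
# Recursively convert character data, including factor levels and names,
# from a given encoding to UTF-8.
to_utf8 <- function(x, from = "latin1") {
  attribs <- attributes(x)
  if (is.character(x)) {
    x <- iconv(x, from = from, to = "UTF-8")
  } else if (is.list(x)) {
    x <- lapply(x, to_utf8, from = from)
  }
  # Factor levels, names and any other character attributes are converted too.
  if (!is.null(attribs)) {
    attributes(x) <- lapply(attribs, to_utf8, from = from)
  }
  x
}

# Hypothetical data resembling the ARBEIT column: a factor whose first level
# contains the latin1 byte 0xfc (u-umlaut). It is built with structure() so
# that no sorting of the raw byte strings is needed.
f   <- structure(1:2, levels = c("im B\xfcro", "zuhause"), class = "factor")
out <- to_utf8(list(ARBEIT = f, note = "B\xfcro"))

# The converted level now holds the two-byte UTF-8 sequence c3 bc.
charToRaw(levels(out$ARBEIT)[1])
```

Converting to a fixed target such as UTF-8 rather than to the current
locale is a design choice of this sketch; the proposal above uses to=""
instead, which targets whatever locale the session runs in.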

Regards
Thomas



[R] Fwd: Re: read.spss and encodings

2007-02-01 Thread John Kane

--- John Kane [EMAIL PROTECTED] wrote:

 Date: Thu, 1 Feb 2007 09:07:11 -0500 (EST)
 From: John Kane [EMAIL PROTECTED]
 Subject: Re: [R] read.spss and encodings
 To: Thomas Friedrichsmeier
 [EMAIL PROTECTED]

 Hi Thomas,

 I am using R 2.4.1 on WindowsXP and I don't seem to be having any
 problem; "im Büro" and "zuhause" are coming in just fine in a
 200 line dataset.

 I have imported it with both read.spss and spss.get
 (package Hmisc) with no problems.

 I am afraid I have no idea what the problem is but it
 does not seem to be specifically an R problem.

 --- Thomas Friedrichsmeier
 [EMAIL PROTECTED] wrote:

  Hi!

  I'm having trouble with importing spss files containing non-ascii
  characters (R 2.4.1, debian linux, i386). To reproduce:

  Download the following file:

  http://statmath.wu-wien.ac.at/data/spss/de/comphomeneu.sav

  require (foreign)
  Sys.setlocale (locale="C")
  read.spss("comphomeneu.sav")$ARBEIT[1]
  # prints:
  # [1] im B\374ro
  # Levels: im B\374ro zuhause

  \374 of course is actually a u-umlaut. However, I guess in the C locale
  it's not expected to print as such. But now try this (use any UTF-8
  locale you may have installed):

  Sys.setlocale (locale="de_DE.UTF-8")
  read.spss("comphomeneu.sav")$ARBEIT[1]
  # prints:
  # [1]Error in print.default(xx, quote = quote, ...) :
  #    invalid multibyte string

  To me it looks like read.spss () would probably need an encoding
  parameter, and / or some iconv () magic. Now, locale conversion always
  makes my head spin, so I thought I'd better post here, before calling
  this a bug in R. Two questions:

  1) Is there some way to work around this, i.e. make sure it is converted
  to proper UTF-8 while importing? Am I missing something obvious?
  2) Should I submit this as a bug report?

  Thanks!
  Thomas Friedrichsmeier

Re: [R] How can I calculate conditional mean in a large dataset including date data

2007-02-01 Thread bogdan romocea

days <- seq(as.Date("1970/1/1"), as.Date("2003/12/31"), "days")
temp <- rnorm(length(days), mean=10, sd=8)
tapply(temp, format(days,"%Y-%m"), mean)
tapply(temp, format(days,"%b"), mean)



Re: [R] Extracting part of date variable

2007-02-01 Thread John Kane

Thank you Peter.

It was not my question but I was just about to start
the morning's work by searching Help and RSiteSearch()
for this exact question. 

--- Peter Dalgaard [EMAIL PROTECTED] wrote:

 stat stat wrote:
  Dear all,

  Suppose I have a date variable:

  c = "99/05/12"

  I want to extract the parts of this date like month number, year and
  day. I can do it in SPSS. Is it possible to do this in R as well?

  Rgd,

 Yes. One way is to use substr(), e.g.:

  > substr(c,1,2)
 [1] "99"
  > as.numeric(substr(c,1,2))
 [1] 99

 This also nicely sidesteps the ambiguity issue: 1999 or 1899? May or
 December? On the other hand, you'll get in trouble if leading zeros are
 sometimes absent (strsplit() or gsub() if you want to pursue that route
 further).

 For a more principled approach, use the time and date handling tools.
 Assuming that you can live with the system defaults for 2-digit years,

  > strptime(c,format="%y/%m/%d")
 [1] "1999-05-12"
  > strptime(c,format="%y/%m/%d")$year
 [1] 99
  > strptime(c,format="%y/%m/%d")$mon
 [1] 4
  > strptime(c,format="%y/%m/%d")$mday
 [1] 12

 Beware the peculiarities of the entries defined by POSIX standard, see
 ?DateTimeClasses, and also:

  '%y' Year without century (00-99). If you use this on input, which
       century you get is system-specific.  So don't!  Often values
       up to 69 (or 68) are prefixed by 20 and 70 (or 69) to 99 by
       19.

 (I'm at a bit of a loss as to fixing up two digit years once the damage
 has been done. Presumably, you can just diddle the year field, but I'm a
 bit uneasy about the fact that 2000 was a leap year and 1900 was not.)
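
 One way to avoid the two-digit-year ambiguity altogether is to split the
 string and choose the century explicitly before building a Date. The
 sketch below assumes the string means year/month/day and that all years
 are 19xx; both are assumptions, and the variable names are illustrative.

```r
s <- "99/05/12"   # assumed to mean year/month/day

# Split the string and decide the century explicitly.
parts <- as.integer(strsplit(s, "/", fixed = TRUE)[[1]])
year  <- 1900 + parts[1]        # assumption: every year is 19xx
month <- parts[2]
day   <- parts[3]

# Build an unambiguous Date from the pieces; format() can then extract
# whatever part is needed as a string.
d <- as.Date(sprintf("%04d-%02d-%02d", year, month, day))
format(d, "%Y")   # four-digit year
format(d, "%m")   # month number
format(d, "%d")   # day of the month
```

 strsplit() also copes with missing leading zeros (e.g. "99/5/12"), which
 is the case where substr() runs into trouble.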
 
 
 
 -- 
    O__   Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
 ~~ - ([EMAIL PROTECTED])                  FAX: (+45) 35327907
 

Re: [R] Index mapping on arrays

2007-02-01 Thread Robin Hankin

Hello Demi


The trick for array indexing on an array A, where length(dim(A))==n,
is to use an n-column matrix M to extract the elements by rows of M.

If I understand correctly, the following should help:


 > A <- array(1:18,c(3,3,2))
 > x <- 2:3
 > y <- 1:2
 > A[cbind(x,y,1)]
[1] 2 6


You may find the following useful too:

 > A[as.matrix(cbind(1,expand.grid(x,y)))]
[1]  4  7 13 16
 


best

rksh




--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743



[R] xtable and column headings

2007-02-01 Thread Ian Kennedy


When I generate a LaTeX table using xtable I have been setting column names
to strings with LaTeX code in order to get features like subscripts in the
column headings. I recently had to reinstall xtable and discovered that all
my LaTeX column headings were printing out in LaTeX code rather than with
LaTeX formatting. For example, with the older xtable I could give my column
a name something like "$A_b$" to get a printed column heading of A with the
subscript b. Now my printed column heading is "$A_b$" and the LaTeX code
in the .tex file generated by Sweave is "\$A\_b\$". It seems that the
newest version of print.xtable takes all my LaTeX special characters and
inserts backslashes, making LaTeX print the special characters rather than
interpreting them.

Is there a way to keep xtable from fixing my column names like this? Is
there another (maybe better) way to get nicely LaTeX formatted column
headings from xtable?

Thanks,
Ian


Re: [R] How can I calculate conditional mean in a large dataset including date data

2007-02-01 Thread Gabor Grothendieck

You could also use aggregate with the zoo package.  Using the same
input data that Vladimir used, create a zoo variable and aggregate it:

> library(zoo)
> z <- zoo(dfr[,2], as.Date(dfr[,1], "%d/%m/%Y"))
> aggregate(z, as.yearmon, mean)
Jan 1970 Dec 2003
    0.00     1.25

zoo is described in the vignette:

library(zoo)
vignette("zoo")



Re: [R] line plot

2007-02-01 Thread Rense Nieuwenhuis

Hi,

or otherwise you may try:

  plot(c(1,5), c(10,10), type="l")

kindest regard, Rense


On Feb 1, 2007, at 8:14 , Petr Pikal wrote:

 Hi

 see ?segments
 segments(1,10,5,10)

 HTH
 Petr


 On 1 Feb 2007 at 14:21, XinMeng wrote:

 From: XinMeng [EMAIL PROTECTED]
 To:   r-help@stat.math.ethz.ch
 Date sent:Thu, 01 Feb 2007 14:21:34 +0800
 Subject:  [R] line plot
 Send reply to:XinMeng [EMAIL PROTECTED]
   mailto:[EMAIL PROTECTED]
   mailto:[EMAIL PROTECTED]

 Hello sir:
 I wanna get such kind of plot: a line whose start point is(1,10),end
 point is(5,10)

 In other words:

 How can I draw a line if I only know the coordinate of the start  
 point
 and end point? Thanks! My best


 Petr Pikal
 [EMAIL PROTECTED]


[R] Need help writing a faster code

2007-02-01 Thread Ravi Varadhan

Hi,

I apologize for this repeat posting, which I first posted yesterday. I would
appreciate any hints on solving this problem:

I have two matrices A (m x 2) and B (n x 2), where m and n are large
integers (on the order of 10^4).  I am looking for an efficient way to
create another matrix, W (m x n), which can be defined as follows:

for (i in 1:m){
for (j in 1:n) {
W[i,j] <- g(A[i,], B[j,])
} }

where g(x,y) is a function that takes two vectors and returns a scalar.

The following works okay, but is not fast enough for my purpose.  I am sure
that I can do better:

for (i in 1:m) {
W[i,] <- apply(B, 1, y=A[i,], function(x,y) g(y,x))
}

How can I do this in a faster manner? I attempted outer, kronecker,
expand.grid, etc, but with no success.

Here is an example:

g <- function(x,y){
theta <- atan((y[2]-x[2]) / (y[1] - x[1]))
theta + 2*pi*(theta < 0)
}

m <- 2000
n <- 5000
A <- matrix(rnorm(2*m),ncol=2)
B <- matrix(rnorm(2*n),ncol=2)
W <- matrix(NA, m, n)

for (i in 1:m) {
W[i,] <- apply(B, 1, y=A[i,], function(x,y) g(y,x))
}

Thanks for any suggestions.

Best,
Ravi.

 

 

 


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 




 



Re: [R] mca-graphics: all elements overlapping in the help-example for multiple correspondence analysis

2007-02-01 Thread Michael Reinecke

Thank you very much, that works fine!

I now realize that I should have looked up not only the help pages for plot and
mca but also the one for plot.mca, which I did not think existed (unfortunately
I use R too sporadically to know where to find the appropriate information).

Best regards,

Michael
 

 -Original Message-
 From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
 Sent: Wednesday, January 31, 2007 10:19
 To: Michael Reinecke
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] mca-graphics: all elements overlapping in 
 the help-example for multiple correspondence analysis
 
 On Wed, 31 Jan 2007, Michael Reinecke wrote:
 
  Dear all,
 
  I tried out the example in the help document for mca (the
 multiple correspondence analysis of the MASS package):
 
  farms.mca <- mca(farms, abbrev=TRUE)
  farms.mca
  plot(farms.mca)
 
  But the graphic that I get seems unusable to me: I cannot recognize
  the numbers (printed in black) because they are all overlapping and
  concealing each other. I don't dare use my own data, which consist
  of several hundred cases - I guess I won't see anything.
 
  How can I solve this? Thank you for any idea!
 
 Some levels do overplot, as they are identical (this is an 
 unusual example).  But as you see in the book, not many, and 
 you can adjust pointsize of your device or 'cex' to mitigate 
 the problem.
 
 Plotting the rows is optional: see the help page.  I would 
 not recommend plotting rows for several hundred cases.
 
 -- 
 Brian D. Ripley,  [EMAIL PROTECTED]
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Loading functions in R

2007-02-01 Thread Matthew Keller

Hi Jeff,

The way I do this is to place all the options that I want, along with
functions I've written that I always want available, into the
Rprofile.site file. R always loads this file upon startup. That file
is located in the etc/ folder. E.g., on my computer, it is at:
C:\Program Files\R\R-2.2.1\etc\Rprofile.site . This is explained in
section 10.8 of the R-intro.pdf manual that comes with R.

If you only want the functions available sometimes, then use Christos'
suggestions.
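
A minimal sketch of the "only sometimes" route, assuming a hypothetical file of helper functions (the file name, the function cv and the name "myFunctions" are made up for illustration). It sources the file into its own environment and attaches that environment, so the functions are on the search path without cluttering the workspace:

```r
# Hypothetical helper file; in practice this would be your own .R file.
fun_file <- file.path(tempdir(), "myFunctions.R")
writeLines("cv <- function(x) sd(x) / mean(x)", fun_file)

# Source the file into a separate environment rather than the workspace.
my_env <- new.env()
sys.source(fun_file, envir = my_env)

# Attach the environment so its functions are found on the search path.
attach(my_env, name = "myFunctions")

cv_result <- cv(c(2, 4, 6))   # cv() is found via the search path

# Remove it from the search path again when no longer needed.
detach("myFunctions")
```

To have the functions available in every session instead, a source() call pointing at the same file could be placed in Rprofile.site, as described above.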

-- Matt


On 1/31/07, Christos Hatzis [EMAIL PROTECTED] wrote:
 The recommended approach is to make a package for your functions that will
 include documentation, error checks etc.
 Another way to accomplish what you want is to start a new R session and
 'source' your .R files and then to save the workspace in a .RData file, e.g.
 myFunctions.RData.

 Finally

 attach("myFunctions.RData")

 should do the trick without cluttering your workspace.

 -Christos

 Christos Hatzis, Ph.D.
 Nuvera Biosciences, Inc.
 400 West Cummings Park
 Suite 5350
 Woburn, MA 01801
 Tel: 781-938-3830
 www.nuverabio.com



 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Forest Floor
 Sent: Wednesday, January 31, 2007 10:41 PM
 To: r-help@stat.math.ethz.ch
 Subject: [R] Loading functions in R

 Hi all,

 This information must be out there, but I can't seem to find it.  What I
 want to do is to store functions I've created (as .R files or in whatever
 form) and then load them when I need them (or on startup) so that I can
 access without cluttering my program with the function code.
 This seems like it should be easy, but

 Thanks!

 Jeff


Re: [R] Need help writing a faster code

2007-02-01 Thread Dimitris Rizopoulos

the following seems to be a first improvement:

m <- 2000
n <- 5000
A <- matrix(rnorm(2*m), ncol=2)
B <- matrix(rnorm(2*n), ncol=2)
W1 <- W2 <- matrix(0, m, n)

## original approach: apply() over the rows of B for each row of A

g1 <- function(x, y){
    theta <- atan((y[2] - x[2]) / (y[1] - x[1]))
    theta + 2*pi*(theta < 0)
}

invisible({gc(); gc()})
system.time(for (i in 1:m) {
    W1[i, ] <- apply(B, 1, y = A[i,], function(x, y) g1(y, x))
})

## handle all rows of B at once, using the transposed matrix tB

g2 <- function(x){
    out <- tB - x
    theta <- atan(out[2, ] / out[1, ])
    theta + 2*pi*(theta < 0)
}

tB <- t(B)
invisible({gc(); gc()})
system.time(for (i in 1:m) {
    W2[i, ] <- g2(A[i, ])
})

## or

invisible({gc(); gc()})
system.time(W3 <- t(apply(A, 1, g2)))

all.equal(W1, W2)
all.equal(W1, W3)
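
For this particular g, the loop over the rows of A can probably be removed as well. The following is only a sketch, assuming that g depends on the coordinate differences alone, as in the example; it is not part of the timings above, and the small sizes are chosen just to keep the check fast. It builds the m x n matrices of differences with outer() and calls atan() once:

```r
set.seed(1)
m <- 20; n <- 30                      # small sizes, for illustration only
A <- matrix(rnorm(2*m), ncol = 2)
B <- matrix(rnorm(2*n), ncol = 2)

# Reference: the scalar function from the question in a double loop
g <- function(x, y) {
  theta <- atan((y[2] - x[2]) / (y[1] - x[1]))
  theta + 2*pi*(theta < 0)
}
W_loop <- matrix(NA_real_, m, n)
for (i in 1:m) for (j in 1:n) W_loop[i, j] <- g(A[i, ], B[j, ])

# Vectorized: dx[i, j] = B[j, 1] - A[i, 1], dy[i, j] = B[j, 2] - A[i, 2]
dx <- outer(A[, 1], B[, 1], function(a, b) b - a)
dy <- outer(A[, 2], B[, 2], function(a, b) b - a)
theta <- atan(dy / dx)
W_vec <- theta + 2*pi*(theta < 0)

all.equal(W_loop, W_vec)
```

This trades memory for speed: each intermediate is a full m x n matrix, which for m = n = 10^4 is about 800 MB of doubles, so the row-wise g2 version may still be the better fit for large problems.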


I hope it helps.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm



Re: [R] Need help writing a faster code

2007-02-01 Thread Robin Hankin

Hi



  A <- matrix(runif(10),ncol=2)
  B <- matrix(runif(10),ncol=2)
  g <- function(i4){theta <- atan2( (i4[4]-i4[2]),(i4[3]-i4[1]))
      return(theta + 2*pi*(theta<0))}
  apply(A,1,function(x){apply(B,1,function(y){g(c(x,y))})})
          [,1]      [,2]     [,3]        [,4]      [,5]
[1,] 1.1709326 2.6521457 3.857477 0.219274562 1.2948374
[2,] 1.1770919 4.2109056 4.057313 4.918552967 1.9733967
[3,] 0.9171661 0.6721475 4.193675 0.434253839 0.9781060
[4,] 0.9181475 0.6911804 4.213295 0.455127422 0.9771797
[5,] 1.0467449 4.9263243 3.983248 0.004371504 1.1693707
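
Two things are worth checking before comparing this with the original g. It uses atan2() rather than atan() of the ratio, and the two disagree whenever the x-difference is negative. Also, the nested apply() returns a matrix whose rows follow B, i.e. the transpose of W as defined in the question. A small sketch with made-up points:

```r
# Two opposite directions that share the same slope dy/dx = -1
dx <- c(1, -1)
dy <- c(-1, 1)

wrap <- function(theta) theta + 2*pi*(theta < 0)   # shift negative angles by 2*pi

ang_atan  <- wrap(atan(dy / dx))   # cannot tell the two directions apart
ang_atan2 <- wrap(atan2(dy, dx))   # keeps the quadrant

# Orientation of the nested apply(): rows follow B, columns follow A
A <- matrix(1:6, ncol = 2)     # 3 points
B <- matrix(1:10, ncol = 2)    # 5 points
res <- apply(A, 1, function(x) apply(B, 1, function(y) sum(x) + sum(y)))
dim(res)   # 5 3, i.e. n x m
```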
 



HTH

rksh



--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743



Re: [R] read.spss and encodings

2007-02-01 Thread Thomas Lumley

On Thu, 1 Feb 2007, Thomas Friedrichsmeier wrote:

 Hi!

 I'm having trouble with importing spss files containing non-ascii characters

Peter has explained what is going on.  It would be ideal for read.spss() 
to do the translation to the current locale. This would require knowing 
what encoding the SPSS file is using.  I think it is always a one-byte 
encoding and in your case it is apparently Latin-1, but I don't know if 
this is always the case, or how to tell which encoding it uses.
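
In the meantime, one possible workaround is to convert the strings after import with iconv(). A minimal sketch, assuming the file really is Latin-1 (the sample string below is made up and stands in for an imported label):

```r
# A Latin-1 encoded string: "caf" followed by byte 0xE9 (e-acute in Latin-1)
raw_label <- "caf\xe9"

# Convert from the assumed source encoding to UTF-8
label_utf8 <- iconv(raw_label, from = "latin1", to = "UTF-8")

utf8ToInt(label_utf8)   # code points: 99 97 102 233
```

The same call could be applied to each character column, and to the levels of each factor, of the imported data frame.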

-thomas


Re: [R] xtable and column headings

2007-02-01 Thread Richard M. Heiberger

I would use latex() in the Hmisc package.  Here is a short example:

tmp <- matrix(1:12,4)
library(Hmisc)
tmp.latex <- latex(tmp, colheads=c("abc$_1$","def$^{12}_4$","$g\\times h$"))
## note the escaped \ in the above colheads vector
print.default(tmp.latex)

Copy the contents of the file referenced in tmp.latex to your
real myfile.tex file.

There are about a zillion optional arguments to latex() that give you very fine
control over the appearance of the typeset object.  See ?latex


[R] [lattice] levelplot for 2D density plots

2007-02-01 Thread Sebastian Weber

Hello all,

I'm trying to use the lattice function levelplot and cannot adapt it to
my tastes concerning colors:

library(MASS)     # kde2d(), con2tr()
library(lattice)  # levelplot()

dens <- data.frame(x=c(), y=c(), z=c(), run=c())
for(l in levels(degCorrel$run)) {
  ind <- degCorrel$run == l
  dk <- kde2d(log10(degCorrel$correlFunc[ind]),
              log10(degCorrel$correlFunc.ref[ind]), n=50)
  dt <- cbind(con2tr(dk), run=l)
  dt$z <- dt$z/sum(dt$z)
  dens <- rbind(dens, dt)
}
dens$run <- factor(dens$run)

levelplot(z ~ x *y | run, data=dens)

However, I need to adjust the cuts for every panel differently since the
scales are very different. I know, that this is not a very good
practice, but anyway, how can I do it?

Any help is greatly appreciated. Thanks in advance,

Sebastian


[R] indexing without looping

2007-02-01 Thread javier garcia-pintado

Hello,
I've got a data.frame like this:

  assignation <- data.frame(value=c(6.5,7.5,8.5,12.0),class=c(1,3,5,2))
  assignation

  value class
1   6.5     1
2   7.5     3
3   8.5     5
4  12.0     2

and a long vector of classes like this:

  x <- c(1,1,2,7,6,5,4,3,2,2,2...)

I would like to obtain a vector of length = length(x), with the
corresponding values extracted from the assignation table, like this:

  x.value

 [1]  6.5  6.5 12.0   NA   NA  8.5   NA  7.5 12.0 12.0 12.0

Could you help me with an elegant way to do this?
(I can only do it by looping over each class in the assignation table,
which I think is not ideal in the R sense.)

Wishes,
Javier

-- 
Javier García-Pintado
Institute of Earth Sciences Jaume Almera (CSIC)
Lluis Sole Sabaris s/n, 08028 Barcelona
Phone: +34 934095410
Fax:   +34 934110012
e-mail:[EMAIL PROTECTED] 


Re: [R] Need help writing a faster code

2007-02-01 Thread Ravi Varadhan

Thank you, Dimitris and Robin.  

Dimitris - your solution(s) works very well.  Although my g function is a
lot more complicated than that in the simple example that I gave, I think
that I can use your idea of taking the whole matrix inside the function and
working directly with it.

Robin - using two applys doesn't make the code any faster, it just produces
a compact one-liner.

Best,
Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





[R] when i configure the R 2.4.1,i meet the problem

2007-02-01 Thread xiaopeng hu

I get the message:
configure: WARNING: you cannot build info or html versions of the R manuals

How do I deal with it?


Re: [R] indexing without looping

2007-02-01 Thread David Barron

One way would be to use merge, like this:

merge(assignation,data.frame(class=x),all.y=TRUE)

There might well be better ways...
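
One caveat with merge(): by default the result is sorted by the key, so the rows no longer follow the order of x. A sketch of one way around this, carrying a position index through the merge (the column name idx is an assumption introduced here, and x is cut off at the eleven values shown in the question):

```r
assignation <- data.frame(value = c(6.5, 7.5, 8.5, 12.0), class = c(1, 3, 5, 2))
x <- c(1, 1, 2, 7, 6, 5, 4, 3, 2, 2, 2)

# Keep the original position of each element of x
lookup <- data.frame(class = x, idx = seq_along(x))
merged <- merge(assignation, lookup, all.y = TRUE)

# Restore the order of x; classes missing from the table give NA
x.value <- merged$value[order(merged$idx)]
x.value
```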





-- 
=
David Barron
Said Business School
University of Oxford
Park End Street
Oxford OX1 1HP


Re: [R] indexing without looping

2007-02-01 Thread Ido M. Tamir

Hi,

xt <- assignation$value[match(x,assignation$class)]

HTH
ido


Re: [R] indexing

2007-02-01 Thread Tony Plate

  a <- data.frame(value=c(6.5,7.5,8.5,12.0),class=c(1,3,5,2))
  x <- c(1,1,2,7,6,5,4,3,2,2,2)
  match(x, a$class)
  [1]  1  1  4 NA NA  3 NA  2  4  4  4
  a[match(x, a$class), "value"]
  [1]  6.5  6.5 12.0   NA   NA  8.5   NA  7.5 12.0 12.0 12.0
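
An equivalent lookup, shown here only as an alternative sketch, indexes a named vector by the class converted to character. Classes missing from the table come back as NA, just as with match():

```r
a <- data.frame(value = c(6.5, 7.5, 8.5, 12.0), class = c(1, 3, 5, 2))
x <- c(1, 1, 2, 7, 6, 5, 4, 3, 2, 2, 2)

# match() gives, for each element of x, its row in the table (or NA)
by_match <- a$value[match(x, a$class)]

# Named-vector lookup: names are the classes, elements are the values
tab <- setNames(a$value, a$class)
by_name <- unname(tab[as.character(x)])

identical(by_match, by_name)
```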
 

-- Tony Plate


Re: [R] Need help writing a faster code

2007-02-01 Thread Ravi Varadhan

Dear Dimitris,

I implemented your solution on my actual problem.  I was able to generate my
large transition matrix in 56 seconds, compared to the previous time of
around 27 minutes.  Wow!!!

I thank you very much for the help.  R and the R-user group are truly
amazing!

Best regards,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 






Re: [R] when i configure the R 2.4.1,i meet the problem

2007-02-01 Thread Roland Rau

On 2/1/07, xiaopeng hu [EMAIL PROTECTED] wrote:

 i get the message :
 configure: WARNING: you cannot build info or html versions of the R
 manuals

 how to deal with it ?


Did you read the manual 'R Installation and Administration'?
See section 2.2 there.
It mentions that you need makeinfo version 4.7 or later for the
manuals in info format.
(No, this is not circular; you don't have to build the manuals first to be
able to read them. Check www.r-project.org -> Manuals.)

I guess several other things are not available on your computer either. For
example, section 2.1 of the aforementioned manual points out that you need
Perl 5.

Hope this helps,
Roland


[R] would you please navigate me?

2007-02-01 Thread a.eslami

  
 
Hello 
   My name is Aida Eslami. I am an M.Sc. student of statistics at Shahid
Beheshti University, Tehran, Iran. The subject of my thesis is the analysis of
masked data. I have some problems writing my program (optimization).
Could you please guide me and point me to someone or something that could help?
Thank you.

 

   Yours sincerely

Aida Eslami
 




[R] Losing factor levels when moving variables from one context to another

2007-02-01 Thread Michael Rennie


Hi, there

I'm currently trying to figure out how to keep my factor levels for a 
variable when moving it from one data frame or matrix to another.

Example below:

vec1 <- rep(10,5)
vec2 <- rep(30,5)
vec3 <- rep(80,5)
vecs <- c(vec1, vec2, vec3)

resp <- rnorm(15,2)

dat <- as.data.frame(cbind(resp, vecs))
dat$vecs <- factor(dat$vecs)
dat

R returns:
                  resp vecs
1     1.57606068767956   10
2     2.30271782269308   10
3     2.39874788444542   10
4    0.963987738423353   10
5     2.03620782454740   10
6  -0.0706713324725649   30
7     1.49001721222926   30
8     2.00587718501980   30
9    0.450576585429981   30
10    2.87120375367357   30
11    2.25575058079324   80
12    2.03471288724508   80
13    2.67432066972984   80
14    1.74102136279177   80
15    2.29827581276955   80

and now:

newvar <- rnorm(15,4)
newdat <- as.data.frame(cbind(newvar, dat$vecs))
newdat

R returns:

   newvar V2
1  4.300788  1
2  5.295951  1
3  5.099849  1
4  3.211045  1
5  3.703554  1
6  3.693826  2
7  5.314679  2
8  4.70  2
9  3.534515  2
10 4.037401  2
11 4.476808  3
12 4.842449  3
13 3.109677  3
14 4.752961  3
15 4.445216  3
 

I seem to have lost everything I once had associated with vecs, and it has
turned my actual values into arbitrary group codes.

I assume this has something to do with the behaviour of factors? Does 
anyone have any suggestions on how to get my original levels, etc., back?

Cheers,

Mike

Michael Rennie
Ph.D. Candidate, University of Toronto at Mississauga
3359 Mississauga Rd. N.
Mississauga, ON  L5L 1C6
Ph: 905-828-5452  Fax: 905-828-3792
www.utm.utoronto.ca/~w3rennie


Re: [R] Losing factor levels when moving variables from one context to another

2007-02-01 Thread Chuck Cleland


  It has more to do with the behavior of cbind().  Construct the data
frame with data.frame() rather than the combination of as.data.frame()
and cbind().  For example:

vec1 <- (rep(10,2))
vec2 <- (rep(30,2))
vec3 <- (rep(80,2))
vecs <- c(vec1, vec2, vec3)
resp <- rnorm(6,2)

dat <- data.frame(resp, vecs)
dat$vecs <- factor(dat$vecs)
dat
      resp vecs
1 2.795851   10
2 3.673296   10
3 1.731921   30
4 1.172945   30
5 2.427164   80
6 1.470758   80

newvar <- (rnorm(6,4))
newdat <- data.frame(newvar, dat$vecs)
newdat
    newvar dat.vecs
1 6.389386       10
2 3.453535       10
3 3.807821       30
4 6.067712       30
5 4.978724       80
6 3.015975       80

?data.frame


-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894


Re: [R] Losing factor levels when moving variables from one context to another

2007-02-01 Thread Marc Schwartz

On Thu, 2007-02-01 at 12:13 -0500, Michael Rennie wrote:
> I assume this has something to do with the behaviour of factors? Does 
> anyone have any suggestions on how to get my original levels, etc., back?

Mike,

The problem (specific to your example) is that you are using
as.data.frame() and cbind(), which will first coerce the columns to a
common data type, create a matrix and then coerce the matrix to a
dataframe.

Thus, in the second case, your factor dat$vecs is first being coerced to
its underlying integer codes, rather than being retained as a factor,
since a matrix can contain only one data type and the first column is
numeric.

Try this instead:

vec1 <- (rep(10, 5))
vec2 <- (rep(30, 5))
vec3 <- (rep(80, 5))
vecs <- c(vec1, vec2, vec3)

set.seed(1)
resp <- rnorm(15, 2)

dat <- data.frame(resp, vecs)
dat$vecs <- factor(dat$vecs)

> str(dat)
'data.frame':   15 obs. of  2 variables:
 $ resp: num  1.37 2.18 1.16 3.60 2.33 ...
 $ vecs: Factor w/ 3 levels "10","30","80": 1 1 1 1 1 2 2 2 2 2 ...


set.seed(2)
newvar <- rnorm(15, 4)
newdat <- data.frame(newvar, dat$vecs)

> str(newdat)
'data.frame':   15 obs. of  2 variables:
 $ newvar  : num  3.10 4.18 5.59 2.87 3.92 ...
 $ dat.vecs: Factor w/ 3 levels "10","30","80": 1 1 1 1 1 2 2 2 2 2 ...

> all(levels(newdat$dat.vecs) == levels(dat$vecs))
[1] TRUE
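
For readers who already have a matrix of integer codes, here is a minimal sketch of the coercion described above and one way to undo it. It uses a small made-up factor rather than Mike's data, and the names f, m, codes, recovered and as_numbers are illustrative only.

```r
# A small factor with the same levels as in the thread (illustrative data).
f <- factor(c(10, 10, 30, 30, 80, 80))

# cbind() builds a matrix, which holds one type only, so the factor
# is stored as its integer codes 1, 2, 3 rather than its labels.
m <- cbind(newvar = 1:6, f)
codes <- m[, "f"]

# The labels can be recovered by indexing the original levels with the codes.
recovered <- factor(levels(f)[codes], levels = levels(f))

# To get the numeric values behind the labels, go through character first;
# as.numeric(f) on its own would return the codes again.
as_numbers <- as.numeric(as.character(f))
```

This only works while the original factor (or at least its levels) is still available; the matrix itself no longer carries the labels.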


BTW, there may very well be times when you are combining two factors
together and need to ensure that the factor levels either are
intentionally different or need to relevel the combined factors into
common levels. See the Warning and other information in ?factor. This
would be critical, for example, if you are combining data sets to then
run modeling functions on the combined data sets.
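
One hedged way to combine two factors onto a common set of levels is to go through character vectors, which does not depend on how c() treats factors in a given R version. The factors a and b below are made-up examples, not data from this thread.

```r
# Two illustrative factors with overlapping but different levels.
a <- factor(c("10", "30"))
b <- factor(c("30", "80"))

# Build the common level set explicitly, then re-create one factor
# from the character labels so that no codes are silently reused.
all_levels <- union(levels(a), levels(b))
combined <- factor(c(as.character(a), as.character(b)), levels = all_levels)
```

Setting the levels explicitly also fixes their order, which matters for contrasts in modeling functions.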

HTH,

Marc Schwartz


Re: [R] would you please navigate me?

2007-02-01 Thread Ben Bolker

a.eslami <a.eslami at Mail.sbu.ac.ir> writes:

> Hello 
> My name is Aida Eslami. I am an M.Sc. student of statistics at Shahid
> Beheshti University, Tehran, Iran. The subject of my thesis is Analysis
> of Masked Data. I have some problems in writing my program
> (optimization). Would you please guide me and introduce someone to help me? 
> Thank you. 
>
> Yours sincerely
>
> Aida Eslami


  I'm sorry, but we can only answer _specific_ questions about R on
this mailing list; there are too many deserving students all over the
world for us to help them all.  You should try to get enough help
from someone at your local institution to get you to the point where
you can formulate a specific question about R; failing that, you will
have to struggle with the R documentation on your own until you can
get to that point.  As it says at the bottom of every e-mail to the
list, please read the posting guide as well ...


[R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)

2007-02-01 Thread Jeffrey Racine

Greetings.

For R gurus this may be a no brainer, but I could not find pointers to
efficient computation of this beast in past help files.

Background - I wish to implement a Cramer-von Mises type test statistic
which involves double sums of max(X_i,Y_j) where X and Y are vectors of
differing length.

I am currently using ifelse pointwise in a vector, but have a nagging
suspicion that there is a more efficient way to do this. Basically, I
require three sums:

sum1: \sum_i\sum_j max(X_i,X_j)
sum2: \sum_i\sum_j max(Y_i,Y_j)
sum3: \sum_i\sum_j max(X_i,Y_j)

Here is my current implementation - any pointers to more efficient
computation greatly appreciated.

  nx <- length(x)
  ny <- length(y)

  sum1 <- 0
  sum3 <- 0

  for(i in 1:nx) {
    sum1 <- sum1 + sum(ifelse(x[i]>x,x[i],x))
    sum3 <- sum3 + sum(ifelse(x[i]>y,x[i],y))
  }

  sum2 <- 0
  sum4 <- sum3 # symmetric and identical

  for(i in 1:ny) {
    sum2 <- sum2 + sum(ifelse(y[i]>y,y[i],y))
  }

Thanks in advance for your help.

-- Jeff

-- 
Professor J. S. Racine         Phone:  (905) 525 9140 x 23825
Department of Economics        FAX:    (905) 521-8232
McMaster University            e-mail: [EMAIL PROTECTED]
1280 Main St. W., Hamilton,    URL:    http://www.economics.mcmaster.ca/racine/
Ontario, Canada. L8S 4M4

`The generation of random numbers is too important to be left to chance'


Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)

2007-02-01 Thread Ravi Varadhan

Jeff,

Here is something which is a little faster:

sum1 <- sum(outer(x, x, FUN=pmax))
sum3 <- sum(outer(x, y, FUN=pmax))

Best,
Ravi.


---
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: [EMAIL PROTECTED]
Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)

2007-02-01 Thread Benilton Carvalho

Well, a reproducible example would be nice =)

not tested:

x = rnorm(10)
y = rnorm(20)
mymax <- function(t1, t2) apply(cbind(t1, t2), 1, max)
sum(outer(x, y, mymax))

is this sth like what you need?

b


Re: [R] R for bioinformatics

2007-02-01 Thread Seth Falcon

Benoit Ballester [EMAIL PROTECTED] writes:

> Hi,
>
> I was wondering if someone could tell me more about this book, (if it's 
> a good or bad one).
> I can't find it, as it seems that O'Reilly doesn't publish any more.

I've never seen a copy so I can't comment about its quality (has
anyone seen a copy?).

You might want to take a look at _Bioinformatics and Computational
Biology Solutions Using R and Bioconductor_.

http://www.bioconductor.org/pub/docs/mogr/

+ seth


[R] Upcoming Course R/Splus Fundamentals and Programming Techniques In Washington DC, San Francisco and Princeton

2007-02-01 Thread elvis

XLSolutions Corporation (www.xlsolutions-corp.com) is proud to
announce our R/S-plus Fundamentals and Programming Techniques :
www.xlsolutions-corp.com/Rfund.htm

*** Washington DC / March 1-2, 2007
*** San Francisco / March 15-16, 2007
*** Princeton / Week of Feb 26  (dates coming soon)

Should we bring this course to your city? please let us know!

Interested in R/Splus Advanced course? email us.

Reserve your seat now at the early bird rates! Payment due AFTER
the class

Course Description:

This two-day beginner to intermediate R/S-plus course focuses on a
broad spectrum of topics, from reading raw data to a comparison of R
and S. We will learn the essentials of data manipulation, graphical
visualization and R/S-plus programming. We will explore statistical
data analysis tools, including graphics with data sets, and cover how
to enhance your plots, build your own packages (libraries) and connect
via ODBC, etc.
We will perform some statistical modeling and fit linear regression
models. Participants are encouraged to bring data for interactive
sessions.

With the following outline:

- An Overview of R and S
- Data Manipulation and Graphics
- Using Lattice Graphics
- A Comparison of R and S-Plus
- How can R Complement SAS?
- Writing Functions
- Avoiding Loops
- Vectorization
- Statistical Modeling
- Project Management
- Techniques for Effective use of R and S
- Enhancing Plots
- Using High-level Plotting Functions
- Building and Distributing Packages (libraries)
- Connecting; ODBC, Rweb, Orca via sockets and via Rjava


Email us for group discounts.
Email Sue Turner: [EMAIL PROTECTED]
Phone: 206-686-1578
Visit us: www.xlsolutions-corp.com/training.htm
Please let us know if you and your colleagues are interested in this
class to take advantage of group discount. Register now to secure your
seat!


Cheers,
Elvis Miller, PhD
Manager Training.
XLSolutions Corporation
206 686 1578
www.xlsolutions-corp.com
[EMAIL PROTECTED]


Re: [R] indexing without looping

2007-02-01 Thread Greg Johnson

javier garcia-pintado <jgarcia at ija.csic.es> writes:

> Hello,
> I've got a data.frame like this:
>
> > assignation <- data.frame(value=c(6.5,7.5,8.5,12.0),class=c(1,3,5,2))
> > assignation
>   value class
> 1   6.5     1
> 2   7.5     3
> 3   8.5     5
> 4  12.0     2
>
> and a long vector of classes like this:
>
> > x <- c(1,1,2,7,6,5,4,3,2,2,2...)
>
> And would like to obtain a vector of length = length(x), with the
> corresponding values extracted from the assignation table. Like this:
>
> > x.value
>  [1]  6.5  6.5 12.0   NA   NA  8.5   NA  7.5 12.0 12.0 12.0
>
> Could you help me with an elegant way to do this?
> (I can only do it by looping over each class in the assignation table,
> which I think is not ideal in R's sense.)
>
> Wishes,
> Javier

Javier,

you might try this:
> assignation <- data.frame(value=c(6.5,7.5,8.5,12.0),class=c(1,3,5,2))
> assignation
  value class
1   6.5     1
2   7.5     3
3   8.5     5
4  12.0     2

> x <- c(1,1,2,7,6,5,4,3,2,2,2)
> x
 [1] 1 1 2 7 6 5 4 3 2 2 2

> merge( x, assignation, by.x=1, by.y=2, all.x=T )
   x value
1  1   6.5
2  1   6.5
3  2  12.0
4  2  12.0
5  2  12.0
6  2  12.0
7  3   7.5
8  4    NA
9  5   8.5
10 6    NA
11 7    NA
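
Note that merge() returns the rows sorted by the key, so the result above is not in the order of x. If the original order matters, a lookup with match() may be closer to what Javier asked for. This is a sketch of an alternative, not part of Greg's reply.

```r
assignation <- data.frame(value = c(6.5, 7.5, 8.5, 12.0),
                          class = c(1, 3, 5, 2))
x <- c(1, 1, 2, 7, 6, 5, 4, 3, 2, 2, 2)

# match() gives, for each element of x, the row of 'assignation' whose
# class equals it (NA when there is none); indexing 'value' with that
# result keeps the order and length of x.
x.value <- assignation$value[match(x, assignation$class)]
x.value
```

Classes in x that are missing from the table come back as NA, as in the desired output quoted above.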


Re: [R] R for bioinformatics

2007-02-01 Thread Marc Schwartz

On Thu, 2007-02-01 at 10:45 -0800, Seth Falcon wrote:
> Benoit Ballester [EMAIL PROTECTED] writes:
> 
> > Hi,
> >
> > I was wondering if someone could tell me more about this book, (if it's 
> > a good or bad one).
> > I can't find it, as it seems that O'Reilly doesn't publish any more.
> 
> I've never seen a copy so I can't comment about its quality (has
> anyone seen a copy?).
> 
> You might want to take a look at _Bioinformatics and Computational
> Biology Solutions Using R and Bioconductor_.
> 
> http://www.bioconductor.org/pub/docs/mogr/

I'll stand (or sit) to be corrected on this as I cannot find the source,
but I have a recollection from seeing something quite some time ago that
the book may have never been published.

It is no longer listed on Amazon.com (USA), but here is a listing on UK:

http://www.amazon.co.uk/R-Bioinformatics-Kimberley-Seefeld/dp/059600544X

I located a posting from Kim Seefeld (one of the authors) on a usenet
group from back in 2003. Her e-mail then was listed as:

  [EMAIL PROTECTED]

You might want to drop her a line if the e-mail is still valid.

HTH,

Marc Schwartz


Re: [R] Help with efficient double sum of max (X_i, Y_i) (X Y vectors)

2007-02-01 Thread Achim Zeileis

Jeff,

you can do

> sum1: \sum_i\sum_j max(X_i,X_j)
> sum2: \sum_i\sum_j max(Y_i,Y_j)

sum(x * (2 * rank(x) - 1))

> sum3: \sum_i\sum_j max(X_i,Y_j)

sum(outer(x, y, pmax))

Probably, the latter can be speeded up even more...
Z
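
A quick check of the rank identity, plus one possible way to avoid the full outer() matrix for the cross sum. The identity holds because the k-th smallest value is the maximum in exactly 2k - 1 of the ordered pairs. The helper sum_max_xy and the simulated x and y are assumptions for illustration, and the sketch assumes no missing values.

```r
set.seed(42)
x <- rnorm(200)
y <- rnorm(300)

# Same-vector double sum: rank formula versus the full n-by-n matrix.
sum1_rank  <- sum(x * (2 * rank(x) - 1))
sum1_outer <- sum(outer(x, x, pmax))

# Cross sum without building the n-by-m matrix (hypothetical helper).
sum_max_xy <- function(x, y) {
  ys <- sort(y)
  cs <- c(0, cumsum(ys))       # cs[k + 1] is the sum of the k smallest y
  k  <- findInterval(x, ys)    # number of y values <= each x
  # For each x[i]: k[i] pairs contribute x[i]; the remaining pairs
  # contribute the larger y values, whose total is sum(y) - cs[k + 1].
  sum(k * x + (cs[length(cs)] - cs[k + 1]))
}
sum3_fast  <- sum_max_xy(x, y)
sum3_outer <- sum(outer(x, y, pmax))
```

The same rank formula applied to y gives sum2. The sort-based version needs memory proportional to length(x) + length(y) rather than their product, which may matter for long vectors.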


[R] plot.function with xlim, bug?

2007-02-01 Thread Prasenjit Kapat

Consider the following lines of code:

 plot(function(x) sin(cos(x)*exp(-x/2)), from=-8,to=7,xlim=c(-5,5))
Uses integral points (integers from -5 to 5) to draw the plot, instead
of the usual default of n= 101 equally spaced points (from
?plot.function).

plot(function(x) sin(cos(x)*exp(-x/2)), from=-8,to=7,n=101,xlim=c(-5,5))
Gives the following error:
Error in add && par("xlog") : invalid 'x' type in 'x && y'

Any explanations? The following modification, in the plot.R by NOT
passing 'y' to plot.function, seems to fix both of the above problems
! I am sure to be missing something!

plot2 <- function (x, y, ...)
{
    if (is.null(attr(x, "class")) && is.function(x)) {
        nms <- names(list(...))
        ## need to pass 'y' to plot.function() when positionally matched
        if(missing(y)) # set to defaults {could use formals(plot.default)}:
            y <- { if (!"from" %in% nms) 0 else
                   if (!"to"   %in% nms) 1 else
                   if (!"xlim" %in% nms) NULL }
        if ("ylab" %in% nms)
            plot.function(x, ...)
        else
            plot.function(x, ylab=paste(deparse(substitute(x)),"(x)"), ...)
    }
    else UseMethod("plot")
}
---
version.string R version 2.4.1 (2006-12-18)
platform   i486-pc-gnu-linux
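
One possible reading of both symptoms is R's argument matching: an unnamed NULL passed along with named from, to and xlim has no formal left to fill, so it travels through '...' and lands in the next free formal further down. The sketch below only illustrates that mechanism; inner and outer_fn are hypothetical stand-ins, not the actual plot.function() or curve() definitions in R 2.4.1.

```r
# Hypothetical stand-ins used only to show how arguments are matched.
inner <- function(expr, from = 0, to = 1, n = 101, add = FALSE) {
  list(n = n, add = add)
}
outer_fn <- function(x, from = 0, to = 1, xlim = NULL, ...) {
  inner(x, from, to, ...)
}

# 'from', 'to' and 'xlim' are matched by name, so the unnamed NULL is
# collected by '...' and then fills 'n' positionally inside inner().
r1 <- outer_fn(sin, NULL, from = -8, to = 7, xlim = c(-5, 5))

# With 'n' also named, the stray NULL moves on to the next free formal, 'add'.
r2 <- outer_fn(sin, NULL, from = -8, to = 7, n = 101, xlim = c(-5, 5))

str(r1)
str(r2)
```

In this toy version the first call loses n and the second ends up with add set to NULL, which would at least be consistent with the missing default of 101 points and with an error raised on `add && ...`.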


[R] Lining up x-y datasets based on values of x

2007-02-01 Thread Christos Hatzis

Hi,

I was wondering if there is a direct approach for lining up 2-column
matrices according to the values of the first column.  An example and a
brute-force approach is given below:

x <- cbind(1:10, runif(10))
y <- cbind(5:14, runif(10))
z <- cbind((-4):5, runif(10))

xx <- seq( min(c(x[,1],y[,1],z[,1])), max(c(x[,1],y[,1],z[,1])), 1)
w <- cbind(xx, matrix(rep(0, 3*length(xx)), ncol=3)) 

w[ xx >= x[1,1] & xx <= x[10,1], 2 ] <- x[,2]
w[ xx >= y[1,1] & xx <= y[10,1], 3 ] <- y[,2]
w[ xx >= z[1,1] & xx <= z[10,1], 4 ] <- z[,2]

w 

I appreciate any pointers.

Thanks.
 
Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com


Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Marc Schwartz

On Thu, 2007-02-01 at 15:05 -0500, Christos Hatzis wrote:
> I was wondering if there is a direct approach for lining up 2-column
> matrices according to the values of the first column.  An example and a
> brute-force approach is given below:

How about this:

x <- cbind(1:10, runif(10))
y <- cbind(5:14, runif(10))
z <- cbind((-4):5, runif(10))

colnames(x) <- c("X", "Y")
colnames(y) <- c("X", "Y")
colnames(z) <- c("X", "Y")

xy <- merge(x, y, by = "X", all = TRUE)
xyz <- merge(xy, z, by = "X", all = TRUE)

xyz[is.na(xyz)] <- 0

> xyz
    X       Y.x       Y.y         Y
1  -4 0.0000000 0.0000000 0.3969099
2  -3 0.0000000 0.0000000 0.8943127
3  -2 0.0000000 0.0000000 0.4882819
4  -1 0.0000000 0.0000000 0.0275787
5   0 0.0000000 0.0000000 0.7562341
6   1 0.6873130 0.0000000 0.6185218
7   2 0.1930880 0.0000000 0.2318025
8   3 0.1164783 0.0000000 0.7336057
9   4 0.7408532 0.0000000 0.3006347
10  5 0.7112887 0.6383823 0.8515126
11  6 0.2719079 0.5952721 0.0000000
12  7 0.2067017 0.8178048 0.0000000
13  8 0.2085043 0.5714917 0.0000000
14  9 0.2251435 0.4032660 0.0000000
15 10 0.3471888 0.5247478 0.0000000
16 11 0.0000000 0.6899197 0.0000000
17 12 0.0000000 0.7188912 0.0000000
18 13 0.0000000 0.9133252 0.0000000
19 14 0.0000000 0.9186001 0.0000000

Note that 'xyz' will be a data frame, so just use as.matrix(xyz) to
coerce back to a numeric matrix if needed.

See ?merge

HTH,

Marc Schwartz


Re: [R] R for bioinformatics

2007-02-01 Thread Peter Dalgaard

Marc Schwartz wrote:
> On Thu, 2007-02-01 at 10:45 -0800, Seth Falcon wrote:
>> Benoit Ballester [EMAIL PROTECTED] writes:
>>
>>> Hi,
>>>
>>> I was wondering if someone could tell me more about this book, (if it's 
>>> a good or bad one).
>>> I can't find it, as it seems that O'Reilly doesn't publish any more.
>>
>> I've never seen a copy so I can't comment about its quality (has
>> anyone seen a copy?).
>>
>> You might want to take a look at _Bioinformatics and Computational
>> Biology Solutions Using R and Bioconductor_.
>>
>> http://www.bioconductor.org/pub/docs/mogr/
>
> I'll stand (or sit) to be corrected on this as I cannot find the source,
> but I have a recollection from seeing something quite some time ago that
> the book may have never been published.

It's been a while since the status was something along the lines that 
the authors may or may not complete it. Subject matter moving faster 
than pen, I suspect

-p


[R] Can this loop be delooped?

2007-02-01 Thread Talbot Katz

Hi.

I have the following code in a loop.  It splits a vector into subvectors of 
equal size.  But if the size of the original vector is not an exact multiple 
of the desired subvector size, then the first few subvectors have one more 
element than the last few.  I know that the cut function could be used to 
determine where to break up the vector, but it doesn't seem to provide 
control over where to put the larger and smaller subvectors.

numgp1_v=sidect_v%/%compmin
numgroup_v[small]=max(1,numgp1_v[small])
sidemin_v=sidect_v%/%numgroup_v
nummax_v=sidect_v%%sidemin_v
eix=0
smallindexlist<-list(NULL)
for(i in 1:numgroup_v[small])   {
    bix=eix+1
    eix=bix+sidemin_v[small]+(i<=nummax_v[small])-1
    smallindexlist[[i]]<-dlpo_sm_v[bix:eix]
}

The key fact is that smallindexlist is a list, each list element is a 
subvector of dlpo_sm_v of the proper size.  The sizes may be different.


I tried to see whether I could eliminate the loop, as follows.  First I 
defined a function:

intgpi  <-  function(totalength,numgroups,groupnum,place="LEFT"){
#   function to split the integer sequence, 1:totalength, into the groupnum 
#   group out of numgroups groups of equal size, totalength%/%numgroups
#   there are totalength%%numgroups number of groups of length 
#   1+totalength%/%numgroups, with the large groups all to one side, left if 
#   place="LEFT"
#   totalength >= numgroups >= groupnum all integers, or it won't work right
    if(charmatch(toupper(place),"RIGHT",nomatch=FALSE)==1){
        extra1_1=max((groupnum-1)+((totalength%%numgroups)-numgroups),0)
        extra1_2=(groupnum>numgroups-totalength%%numgroups)
    }
    else{
        extra1_1=min(totalength%%numgroups,groupnum-1)
        extra1_2=(groupnum<=totalength%%numgroups)
    }
    gsize=totalength%/%numgroups
    gleft=((groupnum-1)*gsize)+extra1_1+1
    gright=gleft+gsize+extra1_2-1
    gleft:gright
}


The function appears to work okay.  Then I used it as follows:

numgp1_v=sidect_v%/%compmin
numgroup_v[small]=max(1,numgp1_v[small])
smallindexlist<-list(NULL)
smallindexlist=sapply(1:numgroup_v[small],function(i){dlpo_sm_v[intgpi(sidect_v[small],numgroup_v[small],i)]})

In this case, smallindexlist will be a list like I had before if the 
subvectors are not all the same size, but if the subvectors are all the same 
size, it appears that I get an array.  Can I force this operation to give me 
a list the way I want it in all cases?  Or is there a better way to deloop 
my original code?
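
On the list-versus-array point: sapply() simplifies results of equal length into a matrix, whereas lapply() (or sapply(..., simplify=FALSE)) always returns a list. The loop can also be avoided with split(). The sketch below is one way to do it under the assumption that the larger chunks go first and that k does not exceed length(v); split_even, v and k are illustrative names, not objects from the code above.

```r
# Split v into k consecutive chunks of nearly equal size; the first
# (length(v) %% k) chunks receive one extra element.
split_even <- function(v, k) {
  n <- length(v)
  sizes <- rep(n %/% k, k) + (seq_len(k) <= n %% k)
  # Label each element with its chunk number, then split on the labels.
  unname(split(v, rep(seq_len(k), times = sizes)))
}

uneven <- split_even(1:11, 3)   # chunk sizes 4, 4, 3
even   <- split_even(1:12, 3)   # still a list, even though sizes are equal
```

To put the larger chunks last instead, the size vector could be reversed with rev(sizes) before building the labels.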

Thanks!

--  TMK  --
212-460-5430    home
917-656-5351    cell


Re: [R] Problems installing R-2.4.1 on Solaris 11 x-86 from source: error in gmake after successful configure

2007-02-01 Thread Prof Brian Ripley

What is 'Solaris 11'?  According to www.sun.com, the latest Solaris 
version is 10, and my sysadmins have not heard of Solaris 11.

You seem to be missing the Solaris compilation tools, ar in this case.
In Solaris <= 10 they are in /usr/ccs/bin, not in the path by default.

On Wed, 31 Jan 2007, Octavio Tourinho wrote:

> Dear friends,
> I am trying to install R-2.4.1 from source on Solaris 11 x-86. 64 bits,

There is 32-bit x86 and 64-bit amd64 or x86_64.

> running on Sun Ultra-20 workstation, and using the SunStudio 11 compilers.
> I was able to configure R correctly, but received an error in gmake, 
> apparently related to bzip2 which I have been unable to debug.
> The messages are listed below.
> The configure.log and configure.status files are attached.
>
> Any help would be sincerely appreciated.
>
> Octavio Tourinho

> =
> R is now configured for i386-pc-solaris2.11
>
>   Source directory:          .
>   Installation directory:    /usr/local
>
>   C compiler:                gcc -std=gnu99 -D__NO_MATH_INLINES -g -O2
>   Fortran 77 compiler:       g77  -g -O2
>
>   C++ compiler:              g++  -g -O2
>   Fortran 90/95 compiler:    f95 -g
>
>   Interfaces supported:      X11, tcltk
>   External libraries:        readline
>   Additional capabilities:   PNG, JPEG, NLS
>   Options enabled:           shared BLAS, R profiling
>
>   Recommended packages:      yes
>
> configure: WARNING: you cannot build DVI versions of the R manuals
> configure: WARNING: you cannot build PDF versions of the R manuals
> # gmake
> gmake[1]: Entering directory `/usr/local/R-2.4.1/m4'
> gmake[1]: Nothing to be done for `R'.
> gmake[1]: Leaving directory `/usr/local/R-2.4.1/m4'
> gmake[1]: Entering directory `/usr/local/R-2.4.1/tools'
> gmake[1]: Nothing to be done for `R'.
> gmake[1]: Leaving directory `/usr/local/R-2.4.1/tools'
> gmake[1]: Entering directory `/usr/local/R-2.4.1/doc'
> gmake[2]: Entering directory `/usr/local/R-2.4.1/doc/html'
> gmake[3]: Entering directory `/usr/local/R-2.4.1/doc/html/search'
> gmake[3]: Leaving directory `/usr/local/R-2.4.1/doc/html/search'
> gmake[2]: Leaving directory `/usr/local/R-2.4.1/doc/html'
> gmake[2]: Entering directory `/usr/local/R-2.4.1/doc/manual'
> gmake[2]: Nothing to be done for `R'.
> gmake[2]: Leaving directory `/usr/local/R-2.4.1/doc/manual'
> gmake[1]: Leaving directory `/usr/local/R-2.4.1/doc'
> gmake[1]: Entering directory `/usr/local/R-2.4.1/etc'
> gmake[1]: Leaving directory `/usr/local/R-2.4.1/etc'
> gmake[1]: Entering directory `/usr/local/R-2.4.1/share'
> gmake[1]: Leaving directory `/usr/local/R-2.4.1/share'
> gmake[1]: Entering directory `/usr/local/R-2.4.1/src'
> gmake[2]: Entering directory `/usr/local/R-2.4.1/src/scripts'
> creating src/scripts/R.fe
> gmake[3]: Entering directory `/usr/local/R-2.4.1/src/scripts'
> gmake[3]: Leaving directory `/usr/local/R-2.4.1/src/scripts'
> gmake[2]: Leaving directory `/usr/local/R-2.4.1/src/scripts'
> gmake[2]: Entering directory `/usr/local/R-2.4.1/src/include'
> config.status: creating src/include/config.h
> config.status: src/include/config.h is unchanged
> Rmath.h is unchanged
> gmake[3]: Entering directory `/usr/local/R-2.4.1/src/include/R_ext'
> gmake[3]: Nothing to be done for `R'.
> gmake[3]: Leaving directory `/usr/local/R-2.4.1/src/include/R_ext'
> gmake[2]: Leaving directory `/usr/local/R-2.4.1/src/include'
> gmake[2]: Entering directory `/usr/local/R-2.4.1/src/extra'
> gmake[3]: Entering directory `/usr/local/R-2.4.1/src/extra/blas'
> gmake[4]: Entering directory `/usr/local/R-2.4.1/src/extra/blas'
> gmake[4]: `libRblas.so' is up to date.
> gmake[4]: Leaving directory `/usr/local/R-2.4.1/src/extra/blas'
> gmake[4]: Entering directory `/usr/local/R-2.4.1/src/extra/blas'
> /usr/local/R-2.4.1/lib/libRblas.so is unchanged
> gmake[4]: Leaving directory `/usr/local/R-2.4.1/src/extra/blas'
> gmake[3]: Leaving directory `/usr/local/R-2.4.1/src/extra/blas'
> gmake[3]: Entering directory `/usr/local/R-2.4.1/src/extra/bzip2'
> gmake[4]: Entering directory `/usr/local/R-2.4.1/src/extra/bzip2'
> gmake[4]: Leaving directory `/usr/local/R-2.4.1/src/extra/bzip2'
> gmake[4]: Entering directory `/usr/local/R-2.4.1/src/extra/bzip2'
> rm -f libbz2.a
> false cr libbz2.a blocksort.o bzlib.o compress.o crctable.o decompress.o 
> huffman.o randtable.o
> gmake[4]: *** [libbz2.a] Error 1
> gmake[4]: Leaving directory `/usr/local/R-2.4.1/src/extra/bzip2'
> gmake[3]: *** [R] Error 2
> gmake[3]: Leaving directory `/usr/local/R-2.4.1/src/extra/bzip2'
> gmake[2]: *** [R] Error 1
> gmake[2]: Leaving directory `/usr/local/R-2.4.1/src/extra'
> gmake[1]: *** [R] Error 1
> gmake[1]: Leaving directory `/usr/local/R-2.4.1/src'
> gmake: *** [R] Error 1
 


-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Christos Hatzis

Thanks Marc and Phil.

My dataset actually consists of 50+ individual files, so I will have to do
this one column at a time in a loop...
I might look into SQL and outer joins as an alternative to avoid looping.

Thanks again.
-Christos 

-Original Message-
From: Marc Schwartz [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 01, 2007 3:29 PM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Lining up x-y datasets based on values of x

On Thu, 2007-02-01 at 15:05 -0500, Christos Hatzis wrote:
 Hi,
 
 I was wondering if there is a direct approach for lining up 2-column 
 matrices according to the values of the first column.  An example and 
 a brute-force approach is given below:
 
 x <- cbind(1:10, runif(10))
 y <- cbind(5:14, runif(10))
 z <- cbind((-4):5, runif(10))
 
 xx <- seq( min(c(x[,1],y[,1],z[,1])), max(c(x[,1],y[,1],z[,1])), 1)
 w <- cbind(xx, matrix(rep(0, 3*length(xx)), ncol=3))
 
 w[ xx >= x[1,1] & xx <= x[10,1], 2 ] <- x[,2]
 w[ xx >= y[1,1] & xx <= y[10,1], 3 ] <- y[,2]
 w[ xx >= z[1,1] & xx <= z[10,1], 4 ] <- z[,2]
 
 w
 
 I appreciate any pointers.
 
 Thanks.

How about this:

x <- cbind(1:10, runif(10))
y <- cbind(5:14, runif(10))
z <- cbind((-4):5, runif(10))

colnames(x) <- c("X", "Y")
colnames(y) <- c("X", "Y")
colnames(z) <- c("X", "Y")

xy <- merge(x, y, by = "X", all = TRUE)
xyz <- merge(xy, z, by = "X", all = TRUE)

xyz[is.na(xyz)] <- 0

> xyz
    X       Y.x       Y.y         Y
1  -4 0.0000000 0.0000000 0.3969099
2  -3 0.0000000 0.0000000 0.8943127
3  -2 0.0000000 0.0000000 0.4882819
4  -1 0.0000000 0.0000000 0.0275787
5   0 0.0000000 0.0000000 0.7562341
6   1 0.6873130 0.0000000 0.6185218
7   2 0.1930880 0.0000000 0.2318025
8   3 0.1164783 0.0000000 0.7336057
9   4 0.7408532 0.0000000 0.3006347
10  5 0.7112887 0.6383823 0.8515126
11  6 0.2719079 0.5952721 0.0000000
12  7 0.2067017 0.8178048 0.0000000
13  8 0.2085043 0.5714917 0.0000000
14  9 0.2251435 0.4032660 0.0000000
15 10 0.3471888 0.5247478 0.0000000
16 11 0.0000000 0.6899197 0.0000000
17 12 0.0000000 0.7188912 0.0000000
18 13 0.0000000 0.9133252 0.0000000
19 14 0.0000000 0.9186001 0.0000000

Note that 'xyz' will be a data frame, so just use as.matrix(xyz) to coerce
back to a numeric matrix if needed.

See ?merge

HTH,

Marc Schwartz

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] R for bioinformatics

2007-02-01 Thread Marc Schwartz

On Thu, 2007-02-01 at 21:32 +0100, Peter Dalgaard wrote:
 Marc Schwartz wrote:
  On Thu, 2007-02-01 at 10:45 -0800, Seth Falcon wrote:

  Benoit Ballester [EMAIL PROTECTED] writes:
 
  
  Hi,
 
  I was wondering if someone could tell me more about this book, (if it's 
  a good or bad one).
  I can't find it, as it seems that O'Reilly doesn't publish any more.

  I've never seen a copy so I can't comment about its quality (has
  anyone seen a copy?).
 
  You might want to take a look at _Bioinformatics and Computational
  Biology Solutions Using R and Bioconductor_.
 
  http://www.bioconductor.org/pub/docs/mogr/
  
 
  I'll stand (or sit) to be corrected on this as I cannot find the source,
  but I have a recollection from seeing something quite some time ago that
  the book may have never been published.

 It's been a while since the status was something along the lines that 
 the authors may or may not complete it. Subject matter moving faster 
 than pen, I suspect

Peter, that wording does seem familiar, just cannot recall where I saw
it. Perhaps on the O'Reilly web site, where it is no longer listed.

For confirmation, I called O'Reilly's customer service in Cambridge, MA.
They confirm that the book was indeed cancelled and never published.

No reasons were given.

Regards,

Marc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] time series analysis

2007-02-01 Thread lamack lamack

Does anyone know a good introductory book or tutorial about time series
analysis (time series for a beginner)?

Thank you so much.

John Lamak


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] time series analysis

2007-02-01 Thread Ben Fairbank

John --

Well, as a start, have a look at Modern Applied Statistics with S, by
Venables and Ripley, both of which names you will recognize if you read
this list often.  There is a 30-page chapter on time series (with
suggestions for other readings), obviously geared to S and R, that is a
good jumping-off place.

Ben Fairbank


-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of lamack lamack
Sent: Thursday, February 01, 2007 3:12 PM
To: R-help@stat.math.ethz.ch
Subject: [R] time series analysis

Does anyone know a good introductory book or tutorial about time series 
analysis? (time
series for a beginner).

Thank you so much.

John Lamak


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Marc Schwartz

On Thu, 2007-02-01 at 15:45 -0500, Christos Hatzis wrote:
 Thanks Marc and Phil.
 
 My dataset actually consists of 50+ individual files, so I will have to do
 this one column at a time in a loop...
 I might look into SQL and outer joins as an alternative to avoid looping.
 
 Thanks again.
 -Christos 

If the files conform to some naming convention and/or are all located in
a common sub-directory, you can use list.files() to get the file names
into a vector.  If not, you could use file.choose() interactively.

Then use either a for() loop or sapply() to loop over the filenames,
read them in to data frames using read.table() and merge them together
in the same loop.

When it comes to basic data manipulation like this, loops are not a bad
thing. The overhead of a loop is typically outweighed by the file I/O
and related considerations.
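
A minimal sketch of such a loop, for illustration only. The temporary
directory, the file names (file01.txt, ...) and the column names X and Y are
made up here so that the example is self-contained; substitute your own
directory and read.table() arguments.

```r
# Hypothetical setup: write three small two-column files to a temp directory
dir <- tempfile("spectra")
dir.create(dir)
set.seed(1)
starts <- c(1, 5, -4)
for (i in seq_along(starts)) {
  d <- data.frame(X = starts[i] + 0:9, Y = runif(10))
  write.table(d, file.path(dir, sprintf("file%02d.txt", i)), row.names = FALSE)
}

# Collect the file names, then read and merge them one at a time
files <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)

merged <- NULL
for (f in files) {
  d <- read.table(f, header = TRUE)
  # give the value column a name derived from the file, so columns stay distinct
  names(d)[2] <- sub("\\.txt$", "", basename(f))
  merged <- if (is.null(merged)) d else merge(merged, d, by = "X", all = TRUE)
}

# rows missing from a given file come back as NA; zero-fill them
merged[is.na(merged)] <- 0
```

Renaming the value column inside the loop avoids the Y.x/Y.y suffixes that
merge() would otherwise generate, which become ambiguous after the second
merge.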

HTH,

Marc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Autocorrelated Binomial

2007-02-01 Thread Rick Bilonick

I need to generate autocorrelated binary data. I've found references to
the IEKS package but none of the web pages currently exist. Does anyone
know where I can find this package or suggest another package?

Rick B.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Marc Schwartz

Christos,

According to the Value section in ?merge:

A data frame. The rows are by default lexicographically sorted on the
common columns, but for sort=FALSE are in an unspecified order.


Looking at the code, while there is a lot of time spent on matching
things, the key sort() code seems to be near the end of the function:

  if (sort) 
      res <- res[if (all.x || all.y) 
          do.call("order", x[, 1:l.b, drop = FALSE])
      else sort.list(bx[m$xi]), , drop = FALSE]

I wonder if you could create a local version of merge(), say my.merge(),
without that code and without breaking things. A quick glance suggests
that as long as you are not merging on the rownames, I think that you
might be OK. You would want to test that hypothesis however.
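
Another route that sidesteps merge() altogether is to build the union of the
key values once and then drop each dataset into place with match(). What
follows is only a sketch: align_by_key() is a made-up helper name, not part of
R, and it assumes each input is a two-column matrix with unique keys in column
1 that can be compared exactly (integer-like keys; floating-point keys would
need rounding first).

```r
# Hypothetical helper: line up several two-column matrices on column 1
align_by_key <- function(mats, fill = 0) {
  # union of all key values, sorted once
  keys <- sort(unique(unlist(lapply(mats, function(m) m[, 1]))))
  out <- matrix(fill, nrow = length(keys), ncol = length(mats) + 1)
  out[, 1] <- keys
  for (k in seq_along(mats)) {
    # position of each of this dataset's keys within the union
    idx <- match(mats[[k]][, 1], keys)
    out[idx, k + 1] <- mats[[k]][, 2]
  }
  out
}

set.seed(1)
x <- cbind(1:10, runif(10))
y <- cbind(5:14, runif(10))
z <- cbind((-4):5, runif(10))
w <- align_by_key(list(x, y, z))
```

The result is a plain numeric matrix with one row per distinct key, so no
conversion back from a data frame is needed.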

HTH,

Marc

On Thu, 2007-02-01 at 16:48 -0500, Christos Hatzis wrote:
 [Sorry I meant to reply to the list]
 
 Thanks, Marc.
 
 That's what I have done.
 However, there seems to be a penalty from using merge repeatedly as it
 appears to internally re-sort the datasets.  In my case the datasets are
 long (~35K rows) and already sorted so this step adds considerable and
 unnecessary overhead.  There doesn't seem to be an option for disabling
 sorting. Setting 'sort=F' only affects sorting of the final data.frame.
 
  > system.time(merge(nmr.spectra.serum[[1]], nmr.spectra.serum[[2]], 
  +     by="V1", all=T, sort=T))
 [1] 6.96 0.00 7.24   NA   NA
  > system.time(merge(nmr.spectra.serum[[1]], nmr.spectra.serum[[2]], 
  +     by="V1", all=T, sort=F))
 [1] 6.82 0.00 7.14   NA   NA
  
 
 I was wondering if perhaps there is a parallel between this problem and
 methods for lining up time-series data, since such data are also usually
 sorted on the time dimension. 
 
 -Christos  
 
 -Original Message-
 From: Marc Schwartz [mailto:[EMAIL PROTECTED] 
 Sent: Thursday, February 01, 2007 4:21 PM
 To: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Lining up x-y datasets based on values of x
 
 On Thu, 2007-02-01 at 15:45 -0500, Christos Hatzis wrote:
  Thanks Marc and Phil.
  
  My dataset actually consists of 50+ individual files, so I will have 
  to do this one column at a time in a loop...
  I might look into SQL and outer joins as an alternative to avoid looping.
  
  Thanks again.
  -Christos
 
 If the files conform to some naming convention and/or are all located in a
 common sub-directory, you can use list.files() to get the file names into a
 vector.  If not, you could use file.choose() interactively.
 
 Then use either a for() loop or sapply() to loop over the filenames, read
 them in to data frames using read.table() and merge them together in the
 same loop.
 
 When it comes to basic data manipulation like this, loops are not a bad
 thing. The overhead of a loop is typically outweighed by the file I/O and
 related considerations.
 
 HTH,
 
 Marc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Christos Hatzis

[Sorry I meant to reply to the list]

Thanks, Marc.

That's what I have done.
However, there seems to be a penalty from using merge repeatedly as it
appears to internally re-sort the datasets.  In my case the datasets are
long (~35K rows) and already sorted so this step adds considerable and
unnecessary overhead.  There doesn't seem to be an option for disabling
sorting. Setting 'sort=F' only affects sorting of the final data.frame.

> system.time(merge(nmr.spectra.serum[[1]], nmr.spectra.serum[[2]], 
+     by="V1", all=T, sort=T))
[1] 6.96 0.00 7.24   NA   NA
> system.time(merge(nmr.spectra.serum[[1]], nmr.spectra.serum[[2]], 
+     by="V1", all=T, sort=F))
[1] 6.82 0.00 7.14   NA   NA
 

I was wondering if perhaps there is a parallel between this problem and
methods for lining up time-series data, since such data are also usually
sorted on the time dimension. 

-Christos  

-Original Message-
From: Marc Schwartz [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 01, 2007 4:21 PM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Lining up x-y datasets based on values of x

On Thu, 2007-02-01 at 15:45 -0500, Christos Hatzis wrote:
 Thanks Marc and Phil.
 
 My dataset actually consists of 50+ individual files, so I will have 
 to do this one column at a time in a loop...
 I might look into SQL and outer joins as an alternative to avoid looping.
 
 Thanks again.
 -Christos

If the files conform to some naming convention and/or are all located in a
common sub-directory, you can use list.files() to get the file names into a
vector.  If not, you could use file.choose() interactively.

Then use either a for() loop or sapply() to loop over the filenames, read
them in to data frames using read.table() and merge them together in the
same loop.

When it comes to basic data manipulation like this, loops are not a bad
thing. The overhead of a loop is typically outweighed by the file I/O and
related considerations.

HTH,

Marc

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Affymetrix data analysis

2007-02-01 Thread Tristan Coram

Hi,

 

I am trying to read in my Affymetrix CEL files (48 files, total ~600 MB) but
I keep getting memory errors.  Can somebody please help me with this?  Or is
there a remote server I can send my data to for computation?

Any help is much appreciated.

Thanks

Dr. Tristan Coram
Postdoctoral Research Associate
Research Plant Pathologist/Geneticist

United States Department of Agriculture
Agricultural Research Service
Wheat Genetics, Quality Physiology & Disease Research

209 Johnson Hall
Washington State University
Pullman, WA 99163

Office: +1 509 335-1596  Fax: +1 509 335-2553
Email: [EMAIL PROTECTED]

 

 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Can this loop be delooped?

2007-02-01 Thread jim holtman

This might do what you want:

> # test data
> x <- 1:43
> nb <- 5  # number of subsets
> # create vector of lengths of subsets
> ns <- rep(length(x) %/% nb, nb)
> # see if we have to adjust counts of initial subsets
> if ((.offset <- length(x) %% nb) != 0) ns[1:.offset] = ns[1:.offset] + 1
> # create the subsets
> split(x, rep(1:nb,ns))
$`1`
[1] 1 2 3 4 5 6 7 8 9

$`2`
[1] 10 11 12 13 14 15 16 17 18

$`3`
[1] 19 20 21 22 23 24 25 26 27

$`4`
[1] 28 29 30 31 32 33 34 35

$`5`
[1] 36 37 38 39 40 41 42 43
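
On the sapply() question in the original post: sapply() simplifies its result
to a matrix whenever every piece has the same length, whereas lapply(), or
sapply(..., simplify = FALSE), always returns a list. A small illustration
with made-up data (not the poster's variables):

```r
x <- 1:40
nb <- 5
# five groups of equal size, which is the case that triggers simplification
grp <- rep(1:nb, each = length(x) %/% nb)

simplified <- sapply(1:nb, function(i) x[grp == i])                    # 8 x 5 matrix
as_list    <- lapply(1:nb, function(i) x[grp == i])                    # list of 5 vectors
also_list  <- sapply(1:nb, function(i) x[grp == i], simplify = FALSE)  # same list

is.matrix(simplified)
is.list(as_list)
```

split(), as used above, returns a list regardless of whether the group sizes
happen to be equal.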



On 2/1/07, Talbot Katz [EMAIL PROTECTED] wrote:
 Hi.

 I have the following code in a loop.  It splits a vector into subvectors of
 equal size.  But if the size of the original vector is not an exact multiple
 of the desired subvector size, then the first few subvectors have one more
 element than the last few.  I know that the cut function could be used to
 determine where to break up the vector, but it doesn't seem to provide
 control over where to put the larger and smaller subvectors.

 numgp1_v=sidect_v%/%compmin
 numgroup_v[small]=max(1,numgp1_v[small])
 sidemin_v=sidect_v%/%numgroup_v
 nummax_v=sidect_v%%sidemin_v
 eix=0
 smallindexlist<-list(NULL)
 for(i in 1:numgroup_v[small])   {
        bix=eix+1
        eix=bix+sidemin_v[small]+(i<=nummax_v[small])-1
        smallindexlist[[i]]<-dlpo_sm_v[bix:eix]
 }

 The key fact is that smallindexlist is a list, each list element is a
 subvector of dlpo_sm_v of the proper size.  The sizes may be different.


 I tried to see whether I could eliminate the loop, as follows.  First I
 defined a function:

 intgpi  <-  function(totalength,numgroups,groupnum,place="LEFT"){
 #   function to split the integer sequence, 1:totalength, into the groupnum
 #   group out of numgroups groups of equal size, totalength%/%numgroups
 #   there are totalength%%numgroups number of groups of length
 #   1+totalength%/%numgroups, with the large groups all to one side, left if
 #   place="LEFT"
 #   totalength >= numgroups >= groupnum all integers, or it won't work right
        if(charmatch(toupper(place),"RIGHT",nomatch=FALSE)==1){
                extra1_1=max((groupnum-1)+((totalength%%numgroups)-numgroups),0)
                extra1_2=(groupnum>numgroups-totalength%%numgroups)
        }
        else{
                extra1_1=min(totalength%%numgroups,groupnum-1)
                extra1_2=(groupnum<=totalength%%numgroups)
        }
        gsize=totalength%/%numgroups
        gleft=((groupnum-1)*gsize)+extra1_1+1
        gright=gleft+gsize+extra1_2-1
        gleft:gright
 }


 The function appears to work okay.  Then I used it as follows:

 numgp1_v=sidect_v%/%compmin
 numgroup_v[small]=max(1,numgp1_v[small])
 smallindexlist<-list(NULL)
 smallindexlist=sapply(1:numgroup_v[small],function(i){dlpo_sm_v[intgpi(sidect_v[small],numgroup_v[small],i)]})

 In this case, smallindexlist will be a list like I had before if the
 subvectors are not all the same size, but if the subvectors are all the same
 size, it appears that I get an array.  Can I force this operation to give me
 a list the way I want it in all cases?  Or is there a better way to deloop
 my original code?

 Thanks!

 --  TMK  --
 212-460-5430home
 917-656-5351cell




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Wavlet filter using morlet mother wavelet

2007-02-01 Thread rdporto1

Anil Kumar,

it seems there aren't any packages for continuous wavelet 
transforms in R. Anyway, take a look at the packages
waveslim, wavethresh, wavelets or rwt. Maybe one of
them can be useful to you.

Rogerio.

-- Original header ---

From: [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Cc: 
Date: 1 Feb 2007 07:33:52 -
Subject: [R] Wavlet filter using morlet mother wavelet

 Hi, List,
 I am searching for any package in R which can do wavelet filtering with the
 Morlet mother wavelet. Does anybody have a script for the same? I am new to
 wavelet analysis in R. Thanks in advance.
 Anil Kumar
 ANIL KUMAR (METEOROLOGIST), LRF SECTION, NATIONAL CLIMATE CENTER,
 ADGM (RESEARCH), INDIA METEOROLOGICAL DEPARTMENT, SHIVIJI NAGAR,
 PUNE-411005, INDIA. MOBILE +919422023277 [EMAIL PROTECTED]
 
   [[alternative HTML version deleted]]
 


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] RSiteSearch() etc. - speed improvement

2007-02-01 Thread Jonathan Baron

My R search page at http://finzi.psych.upenn.edu/
which is also what you get with RSiteSearch()
has been slowing down the last few months (years?).

I thought this was because the archives were just getting too big.
But I discovered a simple fix.  The technical term for the problem is
garbage.  By cleaning up the garbage, I increased the speed to the
point where now most searches - even those with three search terms -
are instantaneous.

Thus, it is probably good for a few more years, before I have to think
about a different search engine or a faster computer (which I should get
anyway).

If you have given up on it because of its slow response, do try again.

Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Prof Brian Ripley

On Thu, 1 Feb 2007, Marc Schwartz wrote:

 Christos,

 According to the Value section in ?merge:

 A data frame. The rows are by default lexicographically sorted on the
 common columns, but for sort=FALSE are in an unspecified order.

There is also a sort in the .Internal code.  But I am not buying 
that this is a major part of the time without detailed evidence from 
profiling.  Sorting 35k numbers should take a few milliseconds, and 
less if they are already sorted.

> x <- rnorm(35000)
> system.time(y <- sort(x, method="quick"))
[1] 0.003 0.001 0.004 0.000 0.000
> system.time(sort(y, method="quick"))
[1] 0.002 0.000 0.001 0.000 0.000



 Looking at the code, while there is a lot of time spent on matching
 things, the key sort() code seems to be near the end of the function:

   if (sort)
       res <- res[if (all.x || all.y)
           do.call("order", x[, 1:l.b, drop = FALSE])
       else sort.list(bx[m$xi]), , drop = FALSE]

 I wonder if you could create a local version of merge(), say my.merge(),
 without that code and without breaking things. A quick glance suggests
 that as long as you are not merging on the rownames, I think that you
 might be OK. You would want to test that hypothesis however.

 HTH,

 Marc

 On Thu, 2007-02-01 at 16:48 -0500, Christos Hatzis wrote:
 [Sorry I meant to reply to the list]

 Thanks, Marc.

 That's what I have done.
 However, there seems to be a penalty from using merge repeatedly as it
 appears to internally re-sort the datasets.  In my case the datasets are
 long (~35K rows) and already sorted so this step adds considerable and
 unnecessary overhead.  There doesn't seem to be an option for disabling
 sorting. Setting 'sort=F' only affects sorting of the final data.frame.

 > system.time(merge(nmr.spectra.serum[[1]], nmr.spectra.serum[[2]],
 +     by="V1", all=T, sort=T))
 [1] 6.96 0.00 7.24   NA   NA
 > system.time(merge(nmr.spectra.serum[[1]], nmr.spectra.serum[[2]],
 +     by="V1", all=T, sort=F))
 [1] 6.82 0.00 7.14   NA   NA


 I was wondering if perhaps there is a parallel between this problem and
 methods for lining up time-series data, since such data are also usually
 sorted on the time dimension.

 -Christos

 -Original Message-
 From: Marc Schwartz [mailto:[EMAIL PROTECTED]
 Sent: Thursday, February 01, 2007 4:21 PM
 To: [EMAIL PROTECTED]
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] Lining up x-y datasets based on values of x

 On Thu, 2007-02-01 at 15:45 -0500, Christos Hatzis wrote:
 Thanks Marc and Phil.

 My dataset actually consists of 50+ individual files, so I will have
 to do this one column at a time in a loop...
 I might look into SQL and outer joins as an alternative to avoid looping.

 Thanks again.
 -Christos

 If the files conform to some naming convention and/or are all located in a
 common sub-directory, you can use list.files() to get the file names into a
 vector.  If not, you could use file.choose() interactively.

 Then use either a for() loop or sapply() to loop over the filenames, read
 them in to data frames using read.table() and merge them together in the
 same loop.

 When it comes to basic data manipulation like this, loops are not a bad
 thing. The overhead of a loop is typically outweighed by the file I/O and
 related considerations.

 HTH,

 Marc



-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] memory-efficient column aggregation of a sparse matrix

2007-02-01 Thread Jon Stearley


On Feb 1, 2007, at 6:22 AM, Douglas Bates wrote:

 It turns out that in the sparse matrix code used by the
 Matrix package the triplet representation allows for duplicate index
 positions with the convention that the resulting value at a position
 is the sum of the values of any triplets with that index pair.

Very handy!  I suggest adding this nugget near the (possibly  
redundant) triplets phrase in Matrix.pdf.

 If you decide to use this approach please be aware that the indices
 for the triplet representation in the Matrix package are 0-based (as
 in C code) not 1-based (as in R code).  (I imagine that Martin is
 thinking we really should change that as he reads this part.)

The Value of the appended function is equivalent to my previous  
version, but it runs in 1/10'th the time, uses vastly less memory,  
and is fewer lines of code to boot!  Sure it's tricky, but it does  
the trick.

THANK YOU SO MUCH!

-jon

NEWaggregate.csr <- function(x,fac) {
 # cast into handy Matrix sparse Triplet form
 x.T <- as(as(x, "dgRMatrix"), "dgTMatrix")

 # factor column indexes (compensating for 0 vs 1 indexing)
 [EMAIL PROTECTED] <- as.integer(as.integer([EMAIL PROTECTED])-1)

 # cast back, magically computing factor sums along the way :)
 y <- as(x.T, "matrix.csr")

 # and fix the dimension (doing this on x.T bus errors!)
 [EMAIL PROTECTED] <- as.integer(c(nrow(y),nlevels(fac)))
 y
}
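
Since the slot accesses in the archived function are masked, here is a
separate toy illustration of the duplicate-summing convention using only the
Matrix package. It is a sketch, not a reconstruction of the code above: the
matrix m, the factor fac and the object names are invented, and it does not
touch the SparseM matrix.csr class.

```r
library(Matrix)

# 3 x 4 sparse matrix, filled column by column
m <- Matrix(c(1, 0, 2,
              0, 3, 0,
              4, 0, 5,
              0, 6, 0), nrow = 3, sparse = TRUE)

# columns 1 and 3 belong to group "a", columns 2 and 4 to group "b"
fac <- factor(c("a", "b", "a", "b"))

# triplet form: slots i and j hold 0-based row and column indices
mT <- as(m, "TsparseMatrix")

# replace each column index by its (0-based) factor code; entries that now
# share an (i, j) pair are treated as a sum when the triplets are compacted
agg <- new("dgTMatrix",
           i = mT@i,
           j = as.integer(fac)[mT@j + 1L] - 1L,
           x = mT@x,
           Dim = c(nrow(m), nlevels(fac)))

# converting to compressed column form collapses the duplicates
result <- as.matrix(as(agg, "CsparseMatrix"))
```

Building a fresh object with new() rather than modifying the dimension slot
in place is a deliberate choice here, so the indices and the dimensions are
validated together.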

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Gabor Grothendieck

The zoo package has a multiway merge with optional zero fill.
Here are two ways:

library(zoo)
merge(x = zoo(x[,2], x[,1]),
  y = zoo(y[,2], y[,1]),
  z = zoo(z[,2], z[,1]),
  fill = 0)

# or

library(zoo)
X <- list(x = x, y = y, z = z)
merge0 <- function(..., fill = 0) merge(..., fill = fill)
do.call(merge0, lapply(X, function(x) zoo(x[,2], x[,1])))

To get more info on zoo try:

vignette("zoo")

On 2/1/07, Christos Hatzis [EMAIL PROTECTED] wrote:
 Hi,

 I was wondering if there is a direct approach for lining up 2-column
 matrices according to the values of the first column.  An example and a
 brute-force approach is given below:

 x <- cbind(1:10, runif(10))
 y <- cbind(5:14, runif(10))
 z <- cbind((-4):5, runif(10))

 xx <- seq( min(c(x[,1],y[,1],z[,1])), max(c(x[,1],y[,1],z[,1])), 1)
 w <- cbind(xx, matrix(rep(0, 3*length(xx)), ncol=3))

 w[ xx >= x[1,1] & xx <= x[10,1], 2 ] <- x[,2]
 w[ xx >= y[1,1] & xx <= y[10,1], 3 ] <- y[,2]
 w[ xx >= z[1,1] & xx <= z[10,1], 4 ] <- z[,2]

 w

 I appreciate any pointers.

 Thanks.

 Christos Hatzis, Ph.D.
 Nuvera Biosciences, Inc.
 400 West Cummings Park
 Suite 5350
 Woburn, MA 01801
 Tel: 781-938-3830
 www.nuverabio.com



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Affymetrix data analysis

2007-02-01 Thread Benilton Carvalho

The bioconductor mailing list is probably a better place to ask this  
type of question.

[EMAIL PROTECTED]

But we also need to know what arrays you are working with, what the  
errors are, and what your sessionInfo() is.

Let us know, ok?

b

On Feb 1, 2007, at 5:46 PM, Tristan Coram wrote:

 Hi,



 I am trying to read in my Affymetrix CEL files (48 files, total  
 ~600 MB) but
 I keep getting memory errors.  Can somebody please help me with  
 this.  Or is
 there a remote server I can send my data to for computation?



 Any help is much appreciated.



 Thanks





 Dr. Tristan Coram

 Postdoctoral Research Associate

 Research Plant Pathologist/Geneticist



 United States Department of Agriculture

 Agricultural Research Service

 Wheat Genetics, Quality Physiology  Disease Research



 209 Johnson Hall

 Washington State University

 Pullman, WA 99163



 Office: +1 509 335-1596  Fax: +1 509 335-2553

 Email: [EMAIL PROTECTED]






   [[alternative HTML version deleted]]


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Affymetrix data analysis

2007-02-01 Thread Sicotte, Hugues Ph.D.

Tristan, 
I have a soft spot for problems analyzing microarrays with R..

As for the memory issue, there have been previous posts to this list,
but here is the answer I gave a few weeks ago.
If you need more memory, you have to move to Linux or recompile R for
Windows yourself...
...but you'll still need a computer with more memory.
The long-term solution, which we are implementing, is to rewrite the
normalization code so that it doesn't need to load all those arrays at once.

-- cut previous part of message--
The default in R is to play nice and limit your allocation to half
the available RAM. Make sure you have a lot of disk swap space (at least
1 GB with 2 GB of RAM) and you can set your memory limit to 2 GB for R.

See help(memory.size)  and use the memory.limit function


Hugues 


P.S. Someone let me use their Linux machine with 16 GB of RAM,
and I was able to run 64-bit R with top showing 6 GB of RAM
allocated (with suitable --max-mem-size command line parameters at
startup for R). 
 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Benilton Carvalho
Sent: Thursday, February 01, 2007 6:47 PM
To: Tristan Coram
Cc: R-help@stat.math.ethz.ch
Subject: Re: [R] Affymetrix data analysis

The bioconductor mailing list is probably a better place to ask this  
type of question.

[EMAIL PROTECTED]

But we also need to know what arrays you are working with, what the  
errors are, and what your sessionInfo() is.

Let us know, ok?

b

On Feb 1, 2007, at 5:46 PM, Tristan Coram wrote:

 Hi,



 I am trying to read in my Affymetrix CEL files (48 files, total  
 ~600 MB) but
 I keep getting memory errors.  Can somebody please help me with  
 this.  Or is
 there a remote server I can send my data to for computation?



 Any help is much appreciated.



 Thanks





 Dr. Tristan Coram

 Postdoctoral Research Associate

 Research Plant Pathologist/Geneticist



 United States Department of Agriculture

 Agricultural Research Service

 Wheat Genetics, Quality Physiology  Disease Research



 209 Johnson Hall

 Washington State University

 Pullman, WA 99163



 Office: +1 509 335-1596  Fax: +1 509 335-2553

 Email: [EMAIL PROTECTED]







Re: [R] Wiki for Graphics tips for MacOS X

2007-02-01 Thread Gabor Grothendieck

I don't have a Linux system to try it with but
omitting both dev.control statements it worked for me
between two Windows XP sessions on the same
machine using this version of R:

> R.version.string # Windows XP
[1] "R version 2.4.1 Patched (2006-12-30 r40331)"

It also successfully worked with:

> R.version.string # Windows XP
[1] "R version 2.5.0 Under development (unstable) (2007-01-31 r40623)"
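
For anyone who wants to try the record-and-replay idea inside a single R session first, here is a minimal sketch. It assumes a null PDF device (`pdf(NULL)`) as the recording surface and a temporary file as the output; neither comes from this thread, and the sketch does not test moving the saved object between machines or R versions.

```r
# Record a plot on an off-screen device, then replay it into a PDF file.
out <- tempfile(fileext = ".pdf")   # hypothetical output location

pdf(NULL)                            # null PDF device: no file is written
dev.control(displaylist = "enable")  # file devices do not record by default
plot(1:10)
myplot <- recordPlot()               # capture the display list
invisible(dev.off())

pdf(out)                             # open a real device
replayPlot(myplot)                   # redraw the recorded plot on it
invisible(dev.off())

file.exists(out)
```

On an interactive screen device the display list is normally enabled already, which would be consistent with the dev.control calls being unnecessary there.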



On 2/1/07, Patrick Connolly [EMAIL PROTECTED] wrote:
 On Wed, 31-Jan-2007 at 12:11PM -0500, Gabor Grothendieck wrote:

 | To get the best results you need to transfer it using vector
 | graphics rather than bitmapped graphics:
 |
 | http://www.stc-saz.org/resources/0203_graphics.pdf
 |
 | There are a number of variations described here (see
 | entire thread).  It's for UNIX and Windows but I think
 | it would likely work similarly on Mac and Windows:
 |
 | http://finzi.psych.upenn.edu/R/Rhelp02a/archive/32297.html

 I found that interesting, particularly this part:

 For example, on Linux do this:

   dev.control(displaylist="enable") # enable display list
   plot(1:10)
   myplot <- recordPlot() # load displaylist into variable
   save(myplot, file="myplot", ascii=TRUE)

 Send the ascii file, myplot, to the Windows machine and on Windows do this:

   dev.control(displaylist="enable") # enable display list
   load("myplot")
   myplot # displays the plot
   savePlot("myplot", type="wmf") # saves current plot as wmf

 I tried that, but I was never able to load the myplot in the Windows
 R.  I always got a message about a syntax error to do with ' ' but I
 was unable to work out what the problem was.  I thought it was because
 the transfer to Windows wasn't binary, but that wasn't the problem.

 I was unable to get the thread view at that archive to function so I
 was unable to see if there were any follow ups which offered an
 explanation.

 R has changed quite a bit in the years since then, so it might be that
 something needs to be done differently with more recent versions.

 Has anyone done this recently?

 --
 ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
   ___Patrick Connolly
  {~._.~} Great minds discuss ideas
  _( Y )_Middle minds discuss events
 (:_~*~_:)Small minds discuss people
  (_)-(_)   . Anon

 ~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Marc Schwartz

On Thu, 2007-02-01 at 23:34 +, Prof Brian Ripley wrote:
 On Thu, 1 Feb 2007, Marc Schwartz wrote:
 
  Christos,
 
  According to the Value section in ?merge:
 
  A data frame. The rows are by default lexicographically sorted on the
  common columns, but for sort=FALSE are in an unspecified order.
 
 There is also a sort in the .Internal code.  But I am not buying 
 that this is a major part of the time without detailed evidence from 
 profiling.  Sorting 35k numbers should take a few milliseconds, and 
 less if they are already sorted.
 
  > x <- rnorm(35000)
  > system.time(y <- sort(x, method="quick"))
 [1] 0.003 0.001 0.004 0.000 0.000
  > system.time(sort(y, method="quick"))
 [1] 0.002 0.000 0.001 0.000 0.000

Having had a chance to mock up some examples, I would have to agree with
Prof. Ripley on this point.

Presuming that we are not missing something about the nature of
Christos' data sets, here are 4 examples, with rows sorted in ascending
order, descending order, reversed sort order and random order. In
theory, the descending order example should, I believe, represent a
worst case scenario, since reverse sorting a sorted list is typically
slowest. However, note that there is not much time variation below and
running each of the examples several times resulted in material
differences across runs.


1. Ascending order

DF.X <- data.frame(X = 1:35000, Y = runif(35000))
DF.Y <- data.frame(X = 1:35000, Y = runif(35000))

> system.time(DF.XY <- merge(DF.X, DF.Y, by = "X", all = TRUE))
[1] 0.249 0.004 0.264 0.000 0.000


2. Descending order

DF.X <- data.frame(X = 35000:1, Y = runif(35000))
DF.Y <- data.frame(X = 35000:1, Y = runif(35000))

> system.time(DF.XY <- merge(DF.X, DF.Y, by = "X", all = TRUE))
[1] 0.300 0.007 0.309 0.000 0.000


3. Reversed sort order

DF.X <- data.frame(X = 35000:1, Y = runif(35000))
DF.Y <- data.frame(X = 1:35000, Y = runif(35000))

> system.time(DF.XY <- merge(DF.X, DF.Y, by = "X", all = TRUE))
[1] 0.236 0.008 0.245 0.000 0.000


4. Random order

DF.X <- data.frame(X = sample(35000), Y = runif(35000))
DF.Y <- data.frame(X = sample(35000), Y = runif(35000))

> system.time(DF.XY <- merge(DF.X, DF.Y, by = "X", all = TRUE))
[1] 0.339 0.016 0.357 0.000 0.000



Spending some time looking at profiling the descending order example, we
get:

> summaryRprof()
$by.self
                        self.time self.pct total.time total.pct
duplicated.default           0.16     38.1       0.16      38.1
match                        0.08     19.0       0.08      19.0
sort.list                    0.08     19.0       0.08      19.0
[.data.frame                 0.04      9.5       0.24      57.1
merge.data.frame             0.02      4.8       0.42     100.0
names.default                0.02      4.8       0.02       4.8
seq_len                      0.02      4.8       0.02       4.8
merge                        0.00      0.0       0.42     100.0
[                            0.00      0.0       0.24      57.1
any                          0.00      0.0       0.18      42.9
duplicated                   0.00      0.0       0.18      42.9
cbind                        0.00      0.0       0.04       9.5
data.frame                   0.00      0.0       0.04       9.5
data.row.names               0.00      0.0       0.02       4.8
names                        0.00      0.0       0.02       4.8
row.names<-                  0.00      0.0       0.02       4.8
row.names<-.data.frame       0.00      0.0       0.02       4.8

$by.total
                        total.time total.pct self.time self.pct
merge.data.frame              0.42     100.0      0.02      4.8
merge                         0.42     100.0      0.00      0.0
[.data.frame                  0.24      57.1      0.04      9.5
[                             0.24      57.1      0.00      0.0
any                           0.18      42.9      0.00      0.0
duplicated                    0.18      42.9      0.00      0.0
duplicated.default            0.16      38.1      0.16     38.1
match                         0.08      19.0      0.08     19.0
sort.list                     0.08      19.0      0.08     19.0
cbind                         0.04       9.5      0.00      0.0
data.frame                    0.04       9.5      0.00      0.0
names.default                 0.02       4.8      0.02      4.8
seq_len                       0.02       4.8      0.02      4.8
data.row.names                0.02       4.8      0.00      0.0
names                         0.02       4.8      0.00      0.0
row.names<-                   0.02       4.8      0.00      0.0
row.names<-.data.frame        0.02       4.8      0.00      0.0

$sampling.time
[1] 0.42



The above suggests that a meaningful amount of time is spent in checking
for and dealing with duplicates in the common ('by') columns. To that
end:

DF.X <- data.frame(X = sample(1, 35000, replace = TRUE), Y = runif(35000))
DF.Y <- data.frame(X = sample(1, 35000, replace = TRUE), Y = runif(35000))

> system.time(DF.XY <- merge(DF.X, DF.Y, by = "X", all = TRUE))
[1] 3.316 0.148

Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Christos Hatzis

Marc,

I don't think the issue is duplicates in the matching columns.  The data
were generated by an instrument (NMR spectrometer), processed by the
instrument's software through an FFT transform and other transformations and
finally reported as a sequence of chemical shift (x) vs intensity (y) pairs.
So all x values are unique.  For the example that I reported earlier:

> length(nmr.spectra.serum[[1]]$V1)
[1] 32768
> length(unique(nmr.spectra.serum[[1]]$V1))
[1] 32768
> length(nmr.spectra.serum[[2]]$V1)
[1] 32768
> length(unique(nmr.spectra.serum[[2]]$V1))
[1] 32768

And most of the x-values are common:
> sum(nmr.spectra.serum[[1]]$V1 %in% nmr.spectra.serum[[2]]$V1)
[1] 32625

For this reason, merge is probably an overkill for this problem and my
initial thought was to align the datasets through some simple index-shifting
operation. 
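
When every x value is unique within each dataset, one simple way to do that kind of alignment is to look up positions with match() against the union of the x values. The following is only a sketch under that assumption; the function name `align_xy` and the toy data frames are made up for illustration, and match() compares doubles exactly, so x values that differ by rounding error would not line up.

```r
# Align two x-y data frames (x in column 1, y in column 2) on the union
# of their x values; y is NA where a dataset has no point at that x.
align_xy <- function(a, b) {
  x <- sort(union(a[[1]], b[[1]]))
  data.frame(x   = x,
             y.a = a[[2]][match(x, a[[1]])],
             y.b = b[[2]][match(x, b[[1]])])
}

# Toy data standing in for two spectra (hypothetical values).
a <- data.frame(V1 = c(1, 2, 3), V2 = c(10, 20, 30))
b <- data.frame(V1 = c(2, 3, 4), V2 = c(200, 300, 400))

res <- align_xy(a, b)
print(res)
```

This avoids the data frame subsetting and rbind work that shows up in the merge profile, at the cost of handling only the unique-key case.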

Profiling of the merge code in my case shows that most of the time is spent
on data frame subsetting operations and on internal merge and rbind calls
secondarily (if I read the summary output correctly).  So even if most of
the time in the internal merge function is spent on sorting (haven't checked
the source code), this is in the worst case a rather minor effect, as
suggested by Prof. Ripley.
  
> Rprof("merge.out")
> zz <- merge(nmr.spectra.serum[[1]], nmr.spectra.serum[[2]], by="V1",
+             all=T, sort=T)
> Rprof(NULL)
> summaryRprof("merge.out")

$by.self
                        self.time self.pct total.time total.pct
merge.data.frame             6.56     50.0      11.84      90.2
[.data.frame                 2.42     18.4       3.68      28.0
merge                        1.28      9.8      13.12     100.0
rbind                        1.24      9.5       1.36      10.4
names<-.default              1.16      8.8       1.16       8.8
row.names<-.data.frame       0.12      0.9       0.18       1.4
duplicated.default           0.12      0.9       0.12       0.9
make.unique                  0.10      0.8       0.10       0.8
data.frame                   0.02      0.2       0.04       0.3
*                            0.02      0.2       0.02       0.2
is.na                        0.02      0.2       0.02       0.2
match                        0.02      0.2       0.02       0.2
order                        0.02      0.2       0.02       0.2
unclass                      0.02      0.2       0.02       0.2
[                            0.00      0.0       3.68      28.0
do.call                      0.00      0.0       1.18       9.0
names<-                      0.00      0.0       1.16       8.8
row.names<-                  0.00      0.0       0.18       1.4
any                          0.00      0.0       0.14       1.1
duplicated                   0.00      0.0       0.12       0.9
cbind                        0.00      0.0       0.04       0.3
as.vector                    0.00      0.0       0.02       0.2
seq                          0.00      0.0       0.02       0.2
seq.default                  0.00      0.0       0.02       0.2

$by.total
                        total.time total.pct self.time self.pct
merge                        13.12     100.0      1.28      9.8
merge.data.frame             11.84      90.2      6.56     50.0
[.data.frame                  3.68      28.0      2.42     18.4
[                             3.68      28.0      0.00      0.0
rbind                         1.36      10.4      1.24      9.5
do.call                       1.18       9.0      0.00      0.0
names<-.default               1.16       8.8      1.16      8.8
names<-                       1.16       8.8      0.00      0.0
row.names<-.data.frame        0.18       1.4      0.12      0.9
row.names<-                   0.18       1.4      0.00      0.0
any                           0.14       1.1      0.00      0.0
duplicated.default            0.12       0.9      0.12      0.9
duplicated                    0.12       0.9      0.00      0.0
make.unique                   0.10       0.8      0.10      0.8
data.frame                    0.04       0.3      0.02      0.2
cbind                         0.04       0.3      0.00      0.0
*                             0.02       0.2      0.02      0.2
is.na                         0.02       0.2      0.02      0.2
match                         0.02       0.2      0.02      0.2
order                         0.02       0.2      0.02      0.2
unclass                       0.02       0.2      0.02      0.2
as.vector                     0.02       0.2      0.00      0.0
seq                           0.02       0.2      0.00      0.0
seq.default                   0.02       0.2      0.00      0.0

$sampling.time
[1] 13.12


Thanks again for your time in looking into this.
-Christos


Re: [R] Problems installing R-2.4.1 on Solaris 11 x-86 from source: error in gmake after successful configure

2007-02-01 Thread 廖宮毅

On Thu, 2007-02-01 at 20:39 +, Prof Brian Ripley wrote:
 What is 'Solaris 11'?  According to www.sun.com, the latest Solaris 
 version is 10, and my sysadmins have not heard of Solaris 11.
 
That is the Solaris Express Community Release, or the pre-release of the
upcoming OpenSolaris:
http://www.opensolaris.org

 You seem to be missing the Solaris compilation tools, ar in this case.
 In Solaris <= 10 they are in /usr/ccs/bin, not in the path by default.
 
 On Wed, 31 Jan 2007, Octavio Tourinho wrote:
 
  Dear friends,
  I am trying to install R-2.4.1 from source on Solaris 11 x-86. 64 bits,
 
 There is 32-bit x86 and 64-bit amd64 or x86_64.
 
  running on Sun Ultra-20 workstation, and using the SunStudio 11 compilers.
  I was able to configure R correctly, but received an error in gmake,
  apparently related to bzip2, which I have been unable to debug.
  The messages are listed below.
  The configure.log and configure.status files are attached.
 
  Any help would be sincerely appreciated.
 
  Octavio Tourinho
 
  =
  R is now configured for i386-pc-solaris2.11
 
  Source directory:          .
  Installation directory:    /usr/local

  C compiler:                gcc -std=gnu99 -D__NO_MATH_INLINES -g -O2
  Fortran 77 compiler:       g77  -g -O2

  C++ compiler:              g++  -g -O2
  Fortran 90/95 compiler:    f95 -g

  Interfaces supported:      X11, tcltk
  External libraries:        readline
  Additional capabilities:   PNG, JPEG, NLS
  Options enabled:           shared BLAS, R profiling

  Recommended packages:      yes
 
  configure: WARNING: you cannot build DVI versions of the R manuals
  configure: WARNING: you cannot build PDF versions of the R manuals
  # gmake
  gmake[1]: Entering directory `/usr/local/R-2.4.1/m4'
  gmake[1]: Nothing to be done for `R'.
  gmake[1]: Leaving directory `/usr/local/R-2.4.1/m4'
  gmake[1]: Entering directory `/usr/local/R-2.4.1/tools'
  gmake[1]: Nothing to be done for `R'.
  gmake[1]: Leaving directory `/usr/local/R-2.4.1/tools'
  gmake[1]: Entering directory `/usr/local/R-2.4.1/doc'
  gmake[2]: Entering directory `/usr/local/R-2.4.1/doc/html'
  gmake[3]: Entering directory `/usr/local/R-2.4.1/doc/html/search'
  gmake[3]: Leaving directory `/usr/local/R-2.4.1/doc/html/search'
  gmake[2]: Leaving directory `/usr/local/R-2.4.1/doc/html'
  gmake[2]: Entering directory `/usr/local/R-2.4.1/doc/manual'
  gmake[2]: Nothing to be done for `R'.
  gmake[2]: Leaving directory `/usr/local/R-2.4.1/doc/manual'
  gmake[1]: Leaving directory `/usr/local/R-2.4.1/doc'
  gmake[1]: Entering directory `/usr/local/R-2.4.1/etc'
  gmake[1]: Leaving directory `/usr/local/R-2.4.1/etc'
  gmake[1]: Entering directory `/usr/local/R-2.4.1/share'
  gmake[1]: Leaving directory `/usr/local/R-2.4.1/share'
  gmake[1]: Entering directory `/usr/local/R-2.4.1/src'
  gmake[2]: Entering directory `/usr/local/R-2.4.1/src/scripts'
  creating src/scripts/R.fe
  gmake[3]: Entering directory `/usr/local/R-2.4.1/src/scripts'
  gmake[3]: Leaving directory `/usr/local/R-2.4.1/src/scripts'
  gmake[2]: Leaving directory `/usr/local/R-2.4.1/src/scripts'
  gmake[2]: Entering directory `/usr/local/R-2.4.1/src/include'
  config.status: creating src/include/config.h
  config.status: src/include/config.h is unchanged
  Rmath.h is unchanged
  gmake[3]: Entering directory `/usr/local/R-2.4.1/src/include/R_ext'
  gmake[3]: Nothing to be done for `R'.
  gmake[3]: Leaving directory `/usr/local/R-2.4.1/src/include/R_ext'
  gmake[2]: Leaving directory `/usr/local/R-2.4.1/src/include'
  gmake[2]: Entering directory `/usr/local/R-2.4.1/src/extra'
  gmake[3]: Entering directory `/usr/local/R-2.4.1/src/extra/blas'
  gmake[4]: Entering directory `/usr/local/R-2.4.1/src/extra/blas'
  gmake[4]: `libRblas.so' is up to date.
  gmake[4]: Leaving directory `/usr/local/R-2.4.1/src/extra/blas'
  gmake[4]: Entering directory `/usr/local/R-2.4.1/src/extra/blas'
  /usr/local/R-2.4.1/lib/libRblas.so is unchanged
  gmake[4]: Leaving directory `/usr/local/R-2.4.1/src/extra/blas'
  gmake[3]: Leaving directory `/usr/local/R-2.4.1/src/extra/blas'
  gmake[3]: Entering directory `/usr/local/R-2.4.1/src/extra/bzip2'
  gmake[4]: Entering directory `/usr/local/R-2.4.1/src/extra/bzip2'
  gmake[4]: Leaving directory `/usr/local/R-2.4.1/src/extra/bzip2'
  gmake[4]: Entering directory `/usr/local/R-2.4.1/src/extra/bzip2'
  rm -f libbz2.a
  false cr libbz2.a blocksort.o bzlib.o compress.o crctable.o decompress.o 
  huffman.o randtable.o
  gmake[4]: *** [libbz2.a] Error 1
  gmake[4]: Leaving directory `/usr/local/R-2.4.1/src/extra/bzip2'
  gmake[3]: *** [R] Error 2
  gmake[3]: Leaving directory `/usr/local/R-2.4.1/src/extra/bzip2'
  gmake[2]: *** [R] Error 1
  gmake[2]: Leaving directory `/usr/local/R-2.4.1/src/extra'
  gmake[1]: *** [R] Error 1
  gmake[1]: Leaving directory `/usr/local/R-2.4.1/src'
  gmake: *** [R] Error 1
  
 
 
I have build

Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Christos Hatzis

Thanks Gabor.

This is along the lines of what I was looking for.  In fact the merge
function for zoo objects (ordered) turns out to be almost an order of
magnitude faster than the generic merge function for my problem:

> system.time(
+ zz <- merge( spec.1 = zoo(nmr.spectra.serum[[1]]$V2, nmr.spectra.serum[[1]]$V1),
+              spec.2 = zoo(nmr.spectra.serum[[2]]$V2, nmr.spectra.serum[[2]]$V1),
+              fill=NA )
+ )
[1] 0.74 0.07 0.82   NA   NA
> system.time(
+ ww <- merge(nmr.spectra.serum[[1]], nmr.spectra.serum[[2]], by="V1",
+             all=T, sort=T)
+ )
[1] 6.85 0.05 6.94   NA   NA
> head(zz)
        spec.1 spec.2
-1322.2 -0.651     NA
-1321.9 -0.266     NA
-1321.7 -0.962     NA
-1321.4 -0.602     NA
-1321.2  0.753     NA
-1320.9  1.212     NA
> head(ww)
       V1   V2.x V2.y
1 -1322.2 -0.651   NA
2 -1321.9 -0.266   NA
3 -1321.7 -0.962   NA
4 -1321.4 -0.602   NA
5 -1321.2  0.753   NA
6 -1320.9  1.212   NA
 

Thanks again.
-Christos 

-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 01, 2007 7:25 PM
To: [EMAIL PROTECTED]
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Lining up x-y datasets based on values of x

The zoo package has a multiway merge with optional zero fill.
Here are two ways:

library(zoo)
merge(x = zoo(x[,2], x[,1]),
  y = zoo(y[,2], y[,1]),
  z = zoo(z[,2], z[,1]),
  fill = 0)

# or

library(zoo)
X <- list(x = x, y = y, z = z)
merge0 <- function(..., fill = 0) merge(..., fill = fill)
do.call(merge0, lapply(X, function(x) zoo(x[,2], x[,1])))

To get more info on zoo try:

vignette("zoo")
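
If zoo is not available, roughly the same zero-filled multiway alignment can be sketched in base R by folding merge() over a list of data frames with Reduce(). This is an illustration only, not the zoo implementation; the helper `to_df` and the column name `key` are made up here, and as the timings elsewhere in this thread show, the data frame merge can be considerably slower on large inputs.

```r
set.seed(1)
x <- cbind(1:10, runif(10))
y <- cbind(5:14, runif(10))
z <- cbind((-4):5, runif(10))

# Turn a 2-column matrix into a data frame with a shared key column.
to_df <- function(m, name) {
  df <- as.data.frame(m)
  names(df) <- c("key", name)
  df
}
dfs <- list(to_df(x, "x"), to_df(y, "y"), to_df(z, "z"))

# Full outer join of all three on the key, then fill the gaps with 0.
w <- Reduce(function(a, b) merge(a, b, by = "key", all = TRUE), dfs)
w[is.na(w)] <- 0
print(w)
```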

On 2/1/07, Christos Hatzis [EMAIL PROTECTED] wrote:
 Hi,

 I was wondering if there is a direct approach for lining up 2-column 
 matrices according to the values of the first column.  An example and 
 a brute-force approach is given below:

 x <- cbind(1:10, runif(10))
 y <- cbind(5:14, runif(10))
 z <- cbind((-4):5, runif(10))

 xx <- seq( min(c(x[,1],y[,1],z[,1])), max(c(x[,1],y[,1],z[,1])), 1)
 w <- cbind(xx, matrix(rep(0, 3*length(xx)), ncol=3))

 w[ xx >= x[1,1] & xx <= x[10,1], 2 ] <- x[,2]
 w[ xx >= y[1,1] & xx <= y[10,1], 3 ] <- y[,2]
 w[ xx >= z[1,1] & xx <= z[10,1], 4 ] <- z[,2]

 w

 I appreciate any pointers.

 Thanks.

 Christos Hatzis, Ph.D.
 Nuvera Biosciences, Inc.
 400 West Cummings Park
 Suite 5350
 Woburn, MA 01801
 Tel: 781-938-3830
 www.nuverabio.com


Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Marc Schwartz

Christos,

Thanks for the follow up.  Thought I had something, but apparently not.

Question: What is the actual structure of the nmr.spectra.serum objects?
The indexing approach that you have suggests they are

Re: [R] Lining up x-y datasets based on values of x

2007-02-01 Thread Christos Hatzis

Marc,

The data structure is a list of data frames generated from read.table:

> class(nmr.spectra.serum)
[1] "list"
> class(nmr.spectra.serum[[1]])
[1] "data.frame"
> dim(nmr.spectra.serum[[1]])
[1] 32768     2

Converting the data.frames to matrices does not have much of an effect on
timing.

-Christos


[R] Regression trees with an ordinal response variable

2007-02-01 Thread Stacey Buckelew

Hi,

I am working on a regression tree in rpart that uses a continuous response
variable that is ordered.  I read a previous response by Prof. Ripley to an
inquiry in 2003 regarding the ability of rpart to handle ordinal responses.
At that time rpart was unable to implement an algorithm to handle
ordinal responses.  Has there been any effort to rectify this in recent
years?

Thanks!

Stacey



On Mon, 2 Jun 2003, Andreas Christmann wrote:
  1. RE: Ordinal data - Regression Trees  Proportional Odds
 (Liaw, Andy)

  AFAIK there's no implementation (or description) of tree algorithm
  that handles ordinal response.
 

 Regression trees with an ordinal response variable can be computed with
 SPSS Answer Tree 3.0.
They *can* be handled by tree or rpart in R.
I think Andy's point was that there is no consensus as to the right way to
handle them: certainly using the codes of categories works and may often
be reasonable, and treating ordinal responses as categorical is also very
often perfectly adequate.
Note that rpart is user-extensible, so it would be reasonably easy to write
an extension for a proportional-odds logistic regression model, if that is
thought appropriate (and it seems strange to me to impose such strong
structure on the model with such a general `linear predictor': POLR
models are often in my experience a poor reflection of real problems).
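
The two workarounds mentioned above (using the category codes as a numeric response, or treating the ordinal response as plain categorical) might look roughly like this with rpart. This is only a sketch on simulated data; the variable names, cut points and sample size are made up for illustration and say nothing about which approach suits a given problem.

```r
library(rpart)

set.seed(42)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)

# Simulated ordinal response with three ordered levels (hypothetical).
score <- cut(x1 + rnorm(n, sd = 0.2),
             breaks = c(-Inf, 0.33, 0.66, Inf),
             labels = c("low", "mid", "high"),
             ordered_result = TRUE)

d <- data.frame(x1, x2,
                score_num = as.integer(score),              # codes 1, 2, 3
                score_cat = factor(score, ordered = FALSE)) # unordered factor

# (a) Use the category codes as a numeric response: regression tree.
fit_num <- rpart(score_num ~ x1 + x2, data = d, method = "anova")

# (b) Ignore the ordering: classification tree.
fit_cls <- rpart(score_cat ~ x1 + x2, data = d, method = "class")

print(fit_num)
print(fit_cls)
```

Option (a) assumes the levels are equally spaced, while option (b) discards the ordering altogether, which is the trade-off described above.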
-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272860 (secr)
Oxford OX1 3TG, UKFax:  +44 1865 272595
