date:20130503

Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package

2013-05-03 Thread jpm miao

Hi Anthony,

   Thank you very much. It works very well. However, after this line

 temp <- sapply( temp , as.numeric )

   the data becomes a series of numbers instead of a matrix. Is there any
way to keep it a matrix?

   Thanks,

Miao




> temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
    startRow=2, endRow=11, startCol=2, endCol=5)
> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
> temp
      Col1     Col2   Col3    Col4
 [1,] "647853" "1413" "57662" "27897"
 [2,] "491400" "1365" "40919" "20411"
 [3,] "38604"  "-"    "5505"  "985"
 [4,] "576"    "-"    "20"    "54"
 [5,] "80845"  "21"   "10211" "4494"
 [6,] "36428"  "27"   "1007"  "1953"
 [7,] "269915" "587"  "32988" "12779"
 [8,] "224494" "-"    "30554" "9184"
 [9,] "11858"  "587"  "-"     "686"
[10,] "3742"   "-"    "81"    "415"
> temp <- sapply( temp , as.numeric )
Warning messages:
1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> temp
647853 491400  38604    576  80845  36428 269915
647853 491400  38604    576  80845  36428 269915
224494  11858   3742   1413   1365      -      -
224494  11858   3742   1413   1365     NA     NA
    21     27    587      -    587      -  57662
    21     27    587     NA    587     NA  57662
 40919   5505     20  10211   1007  32988  30554
 40919   5505     20  10211   1007  32988  30554
     -     81  27897  20411    985     54   4494
    NA     81  27897  20411    985     54   4494
  1953  12779   9184    686    415
  1953  12779   9184    686    415
> temp[ is.na( temp ) ] <- 0
> temp
647853 491400  38604    576  80845  36428 269915
647853 491400  38604    576  80845  36428 269915
224494  11858   3742   1413   1365      -      -
224494  11858   3742   1413   1365      0      0
    21     27    587      -    587      -  57662
    21     27    587      0    587      0  57662
 40919   5505     20  10211   1007  32988  30554
 40919   5505     20  10211   1007  32988  30554
     -     81  27897  20411    985     54   4494
     0     81  27897  20411    985     54   4494
  1953  12779   9184    686    415
  1953  12779   9184    686    415
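
One possible way to keep the rectangular shape, offered as a sketch rather than a definitive fix: once the first sapply() call has turned the data frame into a character matrix, a second sapply() visits every cell separately and so returns a plain named vector. Converting column by column with lapply() and assigning back into temp[] avoids that. The small data frame below is only a stand-in for what readWorksheetFromFile() returns, and to_number() is a hypothetical helper name.

```r
# Stand-in for the data frame that readWorksheetFromFile() returns
# (values copied from the first rows of the sheet shown in this thread).
temp <- data.frame(
  Col1 = c("647,853", "491,400", "38,604"),
  Col2 = c("1,413", "1,365", "-"),
  stringsAsFactors = FALSE
)

# Strip thousands separators, map "-" to NA, then convert one column.
to_number <- function(x) {
  x <- trimws(gsub(",", "", x, fixed = TRUE))
  x[x %in% "-"] <- NA
  as.numeric(x)
}

# Assigning into temp[] replaces the columns but keeps the data frame.
temp[] <- lapply(temp, to_number)

# A numeric matrix, if one is needed, with the same rows and columns.
m <- as.matrix(temp)

# Optional: treat the former "-" cells as zero instead of NA.
m0 <- m
m0[is.na(m0)] <- 0
```

Mapping "-" to NA before as.numeric() also avoids the coercion warnings.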


2013/5/2 Anthony Damico ajdam...@gmail.com

 try adding colTypes = 'numeric' to your readWorksheetFromFile() call

 if that doesn't work, try a few other steps

 # view what data types your file is being read in as
 sapply( temp , class )

 # convert all fields to character if they're factor variables.. but i
 # don't think you need this, readWorksheet defaults to `character`
 temp <- sapply( temp , as.character )

 # you can also convert a subset like this
 temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )

 # remove commas from character strings
 temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )

 # convert all fields to numeric
 temp <- sapply( temp , as.numeric )

 # convert all NA fields to zeroes if you prefer
 temp[ is.na( temp ) ] <- 0





 On Wed, May 1, 2013 at 11:55 PM, jpm miao miao...@gmail.com wrote:

 Hi,

 Attached are two datasheets to be read.
 My raw data 130502temp.xlsx contains numbers with ' symbols, and they
 can't be read as numbers. Even if I copy and paste them as numbers to form
 a new file 130502temp_number1.xlsx, they cannot be read smoothly.

 1. How can I read the datasheet as numbers?
 2. How can I treat the notation "-" as (1) NA or (2) zero?

Thanks,

 Miao




 > temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
     startRow=2, endRow=11, startCol=2, endCol=5)
 > temp
       Col1  Col2   Col3   Col4
 1  647,853 1,413 57,662 27,897
 2  491,400 1,365 40,919 20,411
 3   38,604     -  5,505    985
 4      576     -     20     54
 5   80,845    21 10,211  4,494
 6   36,428    27  1,007  1,953
 7  269,915   587 32,988 12,779
 8  224,494     - 30,554  9,184
 9   11,858   587      -    686
 10   3,742     -     81    415
 > temp[2,2]
 [1] "1,365"
 > temp[2,2]+3
 Error in temp[2, 2] + 3 : non-numeric argument to binary operator
 > temp_num <- readWorksheetFromFile("130502temp_number1.xlsx", sheet=1,
     header=FALSE, startRow=2, endRow=11, startCol=2, endCol=5)
 > temp_num[2,2]
 [1] "1,365"
 > temp_num[2,2]+3
 Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator
 > as.numeric(temp_num[2,2])+3
 [1] NA
 Warning message:
 NAs introduced by coercion

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





Re: [R] Create and read symbolic links in Windows

2013-05-03 Thread Santosh

Thanks for the suggestions. In windows (Windows 7, 64-bit), I couldn't get
file.symlink to work, but file.link did return the result to be TRUE
but at the target location, I did not see any link.

Not sure I am missing anything more.. Hope it's nothing to do with
administrator accounts and administrator rights... Is it something I should
check with my system administrator?

Thanks,
Santosh


On Thu, May 2, 2013 at 12:22 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

 On 02/05/2013 19:50, Santosh wrote:

 Dear Rxperts..
 Got a couple of quick q's..
 I am using R in windows environment (both 32-bit and 64-bit)
 a) Is there a way to create symbolic links to some data files?


 See ?file.symlink.  ??'symbolic link' should have got you there.

 Note that this is not very useful for files, but that is a Windows and not
 an R restriction.


  b) How do I read data from symbolic links?

 The same ways you read data from files.
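
 A minimal sketch of both steps, using temporary files as stand-ins for real data paths. file.symlink() reports failure through its return value (with a warning) rather than an error, so the result is checked before the link is used. On Windows, creating symbolic links is commonly restricted to accounts holding the relevant privilege, which is an assumption about the local setup rather than something stated in this thread.

```r
# Create a small data file to link to (temporary paths, for illustration).
target <- tempfile(fileext = ".csv")
write.csv(data.frame(x = 1:3, y = c("a", "b", "c")), target,
          row.names = FALSE)

link <- tempfile(fileext = ".csv")

# file.symlink() returns TRUE or FALSE instead of stopping on failure,
# so check the result before relying on the link.
created <- file.symlink(from = target, to = link)

if (created) {
  via_link <- read.csv(link)    # read through the link like any other file
  print(Sys.readlink(link))     # shows the path the link points to
} else {
  message("Symbolic link not created; check privileges and R version.")
}
```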


 Thanks so much..
 Santosh



 --
 Brian D. Ripley,                  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,             Tel:  +44 1865 272861 (self)
 1 South Parks Road,                     +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] Create and read symbolic links in Windows

2013-05-03 Thread Prof Brian Ripley


On 03/05/2013 07:33, Santosh wrote:

Thanks for the suggestions. In windows (Windows 7, 64-bit), I couldn't
get file.symlink to work, but file.link did return the result to be
TRUE but at the target location, I did not see any link.

Not sure I am missing anything more.. Hope it's nothing to do with
administrator accounts and administrator rights... Is it something I
should check with my system administrator?


You may need to update your R: although the posting guide asked you to 
do that before posting.  There was a relevant bug fix in 2.15.3.









--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] Likelihood

2013-05-03 Thread Preetam Pal

Hi all,
I have run a regression and want to calculate the likelihood of obtaining
the sample.
Is there a way in which I can use R to get this likelihood value?
Appreciate your help on this.


The following are the details:

raw_ols1=lm(data$LOSS~data$GDP+data$HPI+data$UE)

summary(raw_ols1)



Call:
lm(formula = data$LOSS ~ data$GDP + data$HPI + data$UE)

Residuals:
       Min         1Q     Median         3Q        Max
-0.0023859 -0.0006236  0.0002444  0.0006739  0.0017713

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -3.940e-02  6.199e-03  -6.356 9.54e-06 ***
data$GDP     3.467e-09  7.652e-09   0.453 0.656580
data$HPI     7.935e-05  1.875e-05   4.232 0.000635 ***
data$UE      6.858e-04  2.800e-04   2.449 0.026227 *
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.001198 on 16 degrees of freedom
Multiple R-squared:  0.9528,    Adjusted R-squared:  0.944
F-statistic: 107.8 on 3 and 16 DF,  p-value: 7.989e-11





Thanks and regards,
Preetam


-- 
Preetam Pal
(+91)-9432212774
M-Stat 2nd Year, Room No. N-114
Statistics Division, C.V.Raman Hall
Indian Statistical Institute, B.H.O.S.
Kolkata.


Re: [R] Factors and Multinomial Logistic Regression

2013-05-03 Thread Lorenzo Isella

On Thu, 02 May 2013 22:04:26 +0200, peter dalgaard pda...@gmail.com  
wrote:




On May 2, 2013, at 20:33 , Lorenzo Isella wrote:

On Wed, 01 May 2013 23:49:07 +0200, peter dalgaard pda...@gmail.com  
wrote:



It still doesn't work!




Apologies; since I had already imported nnet in my workspace, the  
script worked on my machine even without importing it explicitly (see  
the script at the end of the email).

Sorry for the confusion.
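
To make the point about loading nnet explicitly concrete, here is a minimal self-contained sketch. It is not the script from this thread: the built-in iris data is only a stand-in with a three-level response, and the formula is an assumption chosen for illustration.

```r
library(nnet)   # multinom() lives here; load it explicitly in every script

# iris is only a stand-in data set with a three-level response.
fit <- multinom(Species ~ Sepal.Length, data = iris, trace = FALSE)

# One row per non-reference level, one column per model term.
coefs <- coef(fit)

# Standard errors in the same layout as the coefficients.
std_errors <- summary(fit)$standard.errors

print(coefs)
print(std_errors)
```

Starting a fresh R session before running a script is a simple way to confirm that it does not depend on packages left over in the workspace.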


You still owe us an answer why you thought that this:

Coefficients:
     (Intercept)     science       socst femalefemale
low     1.912288 -0.02356494 -0.03892428   0.81659717
high   -4.057284  0.02292179  0.04300323  -0.03287211

Std. Errors:
     (Intercept)    science      socst femalefemale
low     1.127255 0.02097468 0.01951649    0.3909804
high    1.222937 0.02087182 0.01988933    0.3500151

Residual Deviance: 388.0697

is at all different from the Stata output. As far as I can tell it is  
EXACTLY the same!


Apologies for being insistent, but this will come up in Internet  
searches as I couldn't make R do what Stata does.




You are right. I must have messed up my workspace...

In any case, the idea that R is somehow inferior to stata never crossed my  
mind.
Rather, I was puzzled because I (not R) could not reproduce an allegedly  
almost textbook-like example I found on the web.

Many thanks for your help.

Lorenzo


[R] untar() error

2013-05-03 Thread Hakim Abdi

Dear List,

I have a list of 600+ *.gz files that I would like to extract and read the
geotiffs contained within them. I tried using the untar() function to
simplify this task but I am stumped by an error. I've combed the Internet
for a solution without luck. The details are below, and any help in solving
this matter is appreciated.

> files = list.files(path = "J:/GIMMS/NDVI", pattern = "data.tif.gz",
    all.files = TRUE, full.names = TRUE, recursive = TRUE, ignore.case = TRUE,
    include.dirs = TRUE)

> lapply(files, untar)
Error in rawToChar(block[seq_len(ns)]) :
  embedded nul in string: 'II*\0Ã \001Â´
\0\0`G\0\0\fn\0\0Â¸â\0\0dÂ»\0\0\020Ã¢\0\0Â¼\b\001\0h/\001\0\024V\001\0Ã|\001\0lÂ£\001\0\030Ã\001\0ÃÃ°\001\0p\027\002\0\034\002\0Ãd\002\0tâ¹\002\0
Â²\002\0ÃÃ\002\0xÃ¿\002\0$\003\0ÃL\003\0|s\003'

> untar(files[1])
Error in rawToChar(block[seq_len(ns)]) :
  embedded nul in string: 'II*\0Ã \001Â´
\0\0`G\0\0\fn\0\0Â¸â\0\0dÂ»\0\0\020Ã¢\0\0Â¼\b\001\0h/\001\0\024V\001\0Ã|\001\0lÂ£\001\0\030Ã\001\0ÃÃ°\001\0p\027\002\0\034\002\0Ãd\002\0tâ¹\002\0
Â²\002\0ÃÃ\002\0xÃ¿\002\0$\003\0ÃL\003\0|s\003'

> untar("J:/GIMMS/NDVI/1981/81aug15a.n07-VIg/81aug15a.n07-VIg_data.tif.gz")
Error in rawToChar(block[seq_len(ns)]) :
  embedded nul in string: 'II*\0Ã \001Â´
\0\0`G\0\0\fn\0\0Â¸â\0\0dÂ»\0\0\020Ã¢\0\0Â¼\b\001\0h/\001\0\024V\001\0Ã|\001\0lÂ£\001\0\030Ã\001\0ÃÃ°\001\0p\027\002\0\034\002\0Ãd\002\0tâ¹\002\0
Â²\002\0ÃÃ\002\0xÃ¿\002\0$\003\0ÃL\003\0|s\003'

> traceback()
3: rawToChar(block[seq_len(ns)])
2: untar2(tarfile, files, list, exdir)
1: untar(files[1])

> sessionInfo()
R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base
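
A possible reading of the error, offered tentatively: the quoted string starts with II*\0, which is the byte signature of a little-endian TIFF. That suggests each *.tif.gz file is a single gzip-compressed GeoTIFF rather than a tar archive, in which case untar() is not the right tool. The sketch below decompresses such a file with base R connections only; gunzip_file() is a hypothetical helper name, not a function from any package.

```r
# Decompress a single gzip-compressed file using base R connections.
# By default the output path is the input path without the ".gz" suffix.
gunzip_file <- function(src, dest = sub("\\.gz$", "", src), chunk = 1e6) {
  input <- gzfile(src, open = "rb")
  on.exit(close(input), add = TRUE)
  output <- file(dest, open = "wb")
  on.exit(close(output), add = TRUE)
  repeat {
    buf <- readBin(input, what = raw(), n = chunk)
    if (length(buf) == 0L) break      # end of the compressed stream
    writeBin(buf, output)
  }
  invisible(dest)
}

# Self-contained demonstration: a fake payload that starts with the
# TIFF signature and contains nul bytes, compressed to a temporary file.
payload <- as.raw(c(0x49, 0x49, 0x2a, 0x00, 1:20))
gz_path <- tempfile(fileext = ".tif.gz")
con <- gzfile(gz_path, open = "wb")
writeBin(payload, con)
close(con)

tif_path <- gunzip_file(gz_path)
restored <- readBin(tif_path, what = raw(), n = 100)
```

With the real data, something like lapply(files, gunzip_file) would write each .tif next to its .gz file, after which the GeoTIFFs can be read with whichever raster package is in use.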



___

Hakim Abdi
Doctoral Student

Physical Geography and Ecosystem Science
Lund University
Sölvegatan 12, 223 62 Lund, Sweden

Office: +46 (0) 46 2223132
Mobile: +46 (0) 73 9300116

Email: hakim.a...@nateko.lu.se


[R] Edmonton course: Regression, GLM & GAM with R intro

2013-05-03 Thread Highland Statistics Ltd




We would like to announce the following statistics course:
Data exploration, regression, GLM & GAM. With introduction to R

When: 26 - 30 August 2013.
Where: Edmonton, Canada

For details, see: http://www.highstat.com/statscourse.htm
Course flyer: http://www.highstat.com/Courses/Flyer2013_09Canada.pdf



Kind regards,

Alain Zuur


[R] significant test of two quadratic regression models (lm)

2013-05-03 Thread Elaine Kuo

Hello,

I am working with two quadratic regression models of the form
y = ax^2 + bx + c, fitted with lm():

y1 = observed migration distance of butterflies (y1 = a1x^2 + b1x + c1)
y2 = predicted migration distance of butterflies, based on body mass
     (y2 = a2x^2 + b2x + c2)
x  = body mass of butterflies

Now I would like to check whether the two regression models differ by
testing whether the coefficients (a, b, c) of the y1 and y2 models differ
(null hypothesis: a1 = a2 and b1 = b2 and c1 = c2).

Please kindly advise on a significance test in R for this purpose.

Also, please kindly advise how to apply the Bonferroni procedure in the
test if necessary.

Thank you in advance.


Elaine
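
One common way to frame this kind of comparison, sketched here with simulated data because the real measurements are not in the message: stack the two responses, fit one model with shared coefficients and one with separate coefficients per response type, and compare them with an F test. All variable names and numbers below are made up for illustration, and the test assumes independent errors, which may not hold if both distances refer to the same individuals.

```r
set.seed(42)
n <- 40
mass <- runif(n, 0.1, 1.0)

# Simulated stand-ins for the observed and predicted distances.
observed  <- 5.0 + 3.0 * mass - 2.0 * mass^2 + rnorm(n, sd = 0.2)
predicted <- 5.5 + 2.0 * mass - 1.0 * mass^2 + rnorm(n, sd = 0.2)

# Stack both responses and label which one each row belongs to.
long <- data.frame(
  mass = rep(mass, 2),
  dist = c(observed, predicted),
  type = factor(rep(c("observed", "predicted"), each = n))
)

# Null model: one common quadratic (a1 = a2, b1 = b2, c1 = c2).
pooled <- lm(dist ~ mass + I(mass^2), data = long)

# Alternative: intercept, linear and quadratic terms all vary by type.
separate <- lm(dist ~ type * (mass + I(mass^2)), data = long)

# Joint F test of the three equality constraints.
cmp <- anova(pooled, separate)
print(cmp)

# Per-coefficient differences with a Bonferroni adjustment.
coefs <- summary(separate)$coefficients
diff_rows <- grepl("^typepredicted", rownames(coefs))
p_raw <- coefs[diff_rows, "Pr(>|t|)"]
p_bonf <- p.adjust(p_raw, method = "bonferroni")
print(p_bonf)
```

The joint test in anova() needs no multiplicity correction by itself; the Bonferroni step only matters when the three differences are examined one at a time.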


Re: [R] Self-developed package -- installation

2013-05-03 Thread Uwe Ligges




On 03.05.2013 07:54, PIKAL Petr wrote:

Hi

Probably others can give you some better insight but copying folder with 
package from one machine to another is possible until the installation is 
required by a new version of R (about each 3 years).


Reinstallation may be required more often: we expect that packages need
to be reinstalled at least whenever x or y increases in a new R-x.y.z
release. In rather rare cases this is also necessary for patch-level
updates. There are cases where reinstallation is not required that often,
but that is not guaranteed.
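
A small sketch of how one might check this for an installed package: the Built field of its DESCRIPTION records the R version it was installed under. The package name "stats" is only a stand-in for the package in question (for example 'ABC'), and the parsing of the Built field assumes its usual "R x.y.z; ..." layout.

```r
pkg <- "stats"   # stand-in; replace with the package of interest

# The "Built" field looks like "R x.y.z; <arch>; <date>; <os type>".
built <- packageDescription(pkg, fields = "Built")
built_version <- sub("^R ([0-9.]+);.*$", "\\1", built)

# Compare the x.y part of the build version with the running R.
running_xy <- as.integer(unlist(getRversion())[1:2])
built_xy <- as.integer(unlist(package_version(built_version))[1:2])
same_xy <- identical(built_xy, running_xy)

if (!same_xy) {
  message(pkg, " was built under R ", built_version,
          "; reinstalling it is advisable.")
}

# update.packages(checkBuilt = TRUE) applies the same idea to packages
# that are available from a repository.
```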


Best,
Uwe Ligges






Petr


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Hui Du
Sent: Thursday, May 02, 2013 6:55 PM
To: r-help@r-project.org
Subject: [R] Self-developed package -- installation

Hi All,

I have a question about package installation in R. We have developed a
package, say 'ABC'. We have installed it in two machines, A and B by
running 'Install Package(s) from local zip file'. Everything was fine.
Right now, suppose that package got damaged in machine A and our zipped
file is gone, My question is that may I directly copy ../library/ABC
from machine B to machine A rather than running 'Install Package(s)
from local zip file' (I don't have that zip file anymore)?


Thanks.

HXD




Re: [R] Likelihood

2013-05-03 Thread S Ellison

 I have run a regression and want to calculate the likelihood 
 of obtaining the sample.
 Is there a way in which I can use R to get this likelihood value?

See ?logLik

And see also ?help.search and ??. You would have found the above by typing 
??likelihood at the command line in R
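
For illustration, a minimal sketch of logLik() on a fitted lm object. The built-in mtcars data stands in for the poster's data, which is not available here. The manual formula is the Gaussian log-likelihood evaluated at the maximum-likelihood estimates, included only as a cross-check.

```r
# Stand-in model; the original was lm(LOSS ~ GDP + HPI + UE) on other data.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Log-likelihood at the fitted parameters.
ll <- logLik(fit)
print(ll)

# Cross-check with the closed form for a Gaussian linear model,
# where the error variance is estimated by RSS / n.
n <- nobs(fit)
rss <- sum(residuals(fit)^2)
ll_manual <- -n / 2 * (log(2 * pi) + log(rss / n) + 1)

# The likelihood itself is exp() of the log-likelihood.
lik <- exp(as.numeric(ll))
```

Because the likelihood of a continuous response is a density value, it can exceed 1 and is usually reported on the log scale.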


S Ellison
 


[R] R does not subset

2013-05-03 Thread Katarzyna Kulma

Hi everyone,

I know there have been several requests regarding subsetting before, but
none of them really helps with my problem:

I'm trying to subset only infected individuals from the REC2 data.frame:

> str(REC2)
'data.frame':   362 obs. of  7 variables:
 $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58 66 17
 $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
 $ ccFLEDGE : int  6 6 6 5 6 7 6 7 6 5 ...
 $ rec2012  : int  2 1 2 2 1 2 1 1 1 0 ...
 $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
 $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
 $ all.rsLD : num  -4.62 -6.19 -3.62 -4.19 -2.62 ...

using either

RECinf <- REC2[which(REC2$INFECTION == "Infected"), ]

or

RECinf <- subset(REC2, INFECTION == "Infected")

in both cases I get empty data frame (0 observations):

> str(RECinf)
'data.frame':   0 obs. of  7 variables:
 $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
 $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
 $ ccFLEDGE : int
 $ rec2012  : int
 $ binage   : Factor w/ 2 levels "ad","juv":
 $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
 $ all.rsLD : num

When subsetting, R doesn't return any warning or error message. Besides, I
have used the same code many times before and it worked perfectly well. Any
ideas why this case is different?

Thanks for your help,
Kasia


Re: [R] R does not subset

2013-05-03 Thread David Kulp

You have an extra space in the INFECTION factor levels.

Use REC2[REC2$INFECTION == "Infected ", ]
or
subset(REC2, INFECTION == "Infected ")

No need to use which() here.
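
An alternative to matching the trailing space each time is to strip it from the factor levels once. A minimal sketch with a small made-up data frame standing in for REC2 (trimws() assumes R 3.2.0 or later):

```r
# Made-up stand-in for REC2 with the same trailing-space problem.
REC2 <- data.frame(
  id = 1:4,
  INFECTION = factor(c("Uninfected ", "Infected ",
                       "Uninfected ", "Infected "))
)

# The label without the space matches nothing, silently.
n_before <- nrow(subset(REC2, INFECTION == "Infected"))

# Remove leading and trailing whitespace from the level labels.
levels(REC2$INFECTION) <- trimws(levels(REC2$INFECTION))

# Now the natural comparison works.
n_after <- nrow(subset(REC2, INFECTION == "Infected"))
```

When the data come from read.table() or read.csv(), the argument strip.white = TRUE can prevent the problem at import time.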


Re: [R] Size of a refClass instance

2013-05-03 Thread David Kulp

Good tip.  Thanks Morgan.
I agree that a different structure might (necessarily) be in order.  I wanted 
to create a tree where nodes in a tree were of different derived sub-classes -- 
possibly holding more data and behaving polymorphically.  OO programming seemed 
ideal for this: lots of small things with specialized behavior -- but this 
isn't R's strength.

On May 2, 2013, at 4:57 PM, Martin Morgan wrote:

 On 05/01/2013 11:20 AM, David Kulp wrote:
 I'm using refClass for a complex multi-directional tree structure with
 possibly 100,000s of nodes.  The refClass design is very impressive and I'd
 love to use it, but I've found that the size of refClass instances are very
 large and creation time is slow.  For example, below is a RefClass and normal
 S4 class.  The RefClass requires about 4KB per instance vs 500B for the S4
 class -- based on adding the Ncells and Vcells of used memory reported by
 gc().  And instantiation is more than twice as slow for a RefClass.  (R
 2.14.2)
 
 Anyone have thoughts on this and whether there's any hope for improving
 resources on either front?
 
 Hi David -- not necessarily helpful but creating a few large objects is 
 always better than creating many small in R, so perhaps re-conceptualize your 
 data structure? As a rough analogy, instead of constructing a graph as a 
 large number of 'Node' instances each pointing to one another, a graph could 
 be represented as a data.frame containing columns of 'from' and 'to' indexes 
 (neighbour-edge list, a few large objects) or as an adjacency matrix. One 
 would also implement creation and update of the few large objects in an 
 R-friendly (vectorized) way.
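
 As a rough sketch of the edge-list idea (the node numbering, column names and helper functions here are illustrative assumptions, not part of any package): the whole tree lives in one data.frame, and queries are vectorized comparisons over its columns instead of method calls on many small objects.

```r
# A small tree stored as one edge list: node 1 is the root.
edges <- data.frame(
  from = c(1L, 1L, 2L, 2L, 3L),
  to   = c(2L, 3L, 4L, 5L, 6L)
)

# Children and parent lookups are plain vectorized subsetting.
children_of <- function(edges, node) edges$to[edges$from == node]
parent_of   <- function(edges, node) edges$from[edges$to == node]

# Number of children for every node at once.
all_nodes <- sort(unique(c(edges$from, edges$to)))
out_degree <- table(factor(edges$from, levels = all_nodes))

# Leaves are the nodes that never appear in the 'from' column.
leaves <- setdiff(all_nodes, edges$from)
```

 Per-node data of different kinds can then sit in further data.frames keyed by node id, with a type column taking the place of subclasses.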
 
 Perhaps there are existing packages that already model the data you're 
 interested in? If your multi-directional tree can be represented as a graph, 
 then perhaps
 
  http://bioconductor.org/packages/release/bioc/html/graph.html
 
 including facilities in the Boost graph library (RBGL, on the Bioconductor 
 web site, too) or the igraph package can be put to use.
 
 Martin
 
 
 I wonder what others are doing.  I've been thinking about lightweight
 alternative implementations, but nothing particularly elegant has come to
 mind, yet!
 
 Thanks!
 
 
 simple <- setRefClass('simple',
                       fields = list(a = 'character', b = 'numeric'))
 gc()
 system.time(simple.list <- lapply(1:10, function(i) {
   simple$new(a='foo',b=i) }))
 gc()

 setClass('simple2', representation(a='character',b='numeric'))
 setMethod("initialize", "simple2", function(.Object, a, b) {
   .Object@a <- a
   .Object@b <- b
   .Object })

 gc()
 system.time(simple2.list <- lapply(1:10, function(i) {
   new('simple2',a='foo',b=i) }))
 gc()
 
 
 
 
 -- 
 Computational Biology / Fred Hutchinson Cancer Research Center
 1100 Fairview Ave. N.
 PO Box 19024 Seattle, WA 98109
 
 Location: Arnold Building M1 B861
 Phone: (206) 667-2793


[R] cURL ?

2013-05-03 Thread jawad hussain

Dear Sir 
I tried to find cURL on the web but could not find a reliable file. There are
some files at http://curl.haxx.se/, but I do not know which one is suitable
for R or how to install it.
Kind Regards 

 
Jawad Hussain Ashraf
VPO Aroop, Tehsil and District Gujranwala
Mobile phone# 03016673275


 Date: Sun, 28 Apr 2013 19:07:05 +0100
 From: rip...@stats.ox.ac.uk
 To: miyanja...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] unsupported url scheme
 
 On 28/04/2013 15:32, jawad hussain wrote:
  fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
  download.file(fileUrl, destfile="./data/Cameras.csv", method="curl")

  I tried it after installing package RCurl but it gives an error message:

  Error in download.file(fileUrl, destfile = "Cameras.csv") :
    unsupported URL scheme

  Can you help me to solve this problem? JAWAD HUSSAIN ASHRAF
 
 
 Yes, simply install a version of cURL which supports that scheme, then 
 re-install RCurl.
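
 A related point, offered as a hedged sketch: download.file(method = "curl") runs the curl command-line program rather than the RCurl package, so it can help to check what the local R installation has before choosing a method. pick_download_method() is a hypothetical helper, and the "libcurl" method assumes R 3.2.0 or later.

```r
# Choose a download method based on what this R installation offers.
pick_download_method <- function() {
  if (isTRUE(capabilities("libcurl")[[1]])) {
    "libcurl"                        # built-in https support
  } else if (nzchar(Sys.which("curl"))) {
    "curl"                           # external curl program found on PATH
  } else {
    "auto"                           # let R decide
  }
}

method <- pick_download_method()
print(method)

fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"

# Not run here because it needs network access:
# dir.create("data", showWarnings = FALSE)
# download.file(fileUrl, destfile = "./data/Cameras.csv", method = method)
```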
 
   
  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 That does apply to you, too.  No HTML, tell us your sessionInfo() 
 
 -- 
 Brian D. Ripley,  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595

  

Re: [R] R does not subset

2013-05-03 Thread Luis Iván Ortiz Valencia

$ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2

It is a factor variable, so it takes numeric values; "Infected " is
assigned the value 1.

subset(REC2, INFECTION == 1)


2013/5/3 Jorge I Velez jorgeivanve...@gmail.com

 Hi Kasia,

 You need

 subset(REC2, INFECTION == "Infected ")

 (note the space after Infected).

 HTH,
 Jorge.-






-- 
Luis Iván Ortiz Valencia
Doutorando Saúde Pública - Epidemiologia, IESC, UFRJ
Estatístico Msc.
Spatial Analyst Msc.


Re: [R] R does not subset

2013-05-03 Thread Katarzyna Kulma

Hi Luis,

thanks for the suggestion, but still nothing:

> RECinf2 <- subset(REC2, INFECTION == 1)
> head(RECinf2)
[1] RINGNO    year      ccFLEDGE  rec2012   binage    INFECTION all.rsLD
<0 rows> (or 0-length row.names)
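
A short illustration of why the comparison with 1 returns no rows, using a small made-up factor: == on a factor compares against the level labels, not the underlying integer codes, so the codes have to be extracted explicitly with as.integer() if they are to be used.

```r
# Made-up factor with the same levels as INFECTION.
f <- factor(c("Uninfected ", "Infected ", "Uninfected "))

# Compared as labels: no label equals "1", so nothing matches.
by_label <- f == 1

# Compared as integer codes: "Infected " sorts first, so its code is 1.
by_code <- as.integer(f) == 1

print(by_label)
print(by_code)
```

Matching on the label (including the trailing space) is usually safer than matching on the code, because the codes depend on the ordering of the levels.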

cheers,
Kasia


Katarzyna Kulma

PhD Student
Department of Ecology and Genetics
Institute of Ecology and Evolution/Animal Ecology
Uppsala University
Norbyvägen 18D
SE-752 36 Uppsala, Sweden

email: katarzyna.ku...@ebc.uu.se
Tel.+46 (0)18 471 2672
Fax.+46 18 471 6484


On 3 May 2013 14:13, Luis Iván Ortiz Valencia liov2...@gmail.com wrote:

 $ INFECTION: Factor w/ 2 levels Infected ,Uninfected : 2 1 2 1 2 2 1 2

 it is a factor variable, so it takes numeric values, for Infected   it
 is assigned value 1.

 subset(REC2,  INFECTION==1)


 2013/5/3 Jorge I Velez jorgeivanve...@gmail.com

 Hi Kasia,

 You need

 subset(REC2,  INFECTION==Infected )

 (note the space after Infected).

 HTH,
 Jorge.-


 On Fri, May 3, 2013 at 7:48 PM, Katarzyna Kulma
 katarzyna.ku...@gmail.com wrote:

  Hi everyone,
 
  I know there have been several requests regarding subsetting before, but
  none of them really helps with my problem:
 
  I'm trying to subset only infected individuals from the REC2 data.frame:
 
   str(REC2)
  'data.frame':   362 obs. of  7 variables:
   $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58 66 17
   $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
   $ ccFLEDGE : int  6 6 6 5 6 7 6 7 6 5 ...
   $ rec2012  : int  2 1 2 2 1 2 1 1 1 0 ...
   $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
   $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
   $ all.rsLD : num  -4.62 -6.19 -3.62 -4.19 -2.62 ...
 
  using either
 
  RECinf <- REC2[which(REC2$INFECTION=="Infected"),]
 
  or
 
  RECinf <- subset(REC2,  INFECTION=="Infected")
 
  in both cases I get an empty data frame (0 observations):
 
   str(RECinf)
  'data.frame':   0 obs. of  7 variables:
   $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
   $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
   $ ccFLEDGE : int
   $ rec2012  : int
   $ binage   : Factor w/ 2 levels "ad","juv":
   $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
   $ all.rsLD : num
 
  When subsetting, R doesn't return any warning or error message. Besides, I
  have used the same code many times before and it worked perfectly well. Any
  ideas why this case is different?
 
  Thanks for your help,
  Kasia
 

Re: [R] R does not subset

2013-05-03 Thread Jorge I Velez

Hi Kasia,

You need

subset(REC2,  INFECTION=="Infected ")

(note the space after Infected).

HTH,
Jorge.-



Re: [R] R does not subset

2013-05-03 Thread Katarzyna Kulma

Jorge, thanks for your suggestions, but they give the same (empty) result:

> RECinf <- subset(REC2,  INFECTION=="Infected")
> head(RECinf)
[1] RINGNO    year      ccFLEDGE  rec2012   binage    INFECTION all.rsLD
<0 rows> (or 0-length row.names)

but David's suggestion worked! :

> RECinf <- REC2[REC2$INFECTION=="Infected ",]
> head(RECinf)
    RINGNO  year ccFLEDGE rec2012 binage INFECTION   all.rsLD
2  BX23298 Y2003        6       1    juv Infected  -6.1938776
4  BT53646 Y2003        5       2     ad Infected  -4.1938776
7  BT53248 Y2003        6       1     ad Infected  -2.1938776
11 BY75833 Y2004        5       0     ad Infected  -4.6574803
13 BX23067 Y2004        6       0     ad Infected  -3.6574803
17 BX24240 Y2004        6       0     ad Infected   0.3425197


still not sure why the subset() function didn't work, though.
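
The likely explanation is that the subset() call above was still typed without the trailing space, while the indexing call included it. The sketch below uses a made-up data frame (`toy` is hypothetical, not the REC2 data) to show that both forms select the same rows once the string matches the level exactly.

```r
# Hypothetical stand-in for REC2 with a trailing space in the level names.
toy <- data.frame(id = 1:4,
                  INFECTION = factor(c("Uninfected ", "Infected ",
                                       "Uninfected ", "Infected ")))

# Printing the levels with quotes makes the trailing space visible.
print(levels(toy$INFECTION))

no_space   <- subset(toy, INFECTION == "Infected")    # label does not match
with_space <- subset(toy, INFECTION == "Infected ")   # label matches exactly
by_index   <- toy[toy$INFECTION == "Infected ", ]     # bracket indexing

c(nrow(no_space), nrow(with_space), nrow(by_index))
```

Checking `levels()` first is a quick way to see what string the comparison actually has to match.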

Thanks for your help!





On 3 May 2013 13:13, David Kulp dk...@fiksu.com wrote:

 You have an extra space in the INFECTION factor levels.

 Use REC2[REC2$INFECTION=="Infected ",]
 or
 subset(REC2, INFECTION=="Infected ")

 No need to use which here.


Re: [R] untar() error

2013-05-03 Thread Prof Brian Ripley


On 03/05/2013 08:31, Hakim Abdi wrote:

Dear List,

I have a list of 600+ *.gz files that I would like to extract and read the
geotiffs contained within them. I tried using the untar() function to
simplify this task but I am stumped by an error. I've combed the Internet
for a solution without luck. The details are below, and any help in solving
this matter is appreciated.


Those are most likely not tar files.  What does file (the command-line 
program contained in Rtools) say they are?
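
If the `file` utility is not at hand, the first bytes can also be inspected from R. The error text below starts with 'II*\0', which is the signature of a little-endian TIFF, so the gzip layer appears to wrap a TIFF directly rather than a tar archive. A minimal sketch, assuming a made-up demo file (`demo_gz` and the helper `sniff_gz` are hypothetical names):

```r
# Build a hypothetical gzip file whose payload begins with the
# little-endian TIFF signature "II*\0" (bytes 49 49 2a 00).
tiff_magic <- as.raw(c(0x49, 0x49, 0x2a, 0x00))
demo_gz <- tempfile(fileext = ".tif.gz")
con <- gzfile(demo_gz, "wb")
writeBin(c(tiff_magic, as.raw(1:16)), con)
close(con)

# Read the first four bytes of the decompressed stream.
sniff_gz <- function(path) {
  con <- gzfile(path, "rb")
  on.exit(close(con))
  readBin(con, what = "raw", n = 4L)
}

header <- sniff_gz(demo_gz)
is_tiff <- identical(header, tiff_magic)
is_tiff
```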





files = list.files(path = "J:/GIMMS/NDVI", pattern = "data.tif.gz",
all.files = TRUE, full.names = TRUE, recursive = TRUE, ignore.case = TRUE,
include.dirs = TRUE)


lapply(files, untar)

Error in rawToChar(block[seq_len(ns)]) :
   embedded nul in string: 'II*\0Ì \001´
\0\0`G\0\0\fn\0\0¸”\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0À|\001\0l£\001\0\030Ê\001\0Äð\001\0p\027\002\0\034\002\0Èd\002\0t‹\002\0
²\002\0ÌØ\002\0xÿ\002\0$\003\0ÐL\003\0|s\003'


untar(files[1])

Error in rawToChar(block[seq_len(ns)]) :
   embedded nul in string: 'II*\0Ì \001´
\0\0`G\0\0\fn\0\0¸”\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0À|\001\0l£\001\0\030Ê\001\0Äð\001\0p\027\002\0\034\002\0Èd\002\0t‹\002\0
²\002\0ÌØ\002\0xÿ\002\0$\003\0ÐL\003\0|s\003'


untar("J:/GIMMS/NDVI/1981/81aug15a.n07-VIg/81aug15a.n07-VIg_data.tif.gz")

Error in rawToChar(block[seq_len(ns)]) :
   embedded nul in string: 'II*\0Ì \001´
\0\0`G\0\0\fn\0\0¸”\0\0d»\0\0\020â\0\0¼\b\001\0h/\001\0\024V\001\0À|\001\0l£\001\0\030Ê\001\0Äð\001\0p\027\002\0\034\002\0Èd\002\0t‹\002\0
²\002\0ÌØ\002\0xÿ\002\0$\003\0ÐL\003\0|s\003'


traceback()

3: rawToChar(block[seq_len(ns)])
2: untar2(tarfile, files, list, exdir)
1: untar(files[1])


sessionInfo()

R version 2.15.2 (2012-10-26)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base



___

Hakim Abdi
Doctoral Student

Physical Geography and Ecosystem Science
Lund University
Sölvegatan 12, 223 62 Lund, Sweden

Office: +46 (0) 46 2223132
Mobile: +46 (0) 73 9300116

Email: hakim.a...@nateko.lu.se





--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595


Re: [R] R does not subset

2013-05-03 Thread Mihai Nica

Hi:

(note the space after Infected)

Since I lost a morning too with this issue, I am just curious, why is there a 
space? 

I know, it must be a dumb question, a reasonable programming rule, but that's 
my level :-)
 
mike




[R] Very basic statistics in R

2013-05-03 Thread Xavier Prudent

Dear all,

Very simple question, but apparently not easy to solve in R:

I have a sample of a variable x: (3, 4, 5, 2, ...)

I want to know:
 - the mean of x   -> mean(x)
 - the uncertainty on x   -> std.error(x) ? Or sd(x)?
 - the standard deviation of x  -> ?
 - the uncertainty on the standard deviation -> ?

Does anyone have an idea?

Thanks in advance,

regards,
Xavier



-- 
Xavier Prudent

Computational biology and evolutionary genomics

Guest scientist at the Max-Planck-Institut für Physik komplexer Systeme
(MPI-PKS)
Noethnitzer Str. 38
01187 Dresden

Max Planck-Institute for Molecular Cell Biology and Genetics
(MPI-CBG)
Pfotenhauerstraße 108
01307 Dresden

Phone: +49 351 210-2621
Mail: prudent [ at ] mpi-cbg.de


[R] Courses: Statistical Analysis with R - Bayesian Data Analysis with R and WinBUGS

2013-05-03 Thread Dr. Pablo E. Verde


Dear list members,

Apologies for cross-posting. Please find below information on
two statistical courses with R:

1) Statistical Analysis with R
2) Bayesian Data Analysis with R and WinBUGS

If you have any questions, don't hesitate to contact me.

Best regards,

Pablo

++
*Two-day course in: Statistical Analysis with R
*Where:  Linux Hotel, Essen-Horst, Germany
*When:
14.06-15.06.2013
22.11-23.11.2013
13.12-14.12.2013
*Instructor:
Dr. Pablo E. Verde
++
*Target audience:
Data analysts with basic knowledge of statistics will benefit from this
course.

The course is intended as a first course in R but not as a first course in
statistics or data analysis.
++
*Course content:
Day 1:
*Introduction to statistical analysis with R
*Classical graphical functions (scatter plots, conditional plots,
histograms, etc.)
*Data management with R (indexing and other advanced techniques)
*Advanced graphical techniques for data analysis: lattice plots and
ggplot2

Day 2:
*Statistical analysis based on computer simulation (bootstrap methods)
*Regression modeling (linear/non-linear/logistic regression)
*Issues in regression modeling (variable selection, model checking,
etc.)

*Prices:
Public sector and commercial: 737.8 Euros (two-day course, VAT included)
Student: 450 Euros (two-day course, VAT included). Some of the courses are
frequently fully booked, so please note that you may have to try several
times before you get a place.
++

++
*Three-day course in: Bayesian Data Analysis with R and WinBUGS
*Where: Linux Hotel, Essen-Horst, Germany
*When:
11.07-13.07.2013
07.11-09.11.2013
*Instructor:
Dr. Pablo E. Verde
++
*Target audience:
This course is for data analysts who are familiar with classical statistics
and want to gain a working knowledge of Bayesian analysis. This is a
three-day intensive training course with 8 hours per day, including lectures
and exercises. The course presentation is practical, with many worked
examples. To attend the course you do NOT need experience with R or with
WinBUGS. Lectures are given in English. Discussions can be in English,
German or Spanish.
++
*Course content:
Day 1
*Lecture 1: Introduction to Bayesian Inference
*Lecture 2: Bayesian analysis for single parameter models
*Lecture 3: Prior distributions: univariate

Day 2
*Lecture 4: Bayesian analysis for multiple parameter models
*Lecture 5: An introduction to WinBUGS
*Lecture 6: Multivariate models with WinBUGS

Day 3
*Lecture 7: An introduction to MCMC computations
*Lecture 8: Bayesian regression with WinBUGS
*Lecture 9: Introduction to Hierarchical Statistical modeling

*Prices:
Public sector and commercial: 1088.85 Euros (three-day course, VAT included)
Student: 675 Euros (three-day course, VAT included). Some of the courses are
frequently fully booked, so please note that you may have to try several
times before you get a place.

++
**For more information, please contact:  i...@linuxhotel.de
++


Re: [R] cURL ?

2013-05-03 Thread Jeff Newmiller

If you don't know, we certainly don't. This is not a question about R or RCurl 
anymore... it is a question about cURL. You need to know what operating system 
your computer uses and how to enable SSL for cURL on that operating system... 
perhaps you need local technical assistance.
---
Jeff Newmiller                        The     .       .  Go Live...
DCN:jdnew...@dcn.davis.ca.us          Basics: ##.#.   ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.   #.O#.  with
/Software/Embedded Controllers)               .OO#.   .OO#.  rocks...1k
---
Sent from my phone. Please excuse my brevity.

jawad hussain miyanja...@hotmail.com wrote:

Dear Sir,
I tried to find cURL on the web but could not find a reliable file; there
are some files on http://curl.haxx.se/, but I do not know which one is
suitable for R or how to install it.
Kind Regards

 
Jawad Hussain Ashraf 
VPO Aroop, Tehsil and District Gujranwala
Mobile phone# 03016673275


 Date: Sun, 28 Apr 2013 19:07:05 +0100
 From: rip...@stats.ox.ac.uk
 To: miyanja...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] unsupported url scheme
 
 On 28/04/2013 15:32, jawad hussain wrote:
  fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
  download.file(fileUrl, destfile="./data/Cameras.csv", method="curl")
  I tried it after installing package RCurl but it gives an error message:
  Error in download.file(fileUrl, destfile = "Cameras.csv") :
    unsupported URL scheme
  Can you help me to solve this problem?
  JAWAD HUSSAIN ASHRAF
 
 
 Yes, simply install a version of cURL which supports that scheme, then
 re-install RCurl.
 
  
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 That does apply to you, too.  No HTML, tell us your sessionInfo()

 
 -- 
 Brian D. Ripley,  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595
   
   

Re: [R] R does not subset

2013-05-03 Thread Jeff Newmiller

This typically occurs because of sloppy manual data entry outside of R. To 
relieve further analysis pain, you can manually clean the data (usually only 
effective for one-time analyses) or use R to fix problems right after loading 
the data (there are multiple methods for doing this... I prefer using ?sub on 
character data before creating the factor).
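
A minimal sketch of the sub() approach described above, assuming the column is still character data when it is cleaned (`raw_infection` is a made-up vector standing in for the real column):

```r
# Hypothetical raw values as they might arrive from a spreadsheet export,
# with stray trailing spaces.
raw_infection <- c("Uninfected ", "Infected ", "Uninfected ", "Infected ")

# Strip trailing whitespace before the factor is created.
clean <- sub("[[:space:]]+$", "", raw_infection)
infection <- factor(clean)

print(levels(infection))
n_infected <- sum(infection == "Infected")
n_infected
```

If the data come in through read.table() or read.csv(), the strip.white = TRUE argument is another option for unquoted fields.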

Mihai Nica mihain...@yahoo.com wrote:

Hi:

(note the space after Infected)

Since I lost a morning too with this issue, I am just curious, why is
there a space?

I know, it must be a dumb question, a reasonable programming rule, but
that's my level :-)

mike




Re: [R] Very basic statistics in R

2013-05-03 Thread S Ellison

 

  - the mean x   -> mean(x)
  - the uncertainty on x   -> std.error(x) ? Or sd(x)?
  - the standard deviation of x  -> ?
  - the uncertainty on the standard deviation -> ?
 
 Anyone has an idea?

1. Use R's help system to look up 'standard deviation' and 'mean'
e.g.:
??'standard deviation' 
??'mean'

For the other two questions, consult your basic stats textbook; the answers can 
be calculated from the two above together with the number of observations.
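
For illustration, here is one way the four quantities could be computed in base R. The data are made up, and the last formula is the usual large-sample approximation for normally distributed data, which is an assumption and not something stated in the thread:

```r
# Made-up sample standing in for the poster's x.
x <- c(3, 4, 5, 2, 6, 4, 3, 5)
n <- length(x)

m <- mean(x)             # sample mean
s <- sd(x)               # sample standard deviation (n - 1 denominator)
se_mean <- s / sqrt(n)   # standard error of the mean

# Approximate standard error of the standard deviation,
# assuming roughly normal data.
se_sd <- s / sqrt(2 * (n - 1))

c(mean = m, sd = s, se_mean = se_mean, se_sd = se_sd)
```

For clearly non-normal data, a bootstrap estimate of the uncertainty on sd(x) may be more appropriate than the closed-form approximation.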


[R] print multiple plots to jpeg, one lattice and one ggplot2

2013-05-03 Thread Christophe Bouffioux

hello everybody,

I want to print two plots to one png file. I tried several options but
didn't succeed: the first plot (bwplot) prints at the defined position, but
the second (ggplot) doesn't.
Any idea?
Thanks a lot
Christophe


#   Example:
#-

library(ggplot2)
library(lattice)
library(grid)

one <- bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
   panel = "panel.superpose",
   panel.groups = "panel.linejoin",
   xlab = "treatment",
   key = list(lines = Rows(trellis.par.get("superpose.line"),
      c(1:7, 1)),
      text = list(lab = as.character(unique(OrchardSprays$rowpos))),
      columns = 4, title = "Row position"))


df <- data.frame(gp = factor(rep(letters[1:3], each = 10)),
                 y = rnorm(30))
# Compute sample mean and standard deviation in each group
library(plyr)
ds <- ddply(df, .(gp), summarise, mean = mean(y), sd = sd(y))

two <- ggplot(df, aes(x = gp, y = y)) +
  geom_point() +
  geom_point(data = ds, aes(y = mean),
             colour = 'red', size = 3)



# 1. not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
     height = 400, units="px", res=100)
print(one, position=c(0,0,0.5,1), more=TRUE)
print(two, position=c(0.5,0,1,1), )
dev.off()


# 2 not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
     height = 400, units="px", res=100)
 grid.newpage()
 pushViewport(viewport(layout = grid.layout(1, 2)))

  print(one, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
# this does not work
  print(two, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
dev.off()



# 3 not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
     height = 400, units="px", res=100)
 par(mfrow=c(1,2))
  one
  two
dev.off()
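
One possible approach, sketched under assumptions and not a confirmed answer from the list: lattice and ggplot2 both draw with grid, so par(mfrow) has no effect on them, and each print method handles page placement differently. The sketch below pushes a grid layout, lets lattice draw into the current viewport with newpage = FALSE, and passes a viewport to the ggplot2 print method. The plots are simplified and `out_file` is a hypothetical temporary path.

```r
library(lattice)
library(ggplot2)
library(grid)

# Simplified versions of the two plots from the post.
one <- bwplot(decrease ~ treatment, OrchardSprays)
df  <- data.frame(gp = factor(rep(letters[1:3], each = 10)), y = rnorm(30))
two <- ggplot(df, aes(x = gp, y = y)) + geom_point()

out_file <- tempfile(fileext = ".png")   # hypothetical output path
png(out_file, width = 600, height = 400, units = "px", res = 100)
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))

# Left cell: with newpage = FALSE, lattice draws into the current viewport.
pushViewport(viewport(layout.pos.row = 1, layout.pos.col = 1))
print(one, newpage = FALSE)
popViewport()

# Right cell: the ggplot2 print method accepts a viewport via vp.
print(two, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
dev.off()
```

The device here is png() so that it matches the .png file extension; jpeg() with a .png name would write JPEG data into a file labelled as PNG.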


Re: [R] Calculating distance matrix for large dataset

2013-05-03 Thread David Carlson

Here's the result on R 3.0.0 64 bit under Windows 8:

> A <- matrix(1:365000*144,nrow=365000,ncol=144)
> dim(A)
[1] 365000    144
> d <- dist(mydata_nor, method = "euclidean")
Error in as.matrix(x) : object 'mydata_nor' not found
> d <- dist(A, method = "euclidean")
Error: cannot allocate vector of size 496.3 Gb
In addition: Warning messages:
1: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)
2: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)
3: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)
4: In dist(A, method = "euclidean") :
  Reached total allocation of 8078Mb: see help(memory.size)

Your message suggests that your system could not accurately compute the
memory requirements. Unless you have access to a computer with about 500
gigabytes of memory, you need to consider alternative approaches such as
aggregating the data into longer time blocks or using kmeans.
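
The 496.3 Gb figure can be checked by hand: dist() stores one double per pair of rows, that is n(n-1)/2 values of 8 bytes each. A short sketch of the arithmetic follows; the link to the "negative length vectors" error is a plausible reading, not something confirmed in the thread.

```r
n <- 365000                      # number of rows in the data

n_pairs <- n * (n - 1) / 2       # distances in the lower triangle
bytes   <- n_pairs * 8           # 8 bytes per double
gib     <- bytes / 1024^3

round(gib, 1)                    # 496.3, the size reported in the error

# The pair count is far beyond the largest 32-bit integer, which is a
# plausible source of the "negative length vectors" message.
too_big_for_int <- n_pairs > .Machine$integer.max
too_big_for_int
```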

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of HJ YAN
Sent: Thursday, May 2, 2013 6:02 PM
To: r-help@r-project.org
Subject: [R] Calculating distance matrix for large dataset

Dear R users


I wondered if any of you have ever tried to calculate a distance matrix for
a very large data set, and if anyone out there can confirm that this error
message actually means that my data is too large for this task:

negative length vectors are not allowed


My data size and code used:

> dim(mydata_nor)
[1] 365000    144
> d <- dist(mydata_nor, method = "euclidean")



Here my data has 1000 samples, each with a year of data observed at
10-minute intervals daily, so the size is (365 * 1000) * 144.


I checked the manual for the function 'dist' but cannot see the upper limit
on size, and I bet there is one, so any hints are appreciated.


I would also be grateful if any other method for calculating a distance
matrix for a large dataset could be advised.



I appreciate that reproducible code should be provided, so try the code
below if needed:

> A <- matrix(1:365000*144,nrow=365000,ncol=144)
> dim(A)
[1] 365000    144
> d1 <- dist(A,method="euclidean")
Error in dist(A, method = "euclidean") :
  negative length vectors are not allowed




Many thanks in advance!

HJ


Re: [R] Very basic statistics in R

2013-05-03 Thread Jeff Newmiller

I recommend you read the Introduction to R document that comes with R. Look for 
making vectors with the c() function, and using the mean() and sd() functions. 

Note that this is not a homework help forum (read the Posting Guide mentioned 
at the bottom of every message). If this is not homework, you are going to need 
to do quite a bit of self study before you can ask questions clearly enough to 
get useful responses on this list. See

http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

Xavier Prudent prudentxav...@gmail.com wrote:

Dear all,

Very simple question, but apparently uneasy to solve in R:

I have a sampling of a variable x: (3, 4. 5, 2, ...)

I want to know:
 - the mean x   - mean(x)
 - the uncertainty on x   - std.error(x) ? Or sd(x)?
 - the standard deviation of x  - ?
 - the uncertainty on the standard deviation - ?

Anyone has an idea?

Thanks in advance,

regards,
Xavier


Re: [R] untar() error

2013-05-03 Thread Jeff Newmiller

untar != gunzip
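
The quoted error text starts with 'II*\0', which is the TIFF signature, so each .gz file is probably a single gzip-compressed GeoTIFF rather than a tar archive. A minimal base-R sketch of decompressing such a file follows; gunzip_file() is a hypothetical helper name, and the tiny file written here only stands in for the real data.

```r
# Hypothetical helper: decompress one .gz file with base R connections.
gunzip_file <- function(src, dest = sub("\\.gz$", "", src), chunk = 1e6) {
  inp <- gzfile(src, open = "rb")          # reading through gzfile() decompresses
  on.exit(close(inp))
  out <- file(dest, open = "wb")
  on.exit(close(out), add = TRUE)
  repeat {
    buf <- readBin(inp, what = "raw", n = chunk)
    if (length(buf) == 0) break            # end of input
    writeBin(buf, out)
  }
  invisible(dest)
}

# Stand-in file: four bytes (the TIFF signature), gzip-compressed.
src <- tempfile(fileext = ".tif.gz")
con <- gzfile(src, open = "wb")
writeBin(as.raw(c(0x49, 0x49, 0x2a, 0x00)), con)
close(con)

tif <- gunzip_file(src)
readBin(tif, what = "raw", n = 10)         # the four original bytes
```

For the list in the question, something like lapply(files, gunzip_file) would then produce the .tif files next to the archives.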
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

On 03/05/2013 08:31, Hakim Abdi wrote:
 Dear List,

 I have a list of 600+ *.gz files that I would like to extract and read the
 geotiffs contained within them. I tried using the untar() function to
 simplify this task but I am stumped by an error. I've combed the Internet
 for a solution without luck. The details are below, and any help in solving
 this matter is appreciated.

Those are most likely not tar files.  What does file (the command-line 
program contained in Rtools) say they are?


 files = list.files(path = "J:/GIMMS/NDVI", pattern = "data.tif.gz",
 all.files = TRUE, full.names = TRUE, recursive = TRUE, ignore.case = TRUE,
 include.dirs = TRUE)

 lapply(files, untar)
 Error in rawToChar(block[seq_len(ns)]) :
embedded nul in string: 'II*\0ÃŒ \001Â´

\0\0`G\0\0\fn\0\0Â¸â€\0\0dÂ»\0\0\020Ã¢\0\0Â¼\b\001\0h/\001\0\024V\001\0Ã€|\001\0lÂ£\001\0\030ÃŠ\001\0Ã„Ã°\001\0p\027\002\0\034\002\0Ãˆd\002\0tâ€¹\002\0
 Â²\002\0ÃŒÃ˜\002\0xÃ¿\002\0$\003\0ÃL\003\0|s\003'

 untar(files[1])
 Error in rawToChar(block[seq_len(ns)]) :
embedded nul in string: 'II*\0ÃŒ \001Â´

\0\0`G\0\0\fn\0\0Â¸â€\0\0dÂ»\0\0\020Ã¢\0\0Â¼\b\001\0h/\001\0\024V\001\0Ã€|\001\0lÂ£\001\0\030ÃŠ\001\0Ã„Ã°\001\0p\027\002\0\034\002\0Ãˆd\002\0tâ€¹\002\0
 Â²\002\0ÃŒÃ˜\002\0xÃ¿\002\0$\003\0ÃL\003\0|s\003'


 untar("J:/GIMMS/NDVI/1981/81aug15a.n07-VIg/81aug15a.n07-VIg_data.tif.gz")
 Error in rawToChar(block[seq_len(ns)]) :
embedded nul in string: 'II*\0ÃŒ \001Â´

\0\0`G\0\0\fn\0\0Â¸â€\0\0dÂ»\0\0\020Ã¢\0\0Â¼\b\001\0h/\001\0\024V\001\0Ã€|\001\0lÂ£\001\0\030ÃŠ\001\0Ã„Ã°\001\0p\027\002\0\034\002\0Ãˆd\002\0tâ€¹\002\0
 Â²\002\0ÃŒÃ˜\002\0xÃ¿\002\0$\003\0ÃL\003\0|s\003'

 traceback()
 3: rawToChar(block[seq_len(ns)])
 2: untar2(tarfile, files, list, exdir)
 1: untar(files[1])

 sessionInfo()
 R version 2.15.2 (2012-10-26)
 Platform: x86_64-w64-mingw32/x64 (64-bit)

 locale:
 [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United
 States.1252LC_MONETARY=English_United States.1252 LC_NUMERIC=C

 [5] LC_TIME=English_United States.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base



 ___

 Hakim Abdi
 Doctoral Student

 Physical Geography and Ecosystem Science
 Lund University
 Sölvegatan 12, 223 62 Lund, Sweden

 Office: +46 (0) 46 2223132
 Mobile: +46 (0) 73 9300116

 Email: hakim.a...@nateko.lu.se


Re: [R] Size of a refClass instance

2013-05-03 Thread Jeff Newmiller

Interesting conclusion. Alternatively, that representation of your object model 
may not be computationally effective. This discrepancy may be less exaggerated 
in C++, but you may still find that large numbers of objects are less efficient 
in their use of memory or CPU time than vector processing even there. I would 
read the point of Martin's response as "Don't confuse your mental model of the 
solution with its implementation."
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

David Kulp dk...@fiksu.com wrote:

Good tip.  Thanks Morgan.
I agree that a different structure might (necessarily) be in order.  I
wanted to create a tree where nodes in a tree were of different derived
sub-classes -- possibly holding more data and behaving polymorphically.
OO programming seemed ideal for this: lots of small things with
specialized behavior -- but this isn't R's strength.

On May 2, 2013, at 4:57 PM, Martin Morgan wrote:

 On 05/01/2013 11:20 AM, David Kulp wrote:
 I'm using refClass for a complex multi-directional tree structure with
 possibly 100,000s of nodes.  The refClass design is very impressive and I'd
 love to use it, but I've found that the size of refClass instances is very
 large and creation time is slow.  For example, below is a RefClass and normal
 S4 class.  The RefClass requires about 4KB per instance vs 500B for the S4
 class -- based on adding the Ncells and Vcells of used memory reported by
 gc().  And instantiation is more than twice as slow for a RefClass.  (R
 2.14.2)
 
 Anyone have thoughts on this and whether there's any hope for improving
 resources on either front?
 
 Hi David -- not necessarily helpful but creating a few large objects is
 always better than creating many small in R, so perhaps re-conceptualize
 your data structure? As a rough analogy, instead of constructing a graph as
 a large number of 'Node' instances each pointing to one another, a graph
 could be represented as a data.frame containing columns of 'from' and 'to'
 indexes (neighbour-edge list, a few large objects) or as an adjacency
 matrix. One would also implement creation and update of the few large
 objects in an R-friendly (vectorized) way.
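
A minimal sketch of the 'from'/'to' representation described above, using a made-up six-node tree. The node ids, labels, and variable names are all assumptions for illustration.

```r
# Hypothetical 6-node tree stored as one edge table instead of 6 node objects.
edges <- data.frame(from = c(1L, 1L, 2L, 2L, 3L),
                    to   = c(2L, 3L, 4L, 5L, 6L))

# Children of every node at once (a named list keyed by parent id).
children <- split(edges$to, edges$from)
children[["2"]]                  # 4 5

# Parent of every node via one vectorized assignment; the root keeps NA.
parent <- rep(NA_integer_, 6)
parent[edges$to] <- edges$from
parent                           # NA 1 1 2 2 3

# Per-node data lives in parallel vectors rather than inside each object.
node_label <- c("root", "a", "b", "a1", "a2", "b1")
node_label[children[["1"]]]      # "a" "b"
```

Node-specific behaviour can then be dispatched on a type column of a node table rather than on the class of thousands of small objects.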
 
 Perhaps there are existing packages that already model the data you're
 interested in? If your multi-directional tree can be represented as a
 graph, then perhaps
 
  http://bioconductor.org/packages/release/bioc/html/graph.html
 
 including facilities in the Boost graph library (RBGL, on the Bioconductor
 web site, too) or the igraph package can be put to use.
 
 Martin
 
 
 I wonder what others are doing.  I've been thinking about lightweight
 alternative implementations, but nothing particularly elegant has come to
 mind, yet!
 
 Thanks!
 
 
 simple <- setRefClass('simple', fields = list(a = 'character', b = 'numeric'))
 gc()
 system.time(simple.list <- lapply(1:10, function(i) {
   simple$new(a='foo',b=i) }))
 gc()
 
 setClass('simple2', representation(a='character',b='numeric'))
 setMethod('initialize', 'simple2', function(.Object, a, b) {
   .Object@a <- a
   .Object@b <- b
   .Object })
 
 gc()
 system.time(simple2.list <- lapply(1:10, function(i) {
   new('simple2',a='foo',b=i) }))
 gc()
 
 
 
 
 -- 
 Computational Biology / Fred Hutchinson Cancer Research Center
 1100 Fairview Ave. N.
 PO Box 19024 Seattle, WA 98109
 
 Location: Arnold Building M1 B861
 Phone: (206) 667-2793


Re: [R] Write date class as number of days from 1970

2013-05-03 Thread arun

Hi,
Maybe this helps:
set.seed(24)
dat1 <- data.frame(date1=sample(seq(as.Date("2012-09-14",format="%Y-%m-%d"),length.out=40,by="day"),20,replace=FALSE),
 value=sample(1:60,20,replace=TRUE))
dat1$days1 <- as.numeric(difftime(dat1$date1,as.Date("1970-01-01")))
#or
library(lubridate)
dat1$days2 <- days(dat1$date1)$day
head(dat1)
#       date1 value days1 days2
#1 2012-09-25     6 15608 15608
#2 2012-09-22    34 15605 15605
#3 2012-10-10    44 15623 15623
#4 2012-10-03     9 15616 15616
#5 2012-10-07    14 15620 15620
#6 2012-10-16    42 15629 15629
#or
library(chron)
as.numeric(as.chron(dat1$date1)-chron(0))
# [1] 15608 15605 15623 15616 15620 15629 15606 15622 15631 15604 15615 15607
#[13] 15626 15624 15635 15619 15601 15598 15636 15599
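
A shorter route in base R, sketched under the assumption that the column is a plain Date: a Date is stored internally as the number of days since 1970-01-01, so as.numeric() returns exactly that count. The data frame and file name below are made up for illustration.

```r
# Made-up data with a Date column.
dat <- data.frame(date1 = as.Date(c("1970-01-01", "2012-09-25", "2012-10-16")),
                  value = c(1, 6, 42))

# Convert the Date column to days since 1970-01-01 just before writing.
out <- transform(dat, date1 = as.numeric(date1))
out$date1                                    # 0 15608 15629

out_file <- tempfile(fileext = ".csv")
write.csv(out, out_file, row.names = FALSE)

# Reading back and restoring the dates.
back <- read.csv(out_file)
as.Date(back$date1, origin = "1970-01-01")
```

The original data frame is left untouched, so the Date class is still available for any later date arithmetic.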


A.K.

Dear all, 

I have a dataset with one column being of class Date. When I 
write the output, I would like that column being written as number of 
days from 1970-01-01. I could not find anywhere a way to do it. 

Thanks, 
Marco


[R] read .csv file and plot a graph

2013-05-03 Thread Vahe nr

Hi all,

I have a big .csv file (21Mb with 100 rows) it has this shape:
x
1 NaN
2 NaN
3 0.23

and so on.

So the first column has x as a header then row number, the second column
contains values between -1,1 and NaN for empty values.

What I need to do is: create a new .csv file from this one excluding
NaN values and plot a line graph using the new .csv file.

Or can I use the old .csv file to plot a graph excluding NaN values?
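
A minimal sketch of both steps. The file layout (a first column of row numbers and a second column named x) is assumed from the description above, and the tiny file written here only stands in for the real 21Mb one.

```r
# Build a tiny stand-in for the large file (assumed layout: row number, then x).
csv_in  <- tempfile(fileext = ".csv")
csv_out <- tempfile(fileext = ".csv")
writeLines(c('"","x"', '"1",NaN', '"2",NaN', '"3",0.23', '"4",-0.5'), csv_in)

dat <- read.csv(csv_in, row.names = 1)        # NaN is read as a numeric NaN
clean <- dat[!is.na(dat$x), , drop = FALSE]   # is.na() is TRUE for NaN as well as NA
write.csv(clean, csv_out)                     # new file without the NaN rows

# Line graph of the remaining values against their original row numbers.
pdf(tempfile(fileext = ".pdf"))
plot(as.integer(rownames(clean)), clean$x, type = "l", xlab = "row", ylab = "x")
dev.off()
```

Plotting the original data directly with plot(dat$x, type = "l") also works, but the line is then broken wherever a NaN occurs.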

Thanks in advance for any help or suggestions.

Regards,
 Vahe


[R] Write date class as number of days from 1970

2013-05-03 Thread Manta

Dear all,

I have a dataset with one column being of class Date. When I write the
output, I would like that column being written as number of days from
1970-01-01. I could not find anywhere a way to do it.

Thanks,
Marco




Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package

2013-05-03 Thread Anthony Damico

sorry, i had assumed readWorksheetFromFile would give you back a data
frame.  all of the operations i recommended work on data.frame objects

at different points in the code, check if it's a data.frame or a matrix..

class( temp )

..you can check its current class at any point.


and if it's a matrix, you can convert it to a data frame with

temp <- as.data.frame( temp )
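
The flattening happens because sapply() on a matrix visits every cell separately and returns one long vector. A minimal sketch of a conversion that keeps the rectangular shape, using a small made-up data frame in place of the worksheet; to_number() is a hypothetical helper name.

```r
# Stand-in for the worksheet: character columns with commas and "-" placeholders.
temp <- data.frame(Col1 = c("647,853", "38,604", "576"),
                   Col2 = c("1,413", "-", "-"),
                   stringsAsFactors = FALSE)

# Hypothetical helper: strip commas, treat "-" as missing, convert to numeric.
to_number <- function(x) {
  x <- gsub(",", "", x)      # drop thousands separators
  x[x == "-"] <- NA          # mark "-" as missing, which avoids coercion warnings
  as.numeric(x)
}

temp[] <- lapply(temp, to_number)   # assigning into temp[] keeps the data.frame
temp[is.na(temp)] <- 0              # optional: turn missing values into zeroes
temp
```

If the object is already a character matrix, apply(temp, 2, as.numeric) converts column by column and returns a matrix again.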







On Fri, May 3, 2013 at 2:00 AM, jpm miao miao...@gmail.com wrote:

 Hi Anthony,

Thank you very much. It works very well. However, after this line

  temp - sapply( temp , as.numeric )

the data becomes a series of numbers instead of a matrix. Is there any
 way to keep it a matrix?

Thanks,

 Miao




  temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
 startRow=2, endRow= 11, startCol=2, endCol=5)
  temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
  temp
       Col1     Col2   Col3    Col4
  [1,] "647853" "1413" "57662" "27897"
  [2,] "491400" "1365" "40919" "20411"
  [3,] "38604"  "-"    "5505"  "985"
  [4,] "576"    "-"    "20"    "54"
  [5,] "80845"  "21"   "10211" "4494"
  [6,] "36428"  "27"   "1007"  "1953"
  [7,] "269915" "587"  "32988" "12779"
  [8,] "224494" "-"    "30554" "9184"
  [9,] "11858"  "587"  "-"     "686"
 [10,] "3742"   "-"    "81"    "415"
  temp <- sapply( temp , as.numeric )
 Warning messages:
 1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
 2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
 3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
 4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
 5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
  temp
 647853 491400  38604    576  80845  36428 269915
 647853 491400  38604    576  80845  36428 269915
 224494  11858   3742   1413   1365      -      -
 224494  11858   3742   1413   1365     NA     NA
     21     27    587      -    587      -  57662
     21     27    587     NA    587     NA  57662
  40919   5505     20  10211   1007  32988  30554
  40919   5505     20  10211   1007  32988  30554
      -     81  27897  20411    985     54   4494
     NA     81  27897  20411    985     54   4494
   1953  12779   9184    686    415
   1953  12779   9184    686    415
  temp[ is.na( temp ) ] <- 0
  temp
 647853 491400  38604    576  80845  36428 269915
 647853 491400  38604    576  80845  36428 269915
 224494  11858   3742   1413   1365      -      -
 224494  11858   3742   1413   1365      0      0
     21     27    587      -    587      -  57662
     21     27    587      0    587      0  57662
  40919   5505     20  10211   1007  32988  30554
  40919   5505     20  10211   1007  32988  30554
      -     81  27897  20411    985     54   4494
      0     81  27897  20411    985     54   4494
   1953  12779   9184    686    415
   1953  12779   9184    686    415


 2013/5/2 Anthony Damico ajdam...@gmail.com

 try adding colTypes = 'numeric' to your readWorksheetFromFile() call

 if that doesn't work, try a few other steps

 # view what data types your file is being read in as
 sapply( temp , class )

 # convert all fields to character if they're factor variables.. but i
 # don't think you need this, readWorksheet defaults to `character`
 temp <- sapply( temp , as.character )

 # you can also convert a subset like this
 temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )

 # remove commas from character strings
 temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )

 # convert all fields to numeric
 temp <- sapply( temp , as.numeric )

 # convert all NA fields to zeroes if you prefer
 temp[ is.na( temp ) ] <- 0





 On Wed, May 1, 2013 at 11:55 PM, jpm miao miao...@gmail.com wrote:

 Hi,

    Attached are two datasheets to be read.
    My raw data 130502temp.xlsx contains numbers with ' symbols, and they
 can't be read as numbers. Even if I copy and paste as numbers to form a new
 file 130502temp_number1.xlsx, they could not be read smoothly.

1. How can I read the datasheet as numbers?
2. How can I treat the notation - as (1) NA or (2) zero?

Thanks,

 Miao




  temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
 startRow=2, endRow= 11, startCol=2, endCol=5)
  temp
       Col1  Col2   Col3   Col4
 1  647,853 1,413 57,662 27,897
 2  491,400 1,365 40,919 20,411
 3   38,604     -  5,505    985
 4      576     -     20     54
 5   80,845    21 10,211  4,494
 6   36,428    27  1,007  1,953
 7  269,915   587 32,988 12,779
 8  224,494     - 30,554  9,184
 9   11,858   587      -    686
 10   3,742     -     81    415
  temp[2,2]
 [1] "1,365"
  temp[2,2]+3
 Error in temp[2, 2] + 3 : non-numeric argument to binary operator
  temp_num <- readWorksheetFromFile("130502temp_number1.xlsx", sheet=1,
 header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5)
  temp_num[2,2]
 [1] "1,365"
  temp_num[2,2]+3
 Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator
  as.numeric(temp_num[2,2])+3
 [1] NA
 Warning message:
 NAs introduced by coercion

Re: [R] print multiple plots to jpeg, one lattice and one ggplot2

2013-05-03 Thread Felipe Carrillo

Something like this?
library(gridExtra)
grid.arrange(one,two)
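
A sketch of how this could be combined with a file device. Both lattice and ggplot2 draw with grid graphics, so base settings such as par(mfrow) have no effect on them, and a file opened with jpeg() contains JPEG data even when its name ends in .png. The helper name save_side_by_side() is hypothetical, and the sketch assumes the gridExtra package is installed.

```r
# Hypothetical helper: draw two grid-based plots side by side into one PNG.
# Assumes the gridExtra package is installed.
save_side_by_side <- function(one, two, file, width = 600, height = 400) {
  png(filename = file, width = width, height = height, units = "px")
  on.exit(dev.off())                           # close the device even on error
  gridExtra::grid.arrange(one, two, ncol = 2)  # one row, two columns
  invisible(file)
}

# Usage with the objects from the question:
# save_side_by_side(one, two, "fig03_profiltot.png")
```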

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish & Wildlife Service
California, USA
http://www.fws.gov/redbluff/rbdd_jsmp.aspx




From: Christophe Bouffioux christophe@gmail.com
To: r-help@r-project.org r-help@r-project.org 
Sent: Friday, May 3, 2013 6:33 AM
Subject: [R] print multiple plots to jpeg, one lattice and one ggplot2


hello everybody,

I want to print two plots in one png file, I tried several options but i
didn't succeed
the first plot (bwplot) print to the defined position, but the second
(ggplot) doesn't
Any idea?
Thanks a lot
Christophe


#  Example:
#-

library(ggplot2)
library(lattice)
library(grid)

one <- bwplot(decrease ~ treatment, OrchardSprays, groups = rowpos,
      panel = panel.superpose,
      panel.groups = panel.linejoin,
      xlab = "treatment",
      key = list(lines = Rows(trellis.par.get("superpose.line"),
                  c(1:7, 1)),
                  text = list(lab = as.character(unique(OrchardSprays$rowpos))),
                  columns = 4, title = "Row position"))


df <- data.frame(gp = factor(rep(letters[1:3], each = 10)),
                y = rnorm(30))
# Compute sample mean and standard deviation in each group
library(plyr)
ds <- ddply(df, .(gp), summarise, mean = mean(y), sd = sd(y))

two <- ggplot(df, aes(x = gp, y = y)) +
    geom_point() +
    geom_point(data = ds, aes(y = mean),
              colour = 'red', size = 3)



# 1. not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
height = 400, units="px", res=100)
    print(one, position=c(0,0,0.5,1), more=TRUE)
    print(two, position=c(0.5,0,1,1), )
dev.off()


# 2 not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
height = 400, units="px", res=100)
grid.newpage()
pushViewport(viewport(layout = grid.layout(1, 2)))

      print(one, vp = viewport(layout.pos.row = 1, layout.pos.col = 1))
# this does not work
      print(two, vp = viewport(layout.pos.row = 1, layout.pos.col = 2))
dev.off()



# 3 not working
jpeg(file=paste(pathgraph,'/fig03_profiltot','.png',sep=''),width = 600,
height = 400, units="px", res=100)
    par(mfrow=c(1,2))
      one
      two
dev.off()

    [[alternative HTML version deleted]]


[R] Declare a set (list?) of many dataframes or matrices

2013-05-03 Thread jpm miao

Hi,

   I would like to read several datasets and would like to create a set
(list? sequence?) of many empty dataframes. How could this be done? How
could I declare a  set (list? sequence?) of many empty matrices?

   Thanks,

Miao


[R] Why can't R understand if(num!=NA)?

2013-05-03 Thread jpm miao

I have a program, when I write

if(num!=NA)

it yields an error message.

However, if I write

if(is.na(num)==FALSE)

it works.

Why doesn't the first statement work?

Thanks,

Miao


Re: [R] cURL ?

2013-05-03 Thread R. Michael Weylandt

On Fri, May 3, 2013 at 11:31 AM, jawad hussain miyanja...@hotmail.com wrote:
 Dear Sir
 I tried to find cURL on web but I do not find reliable file; there are some 
 files on http://curl.haxx.se/. But I do not know which is suitable for R and 
 how to install?
 Kind Regards

As usual, the OS is relevant here. What are you running?

Linux package managers should be able to handle this for you. And I'd
have guessed this was a "Just works" for OS X.

MW



 Jawad Hussain Ashraf
 VPO Aroop, Tehsil and District GujranwalaMobile phone# 03016673275


 Date: Sun, 28 Apr 2013 19:07:05 +0100
 From: rip...@stats.ox.ac.uk
 To: miyanja...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] unsupported url scheme

 On 28/04/2013 15:32, jawad hussain wrote:
  fileUrl <- "https://data.baltimorecity.gov/api/views/dz54-2aru/rows.csv?accessType=DOWNLOAD"
  download.file(fileUrl, destfile="./data/Cameras.csv", method="curl")
  I tried it after installing package RCurl but it gives the error message:
  Error in download.file(fileUrl, destfile = "Cameras.csv") :
    unsupported URL scheme
  Can you help me to solve this problem?
  JAWAD HUSSAIN ASHRAF


 Yes, simply install a version of cURL which supports that scheme, then
 re-install RCurl.


  [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide 
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

 That does apply to you, too.  No HTML, tell us your sessionInfo() 

 --
 Brian D. Ripley,  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford, Tel:  +44 1865 272861 (self)
 1 South Parks Road, +44 1865 272866 (PA)
 Oxford OX1 3TG, UKFax:  +44 1865 272595


[R] MANOVA summary.manova(m) : residuals have rank

2013-05-03 Thread Ozgul Inceoglu

Dear All, I am trying to perform MANOVA. I have a table with 504 columns (species)
and 36 rows, with two grouping factors (season and location).

Zx <- Z[c(4:504)]
Zxm <- as.matrix(Z)
m <- manova(Zxm~Season*location, data=Z)

When I run summary.aov, I get a response for each species, but summary.manova fails:
Error in summary.manova(m) : residuals have rank 24 < 501

What can be the reason for this error message?
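
The message compares the rank of the residual matrix with the number of response columns: the multivariate tests need at least as many residual degrees of freedom as responses, and 36 samples cannot supply that for several hundred species. A small simulated sketch of the same situation (all data here are random and the variable names are made up):

```r
set.seed(1)
n <- 12                                   # samples
p <- 15                                   # response columns, more than n
Y <- matrix(rnorm(n * p), nrow = n, ncol = p)
g <- factor(rep(c("a", "b", "c"), each = 4))

fit <- manova(Y ~ g)
# summary() stops because the residuals cannot have full rank here;
# keep the error text instead of aborting.
res <- tryCatch(summary(fit), error = function(e) conditionMessage(e))
res

# With fewer responses than residual degrees of freedom the test runs.
fit_ok <- manova(Y[, 1:3] ~ g)
summary(fit_ok)
```

With this many species, reducing the responses first (for example to a few ordination axes) or using a permutation-based method is a common way around the limit; which is appropriate depends on the question being asked.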

Thank you,

Ozgul

 Below you can see part of the table.
name        Season  location  Acetobacter Aerococcus  Alishewanella   Amaricoccus
xls-nord-01 J   w   0   0,024078979 0   0
bxls-sud-01 J   w   0   0   0   0
brux-nord-04 A   w   0   0   0   0
brux-sud-04 A   w   0   0   0   0
br-nord-07  Ju  w   0   0   0   0
br-sud-07   Ju  w   0   0   0   0
b-nord-10   O   w   0   0   0   0
bsud-10 O   w   0,107836089 0   0,107836089 0,035945363
Z1-01   J   u   0   0   0   0,040567951
Z3-01   J   u   0   0   0   0
Z5-01   J   d   0,023116043 0   0   0
Z7-01   J   d   0,014130281 0   0   0
Z9-01   J   d   0   0   0   0
Z10-01  J   d   0   0   0   0
Z12-01  J   d   0   0   0   0
Z1-04   A   u   0   0   0   0
Z3-04   A   u   0   0   0   0
Z5-04   A   d   0   0   0   0
Z7-04   A   d   0   0   0   0
Z9-04   A   d   0   0,013839873 0   0
Z10-04  A   d   0   0   0   0
Z12-04  A   d   0   0   0   0
Z1-07   Ju  u   0   0   0   0
Z3-07   Ju  u   0   0   0   0
Z5-07   Ju  d   0   0   0   0
Z7-07   Ju  d   0   0   0   0
Z9-07   Ju  d   0   0   0   0
Z10-07  Ju  d   0   0   0   0
Z12-07  Ju  d   0   0,022301517 0   0
Z1-10   O   u   0   0   0   0
Z3-10   O   u   0   0   0   0
Z5-10   O   d   0   0   0   0
Z7-10   O   d   0   0   0,052924054 0
Z9-10   O   d   0   0   0,035050824 0
Z10-10  O   d   0   0   0   0,040783034
Z12-10  O   d   0   0   0   0


[R] Is it a Headless problem? - Same code runs well in interactive R shell, but never terminates with Rscript

2013-05-03 Thread Asis Hallab

Dear R-Experts,

I seem to be dealing with a so-called "headless" problem in R.

I wrote a quite extensive program that generates a Bayesian network
from a query protein's Phylogenetic Tree and subsequently uses a
message passing algorithm to infer the most likely annotation for the
query leaf in the tree using the other leaves known -and proven-
protein function annotations.

The program uses the following libraries:
library(tools)
library(Biostrings)
library(RCurl)
library(stringr)
library(ape)
library(gRain) # gRain implements the message passing algorithm
library(RMySQL)
library(XML)
library(parallel)
library(brew)
library(xtable)

When the program is run from the command line as:
Rscript prog.r inp.file
with certain input data inp it gets stuck and does not terminate
ever. Memory usage sky-rockets and the process spends almost all of
its time on system calls.

Using the identical R code inside an interactive R shell with the very
same input data inp the script does not have any problems and
finishes actually amazingly fast.

I am flabbergasted and do require help.
Hence my questions:

* Is anything known about a problem similar to mine appearing when
using the above libraries?

* What is the difference -aside from the obvious missing
interactiveness- between running the very same R code inside an
interactive R shell or inside a file as an argument to Rscript?

* Does my problem indeed fall into the headless category?
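
One documented difference that may be worth ruling out: in R versions before 3.5.0, Rscript does not attach the methods package by default, while an interactive session does, so code relying on S4 or reference classes can behave differently. The sketch below only collects session facts for comparison; session_facts() is a hypothetical helper, and running it in both settings is a suggestion, not a diagnosis.

```r
# Collect a few facts about the running session so that the interactive and
# Rscript runs can be compared side by side.
session_facts <- function() {
  list(
    interactive  = interactive(),                      # FALSE under Rscript
    attached     = search(),                           # attached packages/environments
    default_pkgs = Sys.getenv("R_DEFAULT_PACKAGES"),   # empty unless set
    args         = commandArgs(trailingOnly = TRUE)    # script arguments
  )
}

facts <- session_facts()
str(facts)
```

If "package:methods" is missing from the attached list under Rscript, adding library(methods) at the top of the script makes the two environments match in that respect.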

The problem occurs in
R version 2.15.2 (2012-10-26) -- Trick or Treat
on Debian 6.0.2
uname -or gives
3.2.0-0.bpo.3-amd64 GNU/Linux

Any help will be much appreciated.
Have a pleasant day!


Re: [R] Package survey: singularities in linear regression models

2013-05-03 Thread Sebastian Weirich

Well, I have uploaded the data in the public folder of my dropbox. Due 
to data confidentiality, I had to change the labels. To load the data:

con <- url("http://dl.dropboxusercontent.com/u/101865137/datEx.rda")
print(load(con))

# The replicate weights were created according to the jackknife (JK2)
# procedure in the same way as implemented in WesVar.
# According to 100 JK zones, 100 replicate weights result. The replicate
# weights are labelled totwgtM_1 to totwgtM_100.
# The regression I want to specify is achievement on group and origin.
# Both predictors are factors.

library(survey)
design <- svrepdesign(data = datEx[, c("origin", "group", "achievement")],
    weights = datEx[, "pweight"], type = "JKn", scale = 1, rscales = 1,
    repweights = datEx[, grep("^totwgtM_", colnames(datEx))],
    combined.weights = TRUE, mse = TRUE)

# This works
mod1 <- svyglm(formula = achievement ~ origin + group, design = design,
    return.replicates = FALSE, family = gaussian(link = "identity"))

# I get the error message when specifying the interaction
mod2 <- svyglm(formula = achievement ~ origin * group, design = design,
    return.replicates = FALSE, family = gaussian(link = "identity"))

# The output of the conventional glm() function reports singularities
# for one coefficient of the interaction
mod3 <- glm(formula = achievement ~ origin * group, data = datEx,
    family = gaussian(link = "identity"))
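
One common cause of a single NA interaction coefficient is a combination of factor levels that never occurs in the data. Whether that applies to datEx is not known from this thread; the sketch below uses simulated data with made-up level names to show how such an empty cell appears in a cross-tabulation and in the glm() coefficients.

```r
set.seed(42)
# Two factors crossed, five observations per cell ...
d <- expand.grid(origin = c("a", "b"), group = c("x", "y", "z"), rep = 1:5)
# ... then one cell is removed entirely.
d <- d[!(d$origin == "b" & d$group == "z"), ]
d$achievement <- rnorm(nrow(d))

table(d$origin, d$group)   # the zero count marks the empty cell

fit <- glm(achievement ~ origin * group, data = d)
coef(fit)                  # the coefficient for the empty cell is NA
```

Running table(datEx$origin, datEx$group) on the real data would show quickly whether this is the situation here.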

Thanks again,
Sebastian

-- 
Sebastian Weirich, Dipl.-Psych.

Institut zur Qualitätsentwicklung im Bildungswesen
Humboldt-Universität zu Berlin
Sitz: Hannoversche Straße 19, 10115 Berlin
Postadresse: Unter den Linden 6, 10099 Berlin

Tel: +49-(0)30-2093-46512

Am 02.05.2013 22:02, schrieb Thomas Lumley:
 On Fri, May 3, 2013 at 2:27 AM, Sebastian Weirich
 sebastian.weir...@iqb.hu-berlin.de wrote:

 Hello,

 I want to specify a linear regression model in which the metric
 outcome is predicted by two factors and their interaction. glm()
 computes effects for each factor level and the levels of the
 interaction. In the case of singularities glm() displays NA for
 the corresponding coefficients. However, svyglm() aborts with an
 error message. Is there a possibility that svyglm() provides
 output for coefficients without singularities like glm()?


 It's not true that svyglm() aborts with an error message whenever 
 there are singularities, eg

  svyglm(enroll~stype+I(stype),design=dclus1)
 1 - level Cluster Sampling design
 With (15) clusters.
 svydesign(id = ~dnum, weights = ~pw, data = apiclus1, fpc = ~fpc)

 Call:  svyglm(formula = enroll ~ stype + I(stype), design = dclus1)

 Coefficients:
 (Intercept)       stypeH       stypeM    I(stype)H    I(stype)M
       432.9        697.4        464.9           NA           NA

 Degrees of Freedom: 182 Total (i.e. Null);  12 Residual
 Null Deviance:   2483
 Residual Deviance: 1512 AIC: 2599


 So, perhaps you could show us what you actually did, and what actually 
 happened, as the posting guidelines request.

 -thomas

 -- 
 Thomas Lumley
 Professor of Biostatistics
 University of Auckland



Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread S Ellison

 

 -Original Message-
 if(num!=NA)
 it yields an error message.

 Why doesn't the first statement work?
Because you just compared something with NA (usually interpreted as 'missing'),
and because of that the comparison result is also NA.
'if' then tells you that you have a missing value where you need either TRUE or
FALSE.
Play with
num!=NA #returns NA
and
if(NA) "Not there"  #returns error

is.na() returns TRUE for NAs, so 'if' knows what to do with the answer.
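
The same behaviour as a short script that can be run without stopping at the error; the variable names are only illustrative.

```r
num <- 1
cmp <- num != NA                 # NA, not TRUE or FALSE

# if() needs TRUE or FALSE, so an NA condition raises an error;
# capture the error text instead of stopping.
msg <- tryCatch(
  if (num != NA) "never reached",
  error = function(e) conditionMessage(e)
)
msg

# is.na() always answers TRUE or FALSE, so it is safe inside if().
status <- if (!is.na(num)) "not missing" else "missing"
status                           # "not missing"
```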

S Ellison


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread Leandro Marino

You can use only

if(!is.na(num))


Re: [R] (no subject)

2013-05-03 Thread David Winsemius


On May 2, 2013, at 4:15 PM, T P Kharel wrote:

 I have posted a R copula question yesterday but it is not accepted yet. How
 long does it take?

Generally moderated postings are accepted within 4-6 hours, usually sooner.

 I am waiting if some one can help me on my Copula
 package related question. Thanks

I do not see any posting from a sender with a name containing the letters 
"kharel" on May 1, 2, or 3 in the archives, and since I just cleared the 
moderation queue it was not waiting there.  Some postings from non-subscribed 
individuals are tossed away automatically by the spam filter and are never seen 
by the moderators as they (we) process the moderation queue. But in your case I 
see that you have subscribed. I am unable to explain why your posting did not 
reach the list. You should be able to see whether your posting was received by 
looking at the May 2013 threads at: https://stat.ethz.ch/pipermail/r-help/

   [[alternative HTML version deleted]]

The HTML notice is evidence that you have not yet understood parts of the 
Posting Guide.

 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA


Re: [R] Declare a set (list?) of many dataframes or matrices

2013-05-03 Thread Rui Barradas


Hello,

I can't say I understand the question, but if you want a list of empty 
dfs and a list of empty matrices, the following will do.


replicate(10, data.frame())
replicate(10, matrix(NA, nrow = 0, ncol = 0))
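
If the goal is to fill these containers while reading files, one possible pattern is to build the list first and assign into it afterwards. This is only a sketch: the names n_sets, dfs and mats are made up for illustration, and simplify = FALSE is passed so that replicate() returns a plain list without attempting any simplification.

```r
n_sets <- 10  # hypothetical number of datasets

# A list of ten empty data frames and a list of ten empty (0 x 0) matrices
dfs  <- replicate(n_sets, data.frame(), simplify = FALSE)
mats <- replicate(n_sets, matrix(NA, nrow = 0, ncol = 0), simplify = FALSE)

# Elements can later be overwritten one at a time, e.g. inside a loop
dfs[[1]] <- data.frame(x = 1:3, y = c("a", "b", "c"))

length(dfs)                       # 10
all(sapply(dfs, is.data.frame))   # TRUE
dim(mats[[1]])                    # 0 0
```

In practice it is often simpler to skip the empty placeholders and let lapply() over the file names return the list of data frames directly.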


Hope this helps,

Rui Barradas

On 03-05-2013 16:20, jpm miao wrote:

Hi,

I would like to read several datasets and would like to create a set
(list? sequence?) of many empty dataframes. How could this be done? How
could I declare a  set (list? sequence?) of many empty matrices?

Thanks,

Miao


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread David Carlson

A logical operation involving NA returns NA, never TRUE or FALSE:

See the 8th Circle of the R Inferno (8.1.4):

http://www.burns-stat.com/pages/Tutor/R_inferno.pdf

> num <- 1
> num == NA
[1] NA
> is.na(num)
[1] FALSE
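
As a small illustration of how this fits into an if() statement, here is a sketch with a made-up helper, describe_num(), that is not part of the original thread. It tests for missingness with is.na() and, for a vector, reduces the test to a single value before branching.

```r
# Hypothetical helper: describe a single value that may be missing
describe_num <- function(num) {
  if (!is.na(num)) {
    paste("value is", num)
  } else {
    "value is missing"
  }
}

describe_num(1)    # "value is 1"
describe_num(NA)   # "value is missing"

# For a vector, reduce to a single TRUE/FALSE before using it in if()
x <- c(0, NA, 1, 3)
if (anyNA(x)) {
  n_missing <- sum(is.na(x))   # 1
}
```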

-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of jpm miao
Sent: Friday, May 3, 2013 10:25 AM
To: r-help
Subject: [R] Why can't R understand if(num!=NA)?

I have a program, when I write

if(num!=NA)

it yields an error message.

However, if I write

if(is.na(num)==FALSE)

it works.

Why doesn't the first statement work?

Thanks,

Miao


Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package

2013-05-03 Thread David Winsemius


On May 2, 2013, at 11:00 PM, jpm miao wrote:

> Hi Anthony,
>
>    Thank you very much. It works very well. However, after this line
>
> temp <- sapply( temp , as.numeric )
>
>    the data becomes a series of numbers instead of a matrix. Is there any
> way to keep it a matrix?

Perhaps (assuming this was a data.frame to be coerced):

temp <- matrix( sapply( temp , as.numeric ), dim(temp)[1])

But the persistence of the "-"s is puzzling. You should (as always) have
posted the output from dput(temp).
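
A possible explanation for the persisting "-" entries, offered as a sketch rather than a diagnosis of the actual file: when sapply() runs over a character matrix it visits every cell and uses the original strings as names of the result, so the printout alternates a row of names with a row of values. The data frame below is a made-up stand-in for the worksheet.

```r
# Hypothetical stand-in for the worksheet: character columns with commas and "-"
temp <- data.frame(Col1 = c("647,853", "38,604", "576"),
                   Col2 = c("1,413", "-", "-"),
                   stringsAsFactors = FALSE)

# sapply() over a data frame returns a character matrix (3 x 2 here)
cleaned <- sapply(temp, function(x) gsub(",", "", x))

# sapply() over that matrix visits each cell, so the result is a flat vector
# whose names are the original strings (hence a "-" printed above each NA)
flat <- suppressWarnings(sapply(cleaned, as.numeric))

# Converting the whole matrix at once and restoring its shape keeps it a matrix
num <- suppressWarnings(
  matrix(as.numeric(cleaned), nrow = nrow(cleaned), dimnames = dimnames(cleaned))
)
num[is.na(num)] <- 0   # treat "-" as zero
```

Here num is a 3 x 2 numeric matrix with the column names preserved, while flat is a named vector of length 6.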



>    Thanks,
>
> Miao
>
> temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
>   startRow=2, endRow= 11, startCol=2, endCol=5)
> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
> temp
>       Col1     Col2   Col3    Col4
>  [1,] "647853" "1413" "57662" "27897"
>  [2,] "491400" "1365" "40919" "20411"
>  [3,] "38604"  "-"    "5505"  "985"
>  [4,] "576"    "-"    "20"    "54"
>  [5,] "80845"  "21"   "10211" "4494"
>  [6,] "36428"  "27"   "1007"  "1953"
>  [7,] "269915" "587"  "32988" "12779"
>  [8,] "224494" "-"    "30554" "9184"
>  [9,] "11858"  "587"  "-"     "686"
> [10,] "3742"   "-"    "81"    "415"
> temp <- sapply( temp , as.numeric )
> Warning messages:
> 1: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 2: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 3: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 4: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> 5: In lapply(X = X, FUN = FUN, ...) : NAs introduced by coercion
> temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365     NA     NA
>     21     27    587      -    587      -  57662
>     21     27    587     NA    587     NA  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>     NA     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
> temp[ is.na( temp ) ] <- 0
> temp
> 647853 491400  38604    576  80845  36428 269915
> 647853 491400  38604    576  80845  36428 269915
> 224494  11858   3742   1413   1365      -      -
> 224494  11858   3742   1413   1365      0      0
>     21     27    587      -    587      -  57662
>     21     27    587      0    587      0  57662
>  40919   5505     20  10211   1007  32988  30554
>  40919   5505     20  10211   1007  32988  30554
>      -     81  27897  20411    985     54   4494
>      0     81  27897  20411    985     54   4494
>   1953  12779   9184    686    415
>   1953  12779   9184    686    415
 
 
> 2013/5/2 Anthony Damico ajdam...@gmail.com
>
>> try adding colTypes = 'numeric' to your readWorksheetFromFile() call
>>
>> if that doesn't work, try a few other steps
>>
>> # view what data types your file is being read in as
>> sapply( temp , class )
>>
>> # convert all fields to character if they're factor variables.. but i
>> # don't think you need this, readWorksheet defaults to `character`
>> temp <- sapply( temp , as.character )
>>
>> # you can also convert a subset like this
>> temp[ , c( 1 , 3:4 ) ] <- sapply( temp[ , c( 1 , 3:4 ) ] , as.character )
>>
>> # remove commas from character strings
>> temp <- sapply( temp , function( x ) gsub( ',' , '' , x ) )
>>
>> # convert all fields to numeric
>> temp <- sapply( temp , as.numeric )
>>
>> # convert all NA fields to zeroes if you prefer
>> temp[ is.na( temp ) ] <- 0
 
 
 
 
 
>> On Wed, May 1, 2013 at 11:55 PM, jpm miao miao...@gmail.com wrote:
>>
>>> Hi,
>>>
>>>    Attached are two datasheets to be read.
>>>    My raw data 130502temp.xlsx contains numbers with "," symbols, and they
>>> can't be read as numbers. Even if I copy and paste as numbers to form a
>>> new file 130502temp_number1.xlsx, they could not be read smoothly.
>>>
>>>    1. How can I read the datasheet as numbers?
>>>    2. How can I treat the notation "-" as (1) NA or (2) zero?
>>>
>>>    Thanks,
>>>
>>> Miao
>>>
>>> temp <- readWorksheetFromFile("130502temp.xlsx", sheet=1, header=FALSE,
>>>   startRow=2, endRow= 11, startCol=2, endCol=5)
>>> temp
>>>       Col1  Col2   Col3   Col4
>>> 1  647,853 1,413 57,662 27,897
>>> 2  491,400 1,365 40,919 20,411
>>> 3   38,604     -  5,505    985
>>> 4      576     -     20     54
>>> 5   80,845    21 10,211  4,494
>>> 6   36,428    27  1,007  1,953
>>> 7  269,915   587 32,988 12,779
>>> 8  224,494     - 30,554  9,184
>>> 9   11,858   587      -    686
>>> 10   3,742     -     81    415
>>> temp[2,2]
>>> [1] "1,365"
>>> temp[2,2]+3
>>> Error in temp[2, 2] + 3 : non-numeric argument to binary operator
>>> temp_num <- readWorksheetFromFile("130502temp_number1.xlsx", sheet=1,
>>>   header=FALSE, startRow=2, endRow= 11, startCol=2, endCol=5)
>>> temp_num[2,2]
>>> [1] "1,365"
>>> temp_num[2,2]+3
>>> Error in temp_num[2, 2] + 3 : non-numeric argument to binary operator
>>> as.numeric(temp_num[2,2])+3
>>> [1] NA
>>> Warning message:
>>> NAs introduced by coercion

Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread Marc Schwartz


On May 3, 2013, at 10:24 AM, jpm miao miao...@gmail.com wrote:

> I have a program, when I write
>
> if(num!=NA)
>
> it yields an error message.
>
> However, if I write
>
> if(is.na(num)==FALSE)
>
> it works.
>
> Why doesn't the first statement work?
>
> Thanks,
>
> Miao


NA is undefined:

> NA == NA
[1] NA

> NA != NA
[1] NA


Therefore the equality you are attempting does not return a TRUE or FALSE 
result, it is unknown and NA is returned. ?is.na was designed specifically to 
test for the presence of an NA value and return a TRUE or FALSE result which 
can then be tested.

See: http://cran.r-project.org/doc/manuals/r-release/R-intro.html#Missing-values


Regards,

Marc Schwartz


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread David Winsemius


On May 3, 2013, at 8:24 AM, jpm miao wrote:

> I have a program, when I write
>
> if(num!=NA)
>
> it yields an error message.
>
> However, if I write
>
> if(is.na(num)==FALSE)
>
> it works.
>
> Why doesn't the first statement work?

Read the manual:

  ?NA


-- 

David Winsemius
Alameda, CA, USA


Re: [R] MANOVA summary.manova(m) : residuals have rank

2013-05-03 Thread peter dalgaard


On May 3, 2013, at 14:59 , Ozgul Inceoglu wrote:

> Dear All, I am trying to perform MANOVA. I have a table with 504
> columns (species) and 36 rows, with two grouping factors (season and location)
>
> Zx <- Z[c(4:504)]
> Zxm <- as.matrix(Z)
> m <- manova(Zxm~Season*location, data=Z)
>
> When I do summary.aov, I get a response for each species, but summary.manova gives
> summary.manova(m) : residuals have rank 24 < 501.
>
> What can be the reason for this error message?

Too many columns and too few rows. Multivariate tests require more degrees of
freedom than response variables.
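
The constraint can be reproduced on simulated data. The sketch below is an illustration only: the balanced 4 x 3 layout with 3 samples per cell and the random responses are assumptions, not the poster's data. With 36 rows and 12 cell means there are 24 residual degrees of freedom, so 30 response columns fail while 5 succeed.

```r
set.seed(1)
# Hypothetical balanced design: 4 seasons x 3 locations, 3 samples per cell
season   <- factor(rep(c("J", "A", "Ju", "O"), each = 9))
location <- factor(rep(c("w", "u", "d"), times = 12))

Y_wide  <- matrix(rnorm(36 * 30), nrow = 36)  # 30 responses, 36 rows
Y_small <- Y_wide[, 1:5]                      # 5 responses

fit_wide  <- manova(Y_wide  ~ season * location)
fit_small <- manova(Y_small ~ season * location)

df.residual(fit_wide)   # 24 = 36 rows - 12 cell means

# 30 responses > 24 residual df: summary() stops with a rank error
res_wide  <- tryCatch(summary(fit_wide), error = function(e) conditionMessage(e))
# 5 responses < 24 residual df: the multivariate test can be computed
res_small <- summary(fit_small)
```

With 501 species columns, some form of dimension reduction or a method designed for community data would be needed before a multivariate test of this kind becomes feasible.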


 
 Thank you,
 
 Ozgul
 
 Below you can see part of the table.
 name  Season  locationAcetobacter Aerococcus  Alishewanella   
 Amaricoccus
 xls-nord-01   J   w   0   0,024078979 0   0
 bxls-sud-01   J   w   0   0   0   0
 brux-nord-04  A   w   0   0   0   0
 brux-sud-04   A   w   0   0   0   0
 br-nord-07Ju  w   0   0   0   0
 br-sud-07 Ju  w   0   0   0   0
 b-nord-10 O   w   0   0   0   0
 bsud-10   O   w   0,107836089 0   0,107836089 
 0,035945363
 Z1-01 J   u   0   0   0   0,040567951
 Z3-01 J   u   0   0   0   0
 Z5-01 J   d   0,023116043 0   0   0
 Z7-01 J   d   0,014130281 0   0   0
 Z9-01 J   d   0   0   0   0
 Z10-01J   d   0   0   0   0
 Z12-01J   d   0   0   0   0
 Z1-04 A   u   0   0   0   0
 Z3-04 A   u   0   0   0   0
 Z5-04 A   d   0   0   0   0
 Z7-04 A   d   0   0   0   0
 Z9-04 A   d   0   0,013839873 0   0
 Z10-04A   d   0   0   0   0
 Z12-04A   d   0   0   0   0
 Z1-07 Ju  u   0   0   0   0
 Z3-07 Ju  u   0   0   0   0
 Z5-07 Ju  d   0   0   0   0
 Z7-07 Ju  d   0   0   0   0
 Z9-07 Ju  d   0   0   0   0
 Z10-07Ju  d   0   0   0   0
 Z12-07Ju  d   0   0,022301517 0   0
 Z1-10 O   u   0   0   0   0
 Z3-10 O   u   0   0   0   0
 Z5-10 O   d   0   0   0   0
 Z7-10 O   d   0   0   0,052924054 0
 Z9-10 O   d   0   0   0,035050824 0
 Z10-10O   d   0   0   0   0,040783034
 Z12-10O   d   0   0   0   0
 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread Berend Hasselman


On 03-05-2013, at 17:24, jpm miao miao...@gmail.com wrote:

> I have a program, when I write
>
> if(num!=NA)
>
> it yields an error message.

it?
What is unclear about the error message?

> However, if I write
>
> if(is.na(num)==FALSE)
>
> it works.
>
> Why doesn't the first statement work?

Read section 2.5, "Missing values", of the manual "An Introduction to R".

Berend

> Thanks,
>
> Miao

Re: [R] read .csv file and plot a graph

2013-05-03 Thread jim holtman

Just read in and plot the data.  The NaN will not be plotted:

> input <- read.table(text = "x
+ 1 NaN
+ 2 NaN
+ 3 0.23
+ 4 .34
+ 5 .55
+ 6 .66
+ 7 NaN
+ 8 .88", header = TRUE)
> plot(input$x)
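
If a new file without the NaN rows is also wanted, the following sketch shows one way to do it. It reuses the toy data from the reply above; the output path is a temporary file used as a placeholder, and the name clean is made up for illustration.

```r
# Same toy data as above: header "x", row number in column 1, value in column 2
input <- read.table(text = "x
1 NaN
2 NaN
3 0.23
4 .34
5 .55
6 .66
7 NaN
8 .88", header = TRUE)

# Keep only the rows whose value is a real number
clean <- input[!is.nan(input$x), , drop = FALSE]

# Write the filtered data to a new file (a temporary path, as a placeholder)
out_file <- tempfile(fileext = ".csv")
write.csv(clean, out_file)

nrow(clean)       # 5
rownames(clean)   # "3" "4" "5" "6" "8"
```

Calling plot(clean$x, type = "l") on the result would then draw a line that connects the remaining points across the removed rows.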




On Fri, May 3, 2013 at 9:49 AM, Vahe nr vne...@gmail.com wrote:

> Hi all,
>
> I have a big .csv file (21Mb with 100 rows) it has this shape:
> x
> 1 NaN
> 2 NaN
> 3 0.23
>
> and so on.
>
> So the first column has x as a header then row number, the second column
> contains values between -1,1 and NaN for empty values.
>
> What I need to do is: create a new .csv file from this one excluding
> NaN values and plot a line graph using the new .csv file.
>
> Or can I use the old .csv file to plot a graph excluding NaN values.
>
> Thanks in advance for any help or suggestions.
>
> Regards,
>  Vahe





-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


Re: [R] Declare a set (list?) of many dataframes or matrices

2013-05-03 Thread arun





Hi,
I am not sure about what you meant.
lapply(1:5,function(i) data.frame())
[[1]]
data frame with 0 columns and 0 rows

[[2]]
data frame with 0 columns and 0 rows

[[3]]
data frame with 0 columns and 0 rows

[[4]]
data frame with 0 columns and 0 rows

[[5]]
data frame with 0 columns and 0 rows

A.K.



Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread arun

> num1 <- c(0,NA,1,3)
> num1==NA
#[1] NA NA NA NA
> num1!=NA
#[1] NA NA NA NA
> is.na(num1)
#[1] FALSE  TRUE FALSE FALSE
A.K.




Re: [R] Change Selected Variables from Numeric to Factors

2013-05-03 Thread arun

Hi ST,
Try this:
set.seed(51)
df1 <- as.data.frame(matrix(sample(1:40,60,replace=TRUE),ncol=10))
df2 <- df1
check <- c("V3","V7","V9")

df1[,match(check,colnames(df1))] <- lapply(df1[,match(check,colnames(df1))],as.factor)

str(df1)
#'data.frame':    6 obs. of  10 variables:
# $ V1 : int  32 9 12 40 9 34
# $ V2 : int  31 17 39 5 21 28
# $ V3 : Factor w/ 6 levels "1","6","7","10",..: 3 5 1 6 2 4
# $ V4 : int  26 4 8 18 39 2
# $ V5 : int  39 21 4 26 6 21
# $ V6 : int  27 33 35 8 17 8
# $ V7 : Factor w/ 5 levels "4","8","9","24",..: 2 3 4 1 3 5
# $ V8 : int  4 12 12 32 13 37
# $ V9 : Factor w/ 5 levels "10","31","33",..: 1 4 2 3 5 5
# $ V10: int  13 26 20 22 14 5

#or
df2[check] <- lapply(check,function(x) as.factor(df2[[x]]))
# str(df2)
#'data.frame':    6 obs. of  10 variables:
# $ V1 : int  32 9 12 40 9 34
# $ V2 : int  31 17 39 5 21 28
# $ V3 : Factor w/ 6 levels "1","6","7","10",..: 3 5 1 6 2 4
# $ V4 : int  26 4 8 18 39 2
# $ V5 : int  39 21 4 26 6 21
# $ V6 : int  27 33 35 8 17 8
# $ V7 : Factor w/ 5 levels "4","8","9","24",..: 2 3 4 1 3 5
# $ V8 : int  4 12 12 32 13 37
# $ V9 : Factor w/ 5 levels "10","31","33",..: 1 4 2 3 5 5
# $ V10: int  13 26 20 22 14 5
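
As for why the sapply() call from the original question appears to do nothing, a likely reason is scoping: an assignment inside the anonymous function changes a local copy of the data frame, not the one in the calling environment. A minimal sketch with made-up data (the columns a, b and c are purely illustrative):

```r
df <- data.frame(a = 1:3, b = 4:6, c = 7:9)   # made-up data
check <- c("a", "c")

# The assignment inside the function modifies a local copy of df only
invisible(sapply(check, function(x) df[[x]] <- as.factor(df[[x]])))
before <- sapply(df, class)   # all still "integer"

# Assigning the converted columns back at the top level does change df
df[check] <- lapply(df[check], as.factor)
after <- sapply(df, class)    # a and c are now "factor"
```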


A.K.

I have a dataframe df with several columns. I need to change some of
these to factors. Which columns I need to change to factors is given in another
vector, check.
I am using this command
sapply(check , function(x) df[[x]] <- as.factor(df[[x]]))

But this is not working. Can someone please advise?

Thanks.
-ST


Re: [R] Problems with reading data by readWorksheetFromFile of XLConnect Package

2013-05-03 Thread jim holtman

you can also try:

temp[] <- lapply(temp, as.numeric)
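
This form applies when temp is still a data frame. The empty brackets keep the existing object, including its column names and class, and replace only the column contents. A short sketch on made-up data (not the poster's worksheet):

```r
# Made-up character data frame; "-" stands for a missing entry
temp <- data.frame(Col1 = c("647853", "38604"),
                   Col2 = c("1413", "-"),
                   stringsAsFactors = FALSE)

# temp[] keeps the existing data frame and only replaces its columns
temp[] <- suppressWarnings(lapply(temp, as.numeric))

str(temp)                # two numeric columns
temp[is.na(temp)] <- 0   # optionally turn the NA from "-" into zero
```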



Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread William Dunlap

> if(num!=NA)
> Why doesn't the first statement work?

An NA value means that the value is unknown.  E.g.,
  age <- NA
means that you do not know the age of your subject.
(The subject has an age, NA means you did not collect
that data.)  Thus you do not know the value of
  age == 6
either, the subject might be 6 or it might not be.
Hence R makes the value of age==6  NA.

Since R does not have different evaluation rules for literal values
and expressions, that means that NA==6 and NA==someAge
must evaluate to NA as well.

The second part of the question is why
   if (NA) { } else { }
causes an error.  It is a bit arbitrary, but there is a mismatch
between a 2-way 'if' statement and 3-valued logical data
and R deals with it by insisting that the condition in
   if (condition) { } else {}
be either TRUE or FALSE, not NA.
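
A short sketch of the behaviour described above. The use of tryCatch() to capture the error and of isTRUE() to collapse the three-valued result are illustrative choices added here, not something the original post prescribes.

```r
age <- NA

age == 6          # NA: the comparison is unknown too
NA & FALSE        # FALSE: the answer does not depend on the unknown value
NA | TRUE         # TRUE, for the same reason

# if() refuses an NA condition; tryCatch() turns the error into a message
msg <- tryCatch(
  if (age == 6) "six" else "not six",
  error = function(e) conditionMessage(e)
)

# isTRUE() maps the three-valued result onto TRUE/FALSE (NA counts as FALSE)
answer <- if (isTRUE(age == 6)) "six" else "not six or unknown"
```

Whether an unknown value should be treated like FALSE is a modelling decision; testing is.na(age) explicitly keeps the three cases separate.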

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread Kevin Wright

At a minimum, the first statement needs ==.

Also, is.na() gives TRUE/FALSE, while a logical comparison to NA gives NA
as a value.

Kevin




Re: [R] R does not subset

2013-05-03 Thread Daniel Nordlund

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Katarzyna Kulma
> Sent: Friday, May 03, 2013 4:21 AM
> To: David Kulp
> Cc: r-help@r-project.org
> Subject: Re: [R] R does not subset
>
> Jorge, thanks for your suggestions, but they give the same (empty) result:
>
> RECinf <- subset(REC2,  INFECTION=="Infected")
> head(RECinf)
> [1] RINGNO    year      ccFLEDGE  rec2012   binage    INFECTION all.rsLD
> <0 rows> (or 0-length row.names)
>
> but David's suggestion worked! :
>
> RECinf <- REC2[REC2$INFECTION=="Infected " ,]
> head(RECinf)
>     RINGNO  year ccFLEDGE rec2012 binage INFECTION   all.rsLD
> 2  BX23298 Y2003        6       1    juv Infected  -6.1938776
> 4  BT53646 Y2003        5       2     ad Infected  -4.1938776
> 7  BT53248 Y2003        6       1     ad Infected  -2.1938776
> 11 BY75833 Y2004        5       0     ad Infected  -4.6574803
> 13 BX23067 Y2004        6       0     ad Infected  -3.6574803
> 17 BX24240 Y2004        6       0     ad Infected   0.3425197
>
> still not sure why the subset() function didn't work, though.
>
> Thanks for your help!

Maybe it didn't work because you still didn't have a space at the end of the
value you were comparing with (apparently the factor was defined with a space).
Try the following (and notice the space at the end of "Infected "):

RECinf <- subset(REC2,  INFECTION=="Infected ")

David's suggestion worked because you did include a space there.
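
To check for this kind of problem and remove it once and for all, the factor levels can be inspected and trimmed. The sketch below uses a small made-up data frame in place of REC2; trimws() is one way to strip the stray whitespace.

```r
# Made-up data: the level "Infected " carries a trailing space
REC2 <- data.frame(RINGNO = c("BX23298", "BT53646", "BT53248"),
                   INFECTION = factor(c("Infected ", "Uninfected ", "Infected ")))

levels(REC2$INFECTION)                          # shows the stray spaces
n_exact  <- nrow(subset(REC2, INFECTION == "Infected"))    # 0: no exact match
n_spaced <- nrow(subset(REC2, INFECTION == "Infected "))   # 2

# Stripping the whitespace once avoids the problem from then on
levels(REC2$INFECTION) <- trimws(levels(REC2$INFECTION))
RECinf <- subset(REC2, INFECTION == "Infected")
nrow(RECinf)                                    # 2
```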

Dan

Daniel Nordlund
Bothell, WA USA


Re: [R] Counting number of consecutive occurrences per rows

2013-05-03 Thread zuzana zajkova

Hi,

I'm sorry that it takes me so much time to respond, finally yesterday I got
time to try your suggestions. Thank you for them!

I tried both; they give the same results, but in both there are some things
I still need to solve. I would appreciate your help.
I include a slightly bigger dataframe (test2, at the end of this email), with
more differences in the variables, to be able to explain better what I would
like to calculate in addition.

*Jim's code:*
I needed to make some changes in assigning the key. Yours worked ok for
that small test data, but when I tried it on my dataframe, which has
around 25000 rows, it didn't work properly.

test2$key[test2$act == 0] <- 1
test2$key[test2$act > 0 & test2$act < 200] <- 2
test2$key[test2$act == 200] <- 3

# this works ok
test2$resChange <- cumsum(c(1, abs(diff(test2$key))))
test2$res <- ave(test2$resChange, test2$resChange, FUN = length)
# I added new column by jul date
test2$resJ <- ave(test2$resChange, test2$resChange, test2$juln, FUN = length)
# this works fine as well, for dividing between day 0 and day 1
test2$resJD <- ave(test2$resChange, test2$resChange, test2$juln, test2$day,
                   FUN = length)
# resume by day
test2Resume_day <- test2[ , list(maxres = max(res)
                               , minres = min(res)
                               , sumres = length(unique(resChange)))
                         , keyby = c('day', 'key')]
# change 'key'
test2Resume_day$key <- c('0', '1-199', '200')[test2Resume_day$key]
test2Resume_day
   day   key maxres minres sumres
1:   0     0      2      2      3
2:   0 1-199      3      1      9
3:   0   200      6      1      7
4:   1     0      1      1      1
5:   1 1-199     10      1      7
6:   1   200      6      1      6

# resume by juln
test2Resume_jul <- test2[ , list(maxres = max(res)
                               , minres = min(res)
                               , sumres = length(unique(resChange)))
                         , keyby = c('juln', 'key')]  # by juln
# change 'key'
test2Resume_jul$key <- c('0', '1-199', '200')[test2Resume_jul$key]
test2Resume_jul
    juln   key maxres minres sumres
1: 15173     0      2      2      1
2: 15173 1-199      3      1      7
3: 15173   200      6      1      6
4: 15174     0      2      1      3
5: 15174 1-199     10      1      8
6: 15174   200      6      1      6

It is ok, but what I would like to get is a resume for juln and for the variable
day (0 and 1) as well.
Like this:

juln   day  key    maxres  minres  sumres
15173  0    0
15173  0    1-199
15173  0    200
15173  1    0
15173  1    1-199
15173  1    200
15174  0    0
15174  0    1-199
15174  0    200
15174  1    0
15174  1    1-199
15174  1    200
...

The other thing is that I would like to calculate sumres as a sum
of the values of the occurrences for each key.
For example, if in the test2 dataframe the res values for key 200 (juln 15173)
are 1, 1, 2, 2, 1, 2, the sumres should be 9 (1+1+2+2+1+2), not 6 (which I
suppose comes from the number of unique occurrences).


*Petr's code:*
This works fine also; the thing is that when doing the aggregation I would need
the intervals to be like this
[0, 1)
[1, 199]
(199, 200]
and I don't know if that is possible... I checked the help for cut, but I found
that it can be closed just right or left...

Thank you very much for your time and sharing your knowledge!

Zuzana


## here is the bigger test2 dataframe
> dput(test2)
structure(list(daten = structure(c(15173, 15173, 15173, 15173,
15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173,
15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174), class = "Date"), juln = c(15173, 15173, 15173,
15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173,
15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173, 15173,
15173, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174, 15174,
15174, 15174, 15174), fen = c("win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win", "win", "win", "win", "win", "win", "win", "win", "win",
"win"), night = structure(c(1310962792, 1310963392, 1310963992,
1310964592, 1310965192, 1310965792, 1310966392, 1310966992, 1310967592,
1310968192, 1310968792, 1310969392, 1310969992, 1310970592, 1310971192,
1310971792, 1310972392, 1310972992, 1310973592,

Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread peter dalgaard


On May 3, 2013, at 17:24 , jpm miao wrote:

> I have a program, when I write
>
> if(num!=NA)
>
> it yields an error message.
>
> However, if I write
>
> if(is.na(num)==FALSE)
>
> it works.
>
> Why doesn't the first statement work?


Because comparison with an unknown value yields an unknown result. 

By the way, comparing a logical value to FALSE is silly: 

if ( !is.na(num) ) will do it.
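
To make the point above concrete, here is a minimal sketch in R. The value given to num is an assumption for illustration only; the original post does not say what num held, and the comparison result is NA whatever num contains.

```r
num <- NA                        # assumed value, for illustration only

cmp <- num != NA                 # comparing anything with NA gives NA
print(cmp)

# if() needs a definite TRUE or FALSE, so an NA condition raises an error;
# tryCatch() captures the error message instead of stopping the script
msg <- tryCatch(
  { if (num != NA) "not missing" else "missing" },
  error = function(e) conditionMessage(e)
)
print(msg)

# is.na() always returns TRUE or FALSE, so it is safe inside if()
state <- if (!is.na(num)) "not missing" else "missing"
print(state)
```

The same reasoning applies to num == NA; only is.na() answers the question "is this value missing?".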



-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Calculating distance matrix for large dataset

2013-05-03 Thread Uwe Ligges




On 03.05.2013 15:36, David Carlson wrote:

Here's the result on R 3.0.0 64 bit under Windows 8:


A <- matrix(1:365000*144, nrow=365000, ncol=144)
dim(A)

[1] 365000    144

d <- dist(mydata_nor, method = "euclidean")

Error in as.matrix(x) : object 'mydata_nor' not found

d <- dist(A, method = "euclidean")

Error: cannot allocate vector of size 496.3 Gb
In addition: Warning messages:
1: In dist(A, method = "euclidean") :
   Reached total allocation of 8078Mb: see help(memory.size)
2: In dist(A, method = "euclidean") :
   Reached total allocation of 8078Mb: see help(memory.size)
3: In dist(A, method = "euclidean") :
   Reached total allocation of 8078Mb: see help(memory.size)
4: In dist(A, method = "euclidean") :
   Reached total allocation of 8078Mb: see help(memory.size)

Your message suggests that your system could not accurately compute the
requirements. Unless you have access to a computer with 500 gigabytes of
memory, you need to consider alternative approaches such as aggregating the
data into longer time blocks or using kmeans.



Or, to show how we can calculate it: you need to compute
365000 * (365000-1) / 2 = 66612317500 distances at 8 bytes each, hence you
need 66612317500 * 8 = 532898540000 bytes = 532898540000 / (1024)^3 GB
~= 496.3 GB to store the result in memory.
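
The arithmetic above can be checked directly in R. This is only a sketch of the estimate, assuming one 8-byte double per pairwise distance as stated; the helper name dist_gib is hypothetical and not part of any package.

```r
n      <- 365000                 # rows in the data matrix
n_dist <- n * (n - 1) / 2        # number of pairwise distances
bytes  <- n_dist * 8             # one double (8 bytes) per distance
gib    <- bytes / 1024^3         # convert bytes to units of 1024^3 bytes

print(format(n_dist, scientific = FALSE))
print(format(bytes, scientific = FALSE))
print(round(gib, 1))

# Hypothetical helper: estimate the size of a dist object before calling dist()
dist_gib <- function(n) n * (n - 1) / 2 * 8 / 1024^3
print(dist_gib(10000))
```

Running the helper on a planned number of rows first shows whether dist() has any chance of fitting in the available memory.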


Best,
Uwe Ligges






-
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77840-4352

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of HJ YAN
Sent: Thursday, May 2, 2013 6:02 PM
To: r-help@r-project.org
Subject: [R] Calculating distance matrix for large dataset

Dear R users


I wonder whether any of you has ever tried to calculate a distance matrix for
a very large data set, and whether anyone can confirm that the error message
I got actually means that my data is too large for this task:

negative length vectors are not allowed


My data size and the code used:

dim(mydata_nor)
[1] 365000    144
d <- dist(mydata_nor, method = "euclidean")



My data has 1000 samples, each with one year of data observed at 10-minute
intervals (144 observations per day), so the size is (365 * 1000) * 144.


I checked the manual for the function 'dist' but could not find an upper limit
on the size allowed, and I bet there is one, so any hints are appreciated.


I would also be grateful for advice on any other method for calculating a
distance matrix for a large dataset.



I appreciate that reproducible code should be provided, so try the code
below if needed:

A <- matrix(1:365000*144, nrow=365000, ncol=144)
dim(A)
[1] 365000    144
d1 <- dist(A, method="euclidean")
Error in dist(A, method = "euclidean") :
   negative length vectors are not allowed




Many thanks in advance!

HJ


[R] Empirica Copula

2013-05-03 Thread T P Kharel

Dear users
I am reposting this and hope it will be accepted this time.

I am using the copula package to fit my bivariate data and to simulate from
the fit. As explained in the package documentation, we can feed our own data
distribution to the copula as long as the d, p and q (pdf, cdf and quantile)
functions are available. Hence my code for those is:

# Make the functions for data distribution
dSAR <- function(SAR){dexp(SAR, rate=0.5)}
pSAR <- function(SAR){pexp(SAR, rate=0.5)}
qSAR <- function(SAR){qexp(c(seq(0,1, .01)),SAR, rate=0.5)}


dper <- function(per) {dexp(per,rate=0.5)}
pper <- function(per){pexp(per,rate=0.5)}
qper <- function(per){qexp(c(seq(0,1,.01)),per, rate=0.5)}

gmb <- gumbelCopula(3,dim=2) # create bivariate copula object with dim=2

#tau(gmb)
## construct a bivariate distribution with defined marginals
myCDF <- mvdc(gmb, margins=c("exp","exp"),
paramMargins=list(list(rate=0.5),list(rate=0.5)))

# Use own data for bivariate CDF construction
myCDF2 <- mvdc(gmb, margins=c("SAR","per"),
paramMargins=list(list(rate=.5),list(rate=.5)))

# Generate (bivariate) random numbers from that, and visualize
x <- rMvdc(1000, myCDF2)

And I get this error message every time:
x <- rMvdc(1000, myCDF2)
Error in qSAR(x, rate = 0.5) : unused argument(s) (rate = 0.5)

It works fine with myCDF and generates bivariate data:
x <- rMvdc(1000, myCDF)

But my problem is that the data simulated with myCDF do not show the same
relationship as the original data. Hence I want to use my own empirical
distribution (myCDF2) to simulate data. It looks like it is not accepting the
quantile function, qSAR. Is there any other way I can define my data
distribution and feed it to the copula? Thanks for your help.
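
The error message itself shows the call that is being built, qSAR(x, rate = 0.5): a first argument holding the values, plus every entry of paramMargins passed by name. A minimal sketch, assuming the margin really is exponential as in the post, is to give the three functions a matching signature. The argument names x, q and p are chosen here for illustration, and only base R is used so the block runs without the copula package.

```r
# Marginal functions for a margin called "SAR"; each takes the data as its
# first argument and the parameter named in paramMargins ('rate') as its second
dSAR <- function(x, rate) dexp(x, rate = rate)   # density
pSAR <- function(q, rate) pexp(q, rate = rate)   # distribution function
qSAR <- function(p, rate) qexp(p, rate = rate)   # quantile function

# The call from the error message now has a matching signature
u <- c(0.1, 0.5, 0.9)
q <- qSAR(u, rate = 0.5)
print(q)

# Round trip: the distribution function undoes the quantile function
print(pSAR(q, rate = 0.5))
```

With definitions of this shape the "unused argument" error should no longer arise, because rate is now a formal argument. A genuinely empirical margin would need function bodies built from the observed data (for example quantile() for the q function); that part is left open here.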


Re: [R] R does not subset

2013-05-03 Thread Mihai Nica

This is great! Thank you so much for taking the time to give us this hint.
mike



 From: Jeff Newmiller jdnew...@dcn.davis.ca.us
To:Mihai Nica mihain...@yahoo.com; Mihai Nica mihain...@yahoo.com; Jorge I 
Velez jorgeivanve...@gmail.com; Katarzyna Kulma katarzyna.ku...@gmail.com 
Cc: R mailing list r-help@r-project.org 
Sent: Friday, May 3, 2013 8:16 AM
Subject: Re: [R] R does not subset
 

This typically occurs because of sloppy manual data entry outside of R. To 
relieve further analysis pain, you can manually clean the data (usually only 
effective for one-time analyses) or use R to fix problems right after loading 
the data (there are multiple methods for doing this... I prefer using ?sub on 
character data before creating the factor).
---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
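
The clean-up with sub() mentioned above can be sketched as follows. The data frame is a small hypothetical stand-in for REC2, with the same trailing spaces in the INFECTION values; it is not the poster's data.

```r
# Hypothetical stand-in for REC2: note the trailing space in each label
REC2 <- data.frame(
  id        = 1:4,
  INFECTION = c("Infected ", "Uninfected ", "Infected ", "Uninfected "),
  stringsAsFactors = FALSE
)

# Without cleaning, the comparison matches nothing
n_before <- nrow(subset(REC2, INFECTION == "Infected"))
print(n_before)

# Strip trailing whitespace from the character data, then create the factor
REC2$INFECTION <- factor(sub("\\s+$", "", REC2$INFECTION))
print(levels(REC2$INFECTION))

RECinf <- subset(REC2, INFECTION == "Infected")
print(nrow(RECinf))
```

Cleaning once, right after loading, avoids having to remember the stray space in every later comparison.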

Mihai Nica mihain...@yahoo.com wrote:

Hi:

(note the space after Infected)

Since I lost a morning too with this issue, I am just curious, why is
there a space?

I know, it must be a dumb question, a reasonable programming rule, but
that's my level :-)
mike



 From: Jorge I Velez jorgeivanve...@gmail.com
To:Katarzyna Kulma katarzyna.ku...@gmail.com 
Cc: R mailing list r-help@r-project.org 
Sent: Friday, May 3, 2013 6:01 AM
Subject: Re: [R] R does not subset
 

Hi Kasia,

You need

subset(REC2, INFECTION == "Infected ")

(note the space after Infected).

HTH,
Jorge.-


On Fri, May 3, 2013 at 7:48 PM, Katarzyna Kulma
katarzyna.ku...@gmail.com wrote:

 Hi everyone,

 I know there have been several requests regarding subsetting before, but
 none of them really helps with my problem:

 I'm trying to subset only infected individuals from the REC2 data.frame:

  str(REC2)
 'data.frame':    362 obs. of  7 variables:
  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..: 78 81 67 41 58 66 17
  $ year     : Factor w/ 8 levels "Y2002","Y2003",..: 1 2 1 2 1 1 2 1 1 3 ...
  $ ccFLEDGE : int  6 6 6 5 6 7 6 7 6 5 ...
  $ rec2012  : int  2 1 2 2 1 2 1 1 1 0 ...
  $ binage   : Factor w/ 2 levels "ad","juv": 1 2 1 1 1 1 1 1 1 1 ...
  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ": 2 1 2 1 2 2 1 2 2 1 ...
  $ all.rsLD : num  -4.62 -6.19 -3.62 -4.19 -2.62 ...

 using either

 RECinf <- REC2[which(REC2$INFECTION == "Infected"), ]

 or

 RECinf <- subset(REC2, INFECTION == "Infected")

 in both cases I get an empty data frame (0 observations):

  str(RECinf)
 'data.frame':    0 obs. of  7 variables:
  $ RINGNO   : Factor w/ 370 levels "BL17546","BL17577",..:
  $ year     : Factor w/ 8 levels "Y2002","Y2003",..:
  $ ccFLEDGE : int
  $ rec2012  : int
  $ binage   : Factor w/ 2 levels "ad","juv":
  $ INFECTION: Factor w/ 2 levels "Infected ","Uninfected ":
  $ all.rsLD : num

 When subsetting, R doesn't return any warning or error message. Besides, I
 have used the same code many times before and it worked perfectly well. Any
 ideas why this case is different?

 Thanks for your help,
 Kasia


Re: [R] Create and read symbolic links in Windows

2013-05-03 Thread Santosh

Thanks for your suggestion... I upgraded to R 3.0.0 in a 64-bit Windows 7
environment.

This time, when I use file.link,
I get the following error message: 'Cannot create a file when that file
already exists'
and I don't see the link.

The other function, file.copy, correctly copies to the target location.

I am still confused by the error messages...

Thanks,
Santosh
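
For reference, the calls discussed in this thread can be sketched as below. The file names are temporary paths created only for illustration, and the fallback order (symbolic link, then hard link, then the original file) is an assumption of this sketch, not something recommended in the thread. Note that the link path is one that does not exist yet.

```r
# A small data file to link to (temporary, for illustration only)
src <- tempfile(fileext = ".csv")
write.csv(data.frame(a = 1:3), src, row.names = FALSE)

# Path for the link; tempfile() returns a name without creating the file
lnk <- tempfile(fileext = ".csv")

# Try a symbolic link first; on Windows this may fail without the right privileges
ok <- suppressWarnings(file.symlink(src, lnk))

# Fall back to a hard link if the symbolic link could not be created
if (!isTRUE(ok)) ok <- suppressWarnings(file.link(src, lnk))

# Reading through a link works the same way as reading the file itself
path <- if (isTRUE(ok)) lnk else src
dat  <- read.csv(path)
print(dat)
```

Both file.symlink() and file.link() return a logical, so checking the return value is the quickest way to see whether a link was actually made.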


On Thu, May 2, 2013 at 11:42 PM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

 On 03/05/2013 07:33, Santosh wrote:

 Thanks for the suggestions. In windows (Windows 7, 64-bit), I couldn't
 get file.symlink to work, but file.link did return the result to be
 TRUE but at the target location, I did not see any link.

 Not sure I am missing anything more.. Hope it's nothing to do with
 administrator accounts and administrator rights... Is it something I
 should check with my system administrator?


 You may need to update your R: although the posting guide asked you to do
 that before posting.  There was a relevant bug fix in 2.15.3.


 Thanks,
 Santosh


 On Thu, May 2, 2013 at 12:22 PM, Prof Brian Ripley
 rip...@stats.ox.ac.uk wrote:

 On 02/05/2013 19:50, Santosh wrote:

 Dear Rxperts..
 Got a couple of quick q's..
 I am using R in windows environment (both 32-bit and 64-bit)
 a) Is there a way to create symbolic links to some data files?


 See ?file.symlink.  ??'symbolic link' should have got you there.

 Note that this is not very useful for files, but that is a Windows
 and not an R restriction.


   b) How do I read data from symbolic links?

 The same ways you read data from files.


 Thanks so much..
 Santosh







 --
 Brian D. Ripley,                  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,             Tel:  +44 1865 272861 (self)
 1 South Parks Road,                     +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [R] Create and read symbolic links in Windows

2013-05-03 Thread Santosh

Just got it right; please ignore my previous posting...

It worked!
 Prof Ripley made my day!! :) THANK YOU!



Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread David Winsemius




 
 On May 3, 2013, at 17:24 , jpm miao wrote:
 
 I have a program, when I write
 
 if(num!=NA)
 
 snipped

On May 3, 2013, at 10:46 AM, peter dalgaard wrote:

 Because comparison with an unknown value yields an unknown result. 

Anything else would violate the Second Law of Thermodynamics. We cannot have 
comparisons reducing entropy, now can we? Uncertainty cannot run uphill.

-- 

David Winsemius
Alameda, CA, USA


[R] Fortune candidate! Re: Why can't R understand if(num!=NA)?

2013-05-03 Thread Sarah Goslee

On Fri, May 3, 2013 at 3:36 PM, David Winsemius dwinsem...@comcast.net wrote:

 On May 3, 2013, at 17:24 , jpm miao wrote:

 I have a program, when I write

 if(num!=NA)

 snipped

 On May 3, 2013, at 10:46 AM, peter dalgaard wrote:

 Because comparison with an unknown value yields an unknown result.

 Anything else would violate the Second Law of Thermodynamics. We cannot have 
 comparisons reducing entropy, now can we? Uncertainty cannot run uphill.

 --

 David Winsemius
 Alameda, CA, USA



[R] (no subject)

2013-05-03 Thread Tien trung Dinh

Hi.

After I installed R 3.0.0.pkg for Mac, when I click the R icon to
start it up, I receive a message in red telling me that something
is wrong, but I do not know how to fix it.
R version 3.0.0 (2013-04-03) -- "Masked Marvel"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-apple-darwin10.8.0 (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

During startup - Warning messages:
1: Setting LC_CTYPE failed, using C 
2: Setting LC_COLLATE failed, using C 
3: Setting LC_TIME failed, using C 
4: Setting LC_MESSAGES failed, using C 
5: Setting LC_PAPER failed, using C 
[R.app GUI 1.60 (6476) x86_64-apple-darwin10.8.0]

WARNING: You're using a non-UTF8 locale, therefore only ASCII characters will 
work.
Please read R for Mac OS X FAQ (see Help) section 9 and adjust your system 
preferences accordingly.
[History restored from /Users/dinhtientrung/.Rapp.history]

starting httpd help server ... done
 


Would you mind sharing your experience with this situation, please?

Thank you so much.

I hope to hear from you soon.

Trung 

Re: [R] Why can't R understand if(num!=NA)?

2013-05-03 Thread David Winsemius


On May 3, 2013, at 10:46 AM, peter dalgaard wrote:

 
 On May 3, 2013, at 17:24 , jpm miao wrote:
 
 I have a program, when I write
 
 if(num!=NA)
 
 it yields an error message.
 
 However, if I write
 
 if(is.na(num)==FALSE)
 
 it works.
 
 Why doesn't the first statement work?
 
 
 Because comparison with an unknown value yields an unknown result. 

Anything else would violate the Second Law of Thermodynamics. We cannot have 
comparisons reducing entropy, now can we? Uncertainty cannot run uphill.

-- 
David Winsemius
Alameda, CA, USA


Re: [R] (no subject)

2013-05-03 Thread David Winsemius


On May 3, 2013, at 9:44 AM, Tien trung Dinh wrote:

 Hi.
 
 After I installed R 3.0.0.pkg for Mac, when I click the R icon to
 start it up, I receive a message in red telling me that something
 is wrong, but I do not know how to fix it.
 R version 3.0.0 (2013-04-03) -- "Masked Marvel"
 Copyright (C) 2013 The R Foundation for Statistical Computing
 Platform: x86_64-apple-darwin10.8.0 (64-bit)
 
 R is free software and comes with ABSOLUTELY NO WARRANTY.
 You are welcome to redistribute it under certain conditions.
 Type 'license()' or 'licence()' for distribution details.
 
   Natural language support but running in an English locale
 
 R is a collaborative project with many contributors.
 Type 'contributors()' for more information and
 'citation()' on how to cite R or R packages in publications.
 
 Type 'demo()' for some demos, 'help()' for on-line help, or
 'help.start()' for an HTML browser interface to help.
 Type 'q()' to quit R.
 
 During startup - Warning messages:
 1: Setting LC_CTYPE failed, using C 
 2: Setting LC_COLLATE failed, using C 
 3: Setting LC_TIME failed, using C 
 4: Setting LC_MESSAGES failed, using C 
 5: Setting LC_PAPER failed, using C 
 [R.app GUI 1.60 (6476) x86_64-apple-darwin10.8.0]
 
 WARNING: You're using a non-UTF8 locale, therefore only ASCII characters will 
 work.
 Please read R for Mac OS X FAQ (see Help) section 9 and adjust your system 
 preferences accordingly.
 [History restored from /Users/dinhtientrung/.Rapp.history]
 
 starting httpd help server ... done
  
 
 
 Would you mind sharing your experiences in this situation for me please !

Why have you stopped at this point?  (My experiences have been quite good with 
following advice.)  You have been given a very specific warning (not an error). 
It is telling you where to find additional information. It is your 
responsibility to educate yourself further. The document referred to can be 
found by pulling down the Help menu (while running R.app)  and choosing R 
Help.
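
As a complement to reading the FAQ, the locale that R actually ended up with can be inspected from the console. This is only a diagnostic sketch; the locale name "en_US.UTF-8" is an assumption and may not exist on every system, and the change made by Sys.setlocale() lasts only for the current session.

```r
# Which character-type locale is in use? "C" matches the startup warnings
before <- Sys.getlocale("LC_CTYPE")
print(before)

# Is the current locale a UTF-8 one?
is_utf8 <- l10n_info()[["UTF-8"]]
print(is_utf8)

# Try to switch for this session only; an empty string means the request failed
res <- suppressWarnings(Sys.setlocale("LC_CTYPE", "en_US.UTF-8"))
if (identical(res, "")) {
  message("Locale 'en_US.UTF-8' is not available on this system")
} else {
  print(res)
}
```

A persistent fix belongs in the system or R.app preferences, which is what the FAQ section named in the warning covers.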

-- 

David Winsemius
Alameda, CA, USA


[R] color by group in ggplot

2013-05-03 Thread Ye Lin

Hey,

I have a dataset like this:

ID    Var1   Var2   Group
A1    1      1      BB
A2    1      2      AA
B1    2      1      CC
B2    1      3      DD
C1    1      2      EE

I would like to plot the points of Var1 and Var2, using ID as the x-axis, but
colour the points by Group. So far I can only manage to colour the points by ID
after transforming the dataset to tall format using the reshape package.

Thanks for your help!


Re: [R] selecting certain rows from data frame

2013-05-03 Thread arun



Hi,
You can use split() (see ?split):
lst1 <- split(DF, DF$ID)
lst1[1:2]
#$`1`
#  ID  drugs month
#1  1 drug x 1
#4  1 drug x 1
#5  1 drug y 2
#6  1 drug z 3
#
#$`2`
 # ID  drugs month
#2  2 drug y 2
#7  2 drug x 1

mean(sapply(lst1,nrow))
#[1] 2.4
#or
library(plyr)
 mean(ddply(DF,.(ID),nrow)[,2])
#[1] 2.4
#or
mean(with(DF,tapply(ID,ID,FUN=length)))
#[1] 2.4
A.K.
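
The same calculation in a self-contained form, for anyone without the original DF: the data frame below is hypothetical and only reproduces the row counts per ID quoted in this thread (4, 2, 3, 2 and 1).

```r
# Hypothetical data with the row counts per ID described in the thread
DF <- data.frame(ID = rep(1:5, times = c(4, 2, 3, 2, 1)))

# Rows per ID, then their mean
rows_per_id <- as.vector(table(DF$ID))
print(rows_per_id)

mean_rows <- mean(rows_per_id)
print(mean_rows)

# Equivalent shortcut: total rows divided by the number of distinct IDs
shortcut <- nrow(DF) / length(unique(DF$ID))
print(shortcut)
```

table() gives the per-ID counts in one step, which is all the mean needs.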





From: Sarah Jo Sinnott 105405...@umail.ucc.ie
To: arun smartpink...@yahoo.com 
Sent: Friday, May 3, 2013 4:35 PM
Subject: Re: selecting certain rows from data frame



Yes - but if I can count the number of rows for each ID, this equates to number 
of drugs per each ID. So that way I can get a mean #rows(drugs). 

e.g., 

ID 1 = 4 rows (approx=4drugs)
ID2= 2 rows
ID 3 = 3 rows
ID 4 = 2 rows
ID 5 = 1 row

12 rows/5people = 2.4rows/person

that is 2.4 drugs per person. 

Do you think it is possible to isolate the number of rows per unique ID? It 
would be great if you could! I've tried reorganising my data into wide format, 
but it doesn't work very well, so I'm left with this option really!

Thank you for your help thus far


Re: [R] color by group in ggplot

2013-05-03 Thread David Winsemius


On May 3, 2013, at 1:37 PM, Ye Lin wrote:

 Hey,
 
 I have a dataset like this:
 
 ID    Var1   Var2   Group
 A1    1      1      BB
 A2    1      2      AA
 B1    2      1      CC
 B2    1      3      DD
 C1    1      2      EE
 
 I would like to plot the points of Var1 and Var2, use ID as X-axis, but
 color the points by Group. I can only manage to color the points by ID
 after transform the dataset to tall using reshape package.

If I were given the task of designing a plotting system that would decide 
what to do with a categorical x-axis request, it would probably deliver a 
barplot. My guess is that you do not want that. But what do you mean by a 
point whose x-value is A1?

-- 

David Winsemius
Alameda, CA, USA


Re: [R] color by group in ggplot

2013-05-03 Thread Ye Lin

I want to plot the values of Var1 and Var2 on the same plot, with the
x-axis labelled with the list of IDs. But I want to colour the points by
their category in Group. Is it possible to do this in ggplot, or do I have to
plot from scratch using base graphics?



Re: [R] color by group in ggplot

2013-05-03 Thread arun

Hi,
Maybe this helps:

dat1 <- read.table(text="
ID    Var1  Var2    Group
A1    1     1       BB
A2    1     2       AA
B1    2     1       CC
B2    1     3       DD
C1    1     2       EE
",sep="",header=TRUE)
library(reshape2)
dat2 <- melt(dat1,id.var=c("ID","Group"))
library(ggplot2)
ggplot(dat2,aes(x=ID,y=value,group=Group,colour=Group))+geom_point()
A.K.




Re: [R] color by group in ggplot

2013-05-03 Thread Ye Lin

Thanks A.K.
I also added shape=variable so that it is much easier to tell the two
variables apart by colour and shape.
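
Put together, the variant with shape=variable might look like the sketch below. The long-format data frame is built with base R here instead of reshape2::melt() so that the block has fewer dependencies, and the plotting step runs only if ggplot2 is installed; both choices are assumptions of this sketch, not part of the thread.

```r
dat1 <- read.table(text = "
ID Var1 Var2 Group
A1 1 1 BB
A2 1 2 AA
B1 2 1 CC
B2 1 3 DD
C1 1 2 EE
", header = TRUE, stringsAsFactors = FALSE)

# Long ("tall") format: one row per ID and variable, like melt() would give
dat2 <- data.frame(
  ID       = rep(dat1$ID, times = 2),
  Group    = rep(dat1$Group, times = 2),
  variable = rep(c("Var1", "Var2"), each = nrow(dat1)),
  value    = c(dat1$Var1, dat1$Var2)
)

# Colour by Group, point shape by variable (Var1 or Var2)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(dat2, aes(x = ID, y = value, colour = Group, shape = variable)) +
    geom_point()
  # print(p) draws the plot on the active graphics device
}
```

Because each ID belongs to exactly one Group, colour identifies the group and shape distinguishes Var1 from Var2 within the same ID.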




Re: [R] Vector allocation problem while trying to plot 6 MB data file

2013-05-03 Thread Uwe Ligges




On 02.05.2013 14:37, Ramon Hofer wrote:

Hi all

I'm trying to analyse the network speed and used iperf to create a csv
file containing the link test data. It's only about 6 MB in size but
contains about 40'000 samples.

I can do boxplots (apart from printing the number of samples, but I will ask
about that separately).

To see the behaviour over time I wanted to plot the throughput, so I
have this command:

plot(A$Timestamp, A$Bandwidth.bit.sec., xlab = "Timestamp", ylab =
"Bandwidth [bit/s]", ylim = quantile(A$Bandwidth.bit.sec., c(0, .99),
na.rm = TRUE))

Unfortunately I get this:
Error: cannot allocate vector of size 12.5 Gb


40000 samples and 6 MB can't be the issue unless this is not a regular
plot and the classes of A$Timestamp or A$Bandwidth.bit.sec. are rather
special.


What do
str(A$Timestamp)
str(A$Bandwidth.bit.sec.)
tell us?

Can you make a reproducible example available?

Best,
Uwe Ligges
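
A small sketch of the diagnostics asked for above. The data frame `A` below is a hypothetical stand-in (three made-up rows, with the timestamp deliberately stored as a factor); the 14-digit timestamp format is an assumption, not something confirmed in this thread:

```r
# Hypothetical stand-in for the poster's data frame 'A'.
A <- data.frame(
  Timestamp = c("20130502143700", "20130502143701", "20130502143702"),
  Bandwidth.bit.sec. = c(9.41e8, 9.38e8, 9.40e8),
  stringsAsFactors = TRUE
)

# What the two columns really are.
str(A$Timestamp)
str(A$Bandwidth.bit.sec.)
column_classes <- sapply(A, class)

# If the timestamp turns out to be a factor or character column,
# converting it to a date-time class gives plot() a numeric x axis.
A$Timestamp <- as.POSIXct(as.character(A$Timestamp),
                          format = "%Y%m%d%H%M%S", tz = "UTC")
```

Whether this is what caused the allocation error cannot be told from the thread; the str() output on the real data would settle it.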





Is there a way around this problem or will I have to split the data?


Best
Ramon


Re: [R] Write date class as number of days from 1970

2013-05-03 Thread Uwe Ligges




On 03.05.2013 15:59, Manta wrote:

Dear all,

I have a dataset with one column being of class Date. When I write the
output, I would like that column being written as number of days from
1970-01-01. I could not find anywhere a way to do it.



as.numeric(x)

where x is the Date object.

Uwe Ligges
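
For illustration, a short sketch of the conversion applied before writing a file. The column names and the temporary output path are made up for the example:

```r
# Date objects are stored as days since 1970-01-01,
# so as.numeric() exposes that count directly.
dates <- as.Date(c("1970-01-01", "1970-01-10", "2013-05-03"))
days_since_epoch <- as.numeric(dates)

# Convert the Date column before writing the data out.
out <- data.frame(id = 1:3, day = dates)
out$day <- as.numeric(out$day)

csv_path <- tempfile(fileext = ".csv")   # illustrative output location
write.csv(out, csv_path, row.names = FALSE)
```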




Thanks,
Marco



--
View this message in context: 
http://r.789695.n4.nabble.com/Write-date-class-as-number-of-days-from-1970-tp4666155.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] R CMD building SPEEDY

2013-05-03 Thread Uwe Ligges




On 02.05.2013 05:10, ren_az wrote:

Hello every one:

I get following warning when building my R package with R-3.0.0.



building 'SPEEDY.tar.gz' Warning in utils::tar(filepath, pkgname,
compression = "gzip", compression_level = 9L, : number of items to replace
is not a multiple of replacement length thanks Michael



I have no idea for this, can you help me.


Can you show us the package?
I am not able to generate a problem like this one with R-3.0.0, i.e. 
that R tries to create a file called SPEEDY.tar.gz without version number.


Best,
Uwe Ligges




Best regard






Ren Aizhen

r...@bi.cs.titech.ac.jp

Tel:03-5734-3645, Fax:03-5734-3646


-





Re: [R] color by group in ggplot

2013-05-03 Thread David Winsemius


On May 3, 2013, at 1:57 PM, Ye Lin wrote:

 I want to plot the values of Var1 and Var2 on the same plot, with x-axis 
 labeling as the list of IDs. Sth like this:
 image.png
 
  But I want to color the points based on the category in Group, I dont know 
 how to do it with ggplot.

You didn't say what class the ID variable was, but if it were a factor ( as is 
most likely), then:

plot(  as.numeric(dfrm$ID), dfrm$Var1)
points( as.numeric(dfrm$ID), dfrm$Var2)

With whatever means of distinguishing overlapping points (pch, col, jittering)
might suit you.

-- 
David.

 Thanks!
 
 
 On Fri, May 3, 2013 at 1:49 PM, David Winsemius dwinsem...@comcast.net 
 wrote:
 
 On May 3, 2013, at 1:37 PM, Ye Lin wrote:
 
  Hey,
 
  I have a dataset like this:
 
  ID  Var1  Var2  Group
  A1  1     1     BB
  A2  1     2     AA
  B1  2     1     CC
  B2  1     3     DD
  C1  1     2     EE
 
  I would like to plot the points of Var1 and Var2, use ID as X-axis, but
  color the points by Group. I can only manage to color the points by ID
  after transform the dataset to tall using reshape package.
 
 If I were given the task of designing a plotting system that would decide 
 what to do with a categorical x-axis request, it would probably deliver a 
 barplot. My guess is that you do not want that. But what do you mean by a 
 point whose x-value is A1?
 
 --
 
 David Winsemius
 Alameda, CA, USA
 
 

David Winsemius
Alameda, CA, USA
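
A base-graphics sketch of the plot()/points() approach above, with colour taken from Group. The data frame name `dfrm` follows the reply; the values are the toy data from the question, reconstructed and only illustrative:

```r
dfrm <- data.frame(
  ID    = factor(c("A1", "A2", "B1", "B2", "C1")),
  Var1  = c(1, 1, 2, 1, 1),
  Var2  = c(1, 2, 1, 3, 2),
  Group = factor(c("BB", "AA", "CC", "DD", "EE"))
)

xpos       <- as.numeric(dfrm$ID)      # factor codes become x positions
group_cols <- as.integer(dfrm$Group)   # one palette index per Group level

# Var1 as filled circles, Var2 as filled triangles, both coloured by Group.
plot(xpos, dfrm$Var1, col = group_cols, pch = 16, xaxt = "n",
     xlab = "ID", ylab = "value", ylim = range(dfrm$Var1, dfrm$Var2))
points(xpos, dfrm$Var2, col = group_cols, pch = 17)

# Put the ID labels back on the x axis and explain the colours.
axis(1, at = xpos, labels = as.character(dfrm$ID))
legend("topleft", legend = levels(dfrm$Group),
       col = seq_len(nlevels(dfrm$Group)), pch = 15, title = "Group")
```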


[R] A problem of splitting the right screen in 3 or more independent vertical boxes:

2013-05-03 Thread Aldi Kraja


Hi,
Based on par function, I can split the screen into  two parts left and 
right.
I wish x occupies the half left screen, and all plants occupy half right 
screen, which happens right now.


But I wish the right screen, to be split in 3 or more vertical parts 
where each pair of the same type of plant, are together in its own block 
of boxplot, because each plant has its own unit of measure.
Let's say wheat is measured in ton, tomato in pound and cucumbers as 
counts. :-)


x <- rnorm(1000,mean=0,sd=1,main="Right screen")

wheat1 <- rnorm(100,mean=0,sd=1)
wheat2 <- rnorm(150,mean=0,sd=2)
tomatos3 <- rnorm(200,mean=0,sd=3)
tomatos4 <- rnorm(250,mean=0,sd=4)
cucumbers5 <- rnorm(300,mean=0,sd=5)
cucumbers6 <- rnorm(400,mean=0,sd=6)
par(mfrow=c(1,2))

hist(x, main="Left screen OK")

boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
title("Right screen: boxplot with plants")

Thank you in advance for any suggestions,

Aldi

--


Re: [R] A problem of splitting the right screen in 3 or more independent vertical boxes:

2013-05-03 Thread Aldi Kraja


Hmm,
I had a typo paste by mistake in my x vector
It has to be:

x <- rnorm(1000,mean=0,sd=1)
wheat1 <- rnorm(100,mean=0,sd=1)
wheat2 <- rnorm(150,mean=0,sd=2)
tomatos3 <- rnorm(200,mean=0,sd=3)
tomatos4 <- rnorm(250,mean=0,sd=4)
cucumbers5 <- rnorm(300,mean=0,sd=5)
cucumbers6 <- rnorm(400,mean=0,sd=6)
par(mfrow=c(1,2))

hist(x, main="Left screen OK")

boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
title("Right screen: boxplot with plants")

Thanks,

Aldi

On 5/3/2013 4:46 PM, Aldi Kraja wrote:

Hi,
Based on par function, I can split the screen into  two parts left and 
right.
I wish x occupies the half left screen, and all plants occupy half 
right screen, which happens right now.


But I wish the right screen, to be split in 3 or more vertical parts 
where each pair of the same type of plant, are together in its own 
block of boxplot, because each plant has its own unit of measure.
Let's say wheat is measured in ton, tomato in pound and cucumbers as 
counts. :-)


x <- rnorm(1000,mean=0,sd=1,main="Right screen")

wheat1 <- rnorm(100,mean=0,sd=1)
wheat2 <- rnorm(150,mean=0,sd=2)
tomatos3 <- rnorm(200,mean=0,sd=3)
tomatos4 <- rnorm(250,mean=0,sd=4)
cucumbers5 <- rnorm(300,mean=0,sd=5)
cucumbers6 <- rnorm(400,mean=0,sd=6)
par(mfrow=c(1,2))

hist(x, main="Left screen OK")

boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
title("Right screen: boxplot with plants")

Thank you in advance for any suggestions,

Aldi

--


Re: [R] A problem of splitting the right screen in 3 or more independent vertical boxes:

2013-05-03 Thread Sarah Goslee

Hi Aldi,

You might want
?layout
instead.

Sarah

On Fri, May 3, 2013 at 5:54 PM, Aldi Kraja a...@wustl.edu wrote:
 Hmm,
 I had a typo paste by mistake in my x vector
 It has to be:

 x <- rnorm(1000,mean=0,sd=1)
 wheat1 <- rnorm(100,mean=0,sd=1)
 wheat2 <- rnorm(150,mean=0,sd=2)
 tomatos3 <- rnorm(200,mean=0,sd=3)
 tomatos4 <- rnorm(250,mean=0,sd=4)
 cucumbers5 <- rnorm(300,mean=0,sd=5)
 cucumbers6 <- rnorm(400,mean=0,sd=6)
 par(mfrow=c(1,2))

 hist(x, main="Left screen OK")

 boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
 title("Right screen: boxplot with plants")

 Thanks,

 Aldi

 On 5/3/2013 4:46 PM, Aldi Kraja wrote:

 Hi,
 Based on par function, I can split the screen into  two parts left and
 right.
 I wish x occupies the half left screen, and all plants occupy half right
 screen, which happens right now.

 But I wish the right screen, to be split in 3 or more vertical parts where
 each pair of the same type of plant, are together in its own block of
 boxplot, because each plant has its own unit of measure.
 Let's say wheat is measured in ton, tomato in pound and cucumbers as
 counts. :-)

 x <- rnorm(1000,mean=0,sd=1,main="Right screen")

 wheat1 <- rnorm(100,mean=0,sd=1)
 wheat2 <- rnorm(150,mean=0,sd=2)
 tomatos3 <- rnorm(200,mean=0,sd=3)
 tomatos4 <- rnorm(250,mean=0,sd=4)
 cucumbers5 <- rnorm(300,mean=0,sd=5)
 cucumbers6 <- rnorm(400,mean=0,sd=6)
 par(mfrow=c(1,2))

 hist(x, main="Left screen OK")

 boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
 title("Right screen: boxplot with plants")

 Thank you in advance for any suggestions,

 Aldi



-- 
Sarah Goslee
http://www.functionaldiversity.org


[R] R package for bootstrapping (comparing two quadratic regression models)

2013-05-03 Thread Elaine Kuo

Hello ,

I want to compare two quadratic regression models with non-parametric
bootstrap.
However, I do not know which R package can serve the purpose,
such as boot, rms, or bootstrap, DeltaR.
Please kindly advise and thank you.

Elaine

The two quadratic regression models are

y1=a1x^2+b1x+c1

y1= observed migration distance of butterflies

y2=a2x^2+b2x+c2

y2= predicted migration distance of butterflies (based on body mass)

x= body mass of butterflies


null hypothesis: a1=a2 and b1=b2 and c1=c2

bootstrap to test if the coefficients (a, b, c) of the y1 and the y2 model
differ


Re: [R] A problem of splitting the right screen in 3 or more independent vertical boxes:

2013-05-03 Thread David Winsemius


On May 3, 2013, at 3:21 PM, Sarah Goslee wrote:

 Hi Aldi,
 
 You might want
 ?layout
 instead.
 

Indeed. In particular a matrix argument might be:

matrix(c(1,2,3, 4,4,4)


 Sarah
 
 On Fri, May 3, 2013 at 5:54 PM, Aldi Kraja a...@wustl.edu wrote:
 Hmm,
 I had a typo paste by mistake in my x vector
 It has to be:
 
 x <- rnorm(1000,mean=0,sd=1)
 wheat1 <- rnorm(100,mean=0,sd=1)
 wheat2 <- rnorm(150,mean=0,sd=2)
 tomatos3 <- rnorm(200,mean=0,sd=3)
 tomatos4 <- rnorm(250,mean=0,sd=4)
 cucumbers5 <- rnorm(300,mean=0,sd=5)
 cucumbers6 <- rnorm(400,mean=0,sd=6)
 par(mfrow=c(1,2))
 
 hist(x, main="Left screen OK")
 
 boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)

I think you will need a separate call to boxplot for each grouping. The
`boxplot` function will not be able to access the device specifications.


-- 
David.


 title("Right screen: boxplot with plants")
 
 Thanks,
 
 Aldi
 
 On 5/3/2013 4:46 PM, Aldi Kraja wrote:
 
 Hi,
 Based on par function, I can split the screen into  two parts left and
 right.
 I wish x occupies the half left screen, and all plants occupy half right
 screen, which happens right now.
 
 But I wish the right screen, to be split in 3 or more vertical parts where
 each pair of the same type of plant, are together in its own block of
 boxplot, because each plant has its own unit of measure.
 Let's say wheat is measured in ton, tomato in pound and cucumbers as
 counts. :-)
 
 x <- rnorm(1000,mean=0,sd=1,main="Right screen")
 
 wheat1 <- rnorm(100,mean=0,sd=1)
 wheat2 <- rnorm(150,mean=0,sd=2)
 tomatos3 <- rnorm(200,mean=0,sd=3)
 tomatos4 <- rnorm(250,mean=0,sd=4)
 cucumbers5 <- rnorm(300,mean=0,sd=5)
 cucumbers6 <- rnorm(400,mean=0,sd=6)
 par(mfrow=c(1,2))
 
 hist(x, main="Left screen OK")
 
 boxplot(wheat1,wheat2,tomatos3,tomatos4,cucumbers5,cucumbers6)
 title("Right screen: boxplot with plants")
 
 Thank you in advance for any suggestions,
 
 Aldi
 
 
 


David Winsemius
Alameda, CA, USA
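
One possible reading of the layout() suggestion, sketched with the variables from the question: a single row of four panels, where the histogram takes half the width and each plant pair gets its own boxplot call and its own y axis. The layout matrix, the widths, the margins and the axis labels are choices made for this sketch, not taken from the replies:

```r
set.seed(1)
x          <- rnorm(1000, mean = 0, sd = 1)
wheat1     <- rnorm(100, mean = 0, sd = 1)
wheat2     <- rnorm(150, mean = 0, sd = 2)
tomatos3   <- rnorm(200, mean = 0, sd = 3)
tomatos4   <- rnorm(250, mean = 0, sd = 4)
cucumbers5 <- rnorm(300, mean = 0, sd = 5)
cucumbers6 <- rnorm(400, mean = 0, sd = 6)

# One row, four panels. Widths 3:1:1:1 give the histogram half the device
# and each boxplot pair one sixth.
m <- matrix(1:4, nrow = 1)
n_panels <- layout(m, widths = c(3, 1, 1, 1))

# Narrow panels need smaller margins than the defaults.
op <- par(mar = c(3, 3, 2, 0.5))

hist(x, main = "Left half")
boxplot(wheat1, wheat2, names = c("w1", "w2"), main = "Wheat (tons)")
boxplot(tomatos3, tomatos4, names = c("t3", "t4"), main = "Tomato (lb)")
boxplot(cucumbers5, cucumbers6, names = c("c5", "c6"), main = "Cucumber (n)")

par(op)
layout(1)   # back to a single panel
```

Because every pair is drawn by its own boxplot() call, each panel scales its y axis independently, which suits measurements in different units.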


[R] how to parallelize 'apply' across multiple cores on a Mac

2013-05-03 Thread David Romano

Hi everyone,

I'm trying to use apply (with a call to zoo's rollapply within) on the
columns of a 1.5Kx165K matrix, and I'd like to make use of the other cores
on my machine to speed it up. (And hopefully also leave more memory free: I
find that after I create a big object like this, I have to save my
workspace and then close and reopen R to be able to recover memory tied up
by R, but maybe that's a separate issue -- if so, please let me know!)

It seems the package 'multicore' has a parallel version of 'lapply', which
I suppose I could combine with a 'do.call' (I think) to gather the elements
of the output list into a matrix, but I was wondering whether there might
be another route.

And, in case the particular way I constructed the call to 'apply' might be
the source of the problem, here is a deconstructed version of what I did to
each column, for easier parsing:
-  begin call to 'apply'

Step 1:  Identify several disjoint subsequences of fixed length, say length
three, of a column.

library(zoo)  # provides rollapply, lag.zoo, na.locf and coredata

column.values <- 1:16
desired.subseqs <- c( NA, NA, NA, 1, 1, 1, NA, 1, 1, 1, NA, NA, 1, 1, 1, NA )
# this vector is used for every column.
desired.values <- desired.subseqs * column.values

Step 2:  Find the average value of each subsequence.

desired.means <- rollapply( desired.values, 3, mean, fill = NA,
                            align = "right", na.rm = FALSE )
# put mean in highest index of subsequence and retain original vector length
desired.means
[1] NA NA NA NA NA 5 NA NA NA 9 NA NA NA NA 14 NA

Step 3:   Shift values forward by one index value, retaining original
vector length.

desired.means <- zoo( desired.means )  # in order to be able to use lag.zoo
desired.means <- lag( desired.means, k = -1, na.pad = TRUE )
desired.means
[1] NA NA NA NA NA NA 5 NA NA NA 9 NA NA NA NA 14

Step 4:   Use last-observation-carried-forward, retaining original vector
length.

desired.means <- na.locf( desired.means, na.rm = FALSE )
desired.means
[1] NA NA NA NA NA NA 5 5 5 5 9 9 9 9 9 14

Step 5:  Use next-observation-carried-backward to assign values to initial
sequence of NAs.

desired.means <- na.locf( desired.means, fromLast = TRUE )
desired.means
[1] 5 5 5 5 5 5 5 5 5 5 9 9 9 9 9 14

Step 6:  Convert back to vector (from zoo object), and subtract from column.

desired.column <- column.values - coredata( desired.means )
desired.column
[1] -4 -3 -2 -1 0 1 2 3 4 5 2 3 4 5 6 2
-  end call to 'apply' 

Thanks,
David
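
A sketch of the lapply-plus-do.call route mentioned above, using the `parallel` package that ships with R (it includes mclapply, taken over from multicore). The per-column function `center_column` is a simple stand-in for the six steps, and the tiny matrix and core count are illustrative only:

```r
library(parallel)

# Stand-in for the real per-column transformation.
center_column <- function(v) v - mean(v)

m <- matrix(as.numeric(1:20), nrow = 5)

# mclapply forks worker processes, so more than one core is only
# available on Unix-alikes (including macOS); fall back to 1 on Windows.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# One list element per column, computed in parallel ...
cols <- mclapply(seq_len(ncol(m)),
                 function(j) center_column(m[, j]),
                 mc.cores = n_cores)

# ... then bound back together into a matrix of the original shape.
result <- do.call(cbind, cols)
```

On a real machine, detectCores() from the same package reports how many cores are available. Note that forked workers share memory with the parent only until they modify it, so this does not by itself reduce peak memory use.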


[R] how to best add columns to a matrix with many columns

2013-05-03 Thread David Romano

Hi everyone,

I have a large data frame, say df1,  with 165K columns, and all but the first
four columns of df1 are numeric.   I transformed the numeric data and
obtained a matrix, call it data.m, with 165K - 4 columns, and then tried to
create a second data frame by replacing the numeric columns of df1 by
data.m.  I did this in two ways, and both ways instantly used up all the
available memory, so I was wondering whether there was a better way to do
this.

Here's what I tried:

df2 <- df1
df2[ ,5:length(df1)] <- data.m

and

df2 <- cbind( df1[1:4], data.m)

Thanks,
David


Re: [R] mean for each observation

2013-05-03 Thread arun

HI,
Not sure I understand it correctly.

dat1 <- read.table(text="
site Year doy fish Feed swim agr_1 agr_2 agr_3 rest hide
3 2012 203 1 1 0 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 2 0 0 0 0 0 1 0
3 2012 203 2 1 0 0 0 0 0 0
3 2012 203 2 1 0 0 0 0 0 0
4 2012 197 1 0 0 0 0 0 1 0
4 2012 197 1 1 0 0 0 0 0 0
4 2012 197 1 0 1 0 0 0 0 0
4 2012 197 3 0 0 0 0 0 0 1
4 2012 197 3 1 0 0 0 0 0 0
",sep="",header=TRUE)
dat2 <- reshape(dat1,direction="long",varying=7:9,sep="_")
row.names(dat2) <- 1:nrow(dat2)
head(dat2)
#  site Year doy fish Feed swim rest hide time agr id
#1    3 2012 203    1    1    0    0    0    1   0  1
#2    3 2012 203    1    0    1    0    0    1   0  2
#3    3 2012 203    1    0    1    0    0    1   0  3
#4    3 2012 203    2    0    0    1    0    1   0  4
#5    3 2012 203    2    1    0    0    0    1   0  5
#6    3 2012 203    2    1    0    0    0    1   0  6

library(plyr)
#fish, year, site
ddply(dat2,.(fish,Year,site),function(x) numcolwise(mean)(x[,c(5:8)])) 
#  fish Year site  Feed  swim  rest hide
#1    1 2012    3 0.333 0.667 0.000  0.0
#2    1 2012    4 0.333 0.333 0.333  0.0
#3    2 2012    3 0.667 0.000 0.333  0.0
#4    3 2012    4 0.500 0.000 0.000  0.5

#fish 

 ddply(dat2,.(fish),function(x) numcolwise(mean)(x[,c(5:8)]))
#  fish  Feed swim  rest hide
#1    1 0.333  0.5 0.167  0.0
#2    2 0.667  0.0 0.333  0.0
#3    3 0.500  0.0 0.000  0.5
A.K.

Hi 
I did fish behavior at different sites. 
Each fish represent a rep at each site. 
e.g for my data 
site Year doy fish Feed swim agr_1 agr_2 agr_3 rest hide
3    2012 203 1    1    0    0     0     0     0    0
3    2012 203 1    0    1    0     0     0     0    0
3    2012 203 1    0    1    0     0     0     0    0
3    2012 203 2    0    0    0     0     0     1    0
3    2012 203 2    1    0    0     0     0     0    0
3    2012 203 2    1    0    0     0     0     0    0
4    2012 197 1    0    0    0     0     0     1    0
4    2012 197 1    1    0    0     0     0     0    0
4    2012 197 1    0    1    0     0     0     0    0
4    2012 197 3    0    0    0     0     0     0    1
4    2012 197 3    1    0    0     0     0     0    0

1. I would like to combine column agr_1, agr_2 and agr_3 
2. How to calculate mean for each fish for each behavior 
Any suggestion is appreciated. 
Thanks 
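
For comparison, a base-R sketch of the two requests, under a different reading of "combine" than the reshape() call above: here the three agr columns are summed into one column `agr` (an assumption about what was wanted), and the per-fish means come from aggregate() rather than plyr:

```r
dat1 <- read.table(text = "
site Year doy fish Feed swim agr_1 agr_2 agr_3 rest hide
3 2012 203 1 1 0 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 1 0 1 0 0 0 0 0
3 2012 203 2 0 0 0 0 0 1 0
3 2012 203 2 1 0 0 0 0 0 0
3 2012 203 2 1 0 0 0 0 0 0
4 2012 197 1 0 0 0 0 0 1 0
4 2012 197 1 1 0 0 0 0 0 0
4 2012 197 1 0 1 0 0 0 0 0
4 2012 197 3 0 0 0 0 0 0 1
4 2012 197 3 1 0 0 0 0 0 0
", header = TRUE)

# 1. Collapse the three aggression columns into one (row-wise sum).
dat1$agr <- rowSums(dat1[, c("agr_1", "agr_2", "agr_3")])

# 2. Mean of each behaviour per fish, and per fish within site.
by_fish <- aggregate(cbind(Feed, swim, agr, rest, hide) ~ fish,
                     data = dat1, FUN = mean)
by_fish_site <- aggregate(cbind(Feed, swim, agr, rest, hide) ~ fish + site,
                          data = dat1, FUN = mean)
```

If fish 1 at site 3 and fish 1 at site 4 are different animals, the per-site table is the one to use.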



Re: [R] how to best add columns to a matrix with many columns

2013-05-03 Thread Jeff Newmiller

I am not seeing any good justification in your description for converting to 
matrix if you are planning to convert it back to data frame. Memory is going to 
be inefficiently-used if you do this.
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

David Romano drom...@stanford.edu wrote:

Hi everyone,

I have large data frame, say df1,  with 165K columns, and all but the
first
four columns of df1 are numeric.   I transformed the numeric data and
obtained a matrix, call it data.m, with 165K - 4 columns, and then
tried to
create a second data frame by replacing the numeric columns of df1 by
data.m.  I did this in two ways, and both ways instantly used up all
the
available memory, so I was wondering whether there was a better way to
do
this.

Here's what I tried:

df2 <- df1
df2[ ,5:length(df1)] <- data.m

and

df2 <- cbind( df1[1:4], data.m)

Thanks,
David

   [[alternative HTML version deleted]]


Re: [R] Calculating distance matrix for large dataset

2013-05-03 Thread steven mosher

 I have a version that uses bigmemory on my blog, but looks at distance on
a sphere for a 36k * 36K  matrix

 not hundreds of Gb  so I dont know if the approach will work for you


http://stevemosher.wordpress.com/2012/04/12/nick-stokes-distance-code-now-with-big-memory/


Steve

However,  I never tested it with
On May 2, 2013 9:40 PM, HJ YAN yhj...@googlemail.com wrote:

 Dear R users


 I wondered if any of you ever tried to calculate distance matrix with very
 large data set, and if anyone out there can confirm this error message I
 got actually mean that my data is too large for this task.

 negative length vectors are not allowed


 My data size and code used

 > dim(mydata_nor)
 [1] 365000    144
 > d <- dist(mydata_nor, method = "euclidean")



 Here my data has 1000 samples each has a year data observed by 10 minutes
 interval daily, so the size is  (365* 1000) * 144.


 I checked the manual of function 'dist' but can not see the upper limit
 size allowed, and I bet there should be one, so any hints is appreciated.


 I would also be grateful if any other method for calculating distance
 matrix for large dataset could be advised.



 I appreciate reproducible code should be provided for your advice, so try
 below if needed:

 > A <- matrix(1:365000*144,nrow=365000,ncol=144)
 > dim(A)
 [1] 365000    144
 > d1 <- dist(A,method="euclidean")
 Error in dist(A, method = "euclidean") :
   negative length vectors are not allowed




 Many thanks in advance!

 HJ
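
A back-of-the-envelope check of the size involved, as a worked calculation. dist() stores the lower triangle of the distance matrix, n(n-1)/2 doubles of 8 bytes each; the variable names are just for this sketch:

```r
n <- 365000                      # number of rows passed to dist()

# Number of pairwise distances in the lower triangle.
n_pairs <- n * (n - 1) / 2

# Storage needed at 8 bytes per double, in GiB.
gib <- n_pairs * 8 / 2^30

# Does the length even fit in a 32-bit signed integer?
exceeds_int <- n_pairs > .Machine$integer.max

cat(sprintf("%.0f distances, about %.0f GiB, exceeds integer range: %s\n",
            n_pairs, gib, exceeds_int))
```

The count is far beyond the largest 32-bit integer, which is consistent with a length computation overflowing into a negative number, and the result would need hundreds of GiB even if the length were representable.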


[R] R 2.15.2 Failed to load sRGB colorspace file

2013-05-03 Thread Beto .

Hello,

I built R 2.15.2 on Solaris X64,
I have an issue when trying to execute the check target to test if everything 
goes ok.
Do you have any idea what could be causing this issue?

Error message:

Examples/tools-Ex.Rout.fail

 cat("Time elapsed: ", proc.time() - get("ptime", pos = 'CheckExEnv'),"\n")
Time elapsed:  1.483 0.045 2.195 0 0
 grDevices::dev.off()
Error in grDevices::dev.off() : Failed to load sRGB colorspace file
Execution halted



Thanks for your help,
Humberto.

  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Factor deletion criteria

2013-05-03 Thread Iuri Gavronski

Hi,
I would like to know the criteria by which R removes a factor in linear
models. For example, I have a four level factor, and R creates 3 dummies to
estimate coefficients. Which level is chosen? Can I change it?
Thanks,
Iuri


Re: [R] how to best add columns to a matrix with many columns

2013-05-03 Thread David Romano

Sorry, Jeff, I misspoke:  the 'matrix' data.m is really a data frame -- I
was just thinking about it as a matrix since it's the numeric part of df1,
and didn't realize the thought had made its way into the message.   So the
memory issues are unrelated to converting between data frames and
matrices.  -David

On Fri, May 3, 2013 at 8:20 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:

 I am not seeing any good justification in your description for converting
 to matrix if you are planning to convert it back to data frame. Memory is
 going to be inefficiently-used if you do this.
 ---
 Jeff NewmillerThe .   .  Go Live...
 DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live
 Go...
   Live:   OO#.. Dead: OO#..  Playing
 Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
 /Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
 ---
 Sent from my phone. Please excuse my brevity.

 David Romano drom...@stanford.edu wrote:

 Hi everyone,
 
 I have large data frame, say df1,  with 165K columns, and all but the
 first
 four columns of df1 are numeric.   I transformed the numeric data and
 obtained a matrix, call it data.m, with 165K - 4 columns, and then
 tried to
 create a second data frame by replacing the numeric columns of df1 by
 data.m.  I did this in two ways, and both ways instantly used up all
 the
 available memory, so I was wondering whether there was a better way to
 do
 this.
 
 Here's what I tried:
 
 df2 <- df1
 df2[ ,5:length(df1)] <- data.m
 
 and
 
 df2 <- cbind( df1[1:4], data.m)
 
 Thanks,
 David
 

Re: [R] Factor deletion criteria

2013-05-03 Thread David Winsemius


On May 3, 2013, at 3:32 PM, Iuri Gavronski wrote:

 Hi,
 I would like to know the criteria by which R removes a factor in linear
 models. For example, I have a four level factor, and R creates 3 dummies to
 estimate coefficients. Which level is chosen? Can I change it?

The default order is alphabetical. Lowest lexical sorted item is the reference 
level.

Changing levels is possible:

?levels
?factor

-- 
David Winsemius
Alameda, CA, USA
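
A short sketch of both points, with a made-up four-level factor `g` and random response `y`. With the default treatment contrasts, the first level is the one without a dummy; relevel() or an explicit levels= argument changes which level that is:

```r
set.seed(42)
d <- data.frame(
  g = factor(rep(c("a", "b", "c", "d"), each = 5)),
  y = rnorm(20)
)

# Default: levels sort alphabetically, so "a" is the reference level.
coef_default <- names(coef(lm(y ~ g, data = d)))

# Make "c" the reference level instead.
d$g2 <- relevel(d$g, ref = "c")
coef_releveled <- names(coef(lm(y ~ g2, data = d)))

# Or spell out the full level order; the first one becomes the reference.
d$g3 <- factor(d$g, levels = c("d", "c", "b", "a"))
```

The fitted model is the same in each case; only the parameterisation, and therefore the meaning of each coefficient, changes.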


Re: [R] read .csv file and plot a graph

2013-05-03 Thread Jim Lemon


On 05/03/2013 11:49 PM, Vahe nr wrote:

Hi all,

I have a big .csv file (21Mb with 100 rows) it has this shape:
x
1 NaN
2 NaN
3 0.23

and so on.

So the first column has x as a header then row number, the second column
contains values between -1,1 and NaN for empty values.

What should I need to do is: create a new .csv file from this one excluding
NaN values and plot a line graph using the new .csv file.

Or can I use the old .csv file to plot a graph excluding NaN values.


Hi Vahe,
If you want to plot the line ignoring the NaN values, rather than having 
the line break at each NaN, use this:


vndat <- data.frame(1:10,
 x=c(-1,-0.6,-0.4,NaN,-0.2,0.2,0.4,NaN,0.6,0.8))
plot(vndat$x[complete.cases(vndat$x)],type="l")

Jim (the other one)
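
For the other half of the question (a new .csv without the NaN rows), a sketch along the same lines. The column name `row` and the temporary file path are made up for the example:

```r
vndat <- data.frame(
  row = 1:10,
  x   = c(-1, -0.6, -0.4, NaN, -0.2, 0.2, 0.4, NaN, 0.6, 0.8)
)

# Keep only the rows whose value is not NaN.
clean <- vndat[!is.nan(vndat$x), ]

# Write the cleaned data to a new file and read it back.
out_path <- tempfile(fileext = ".csv")   # illustrative output location
write.csv(clean, out_path, row.names = FALSE)
reread <- read.csv(out_path)

# Plotting against the original row number keeps the gaps on the x axis
# while the line still runs straight through them.
plot(reread$row, reread$x, type = "l", xlab = "row", ylab = "x")
```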
