Re: [R] R looks for a folder not specified

2011-01-17 Thread Jim Lemon

On 01/17/2011 01:31 PM, l.chhay wrote:


Dear R community,

I have been getting this warning message after running a function sourced
from an R script, and can't work out why R is looking for a folder that
was never specified. It appends \NA to the specified directory, although
nothing in the call to assess_rev asks for that. The same R code has been
tested by another user, and it works fine for them.

I have tried saving the files in a new folder and onto a different drive,
but that doesn't seem to fix the problem. I have also checked that I have
the appropriate access permissions.

Has anyone encountered this problem where R starts looking for this
non-existent \NA folder?


source("S:\\research\\boxcox\\Series_to_experiment_with\\Revisions_analysis\\assess_rev3.R")
assess_rev(data.dir="S:\\research\\boxcox\\Series_to_experiment_with\\Revisions_analysis\\NEPAL_JUL81_B1\\lam0\\",rev.lag=13,series="NEPAL_JUL81_B1",comp="adj")

Error in file(file, "r") : cannot open the connection
In addition: Warning message:
In file(file, "r") :
   cannot open file
'S:\research\boxcox\Series_to_experiment_with\Revisions_analysis\NEPAL_JUL81_B1\lam0\NA':
No such file or directory


Hi Leanne,
I thought it was the line break in the directory string, but the line 
wasn't broken when I started to answer. My next guess is that you don't 
need the trailing \\ on the directory string.


Jim
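
A minimal sketch of one way a trailing NA can end up in a path. The body of
assess_rev is not shown in the thread, so everything here is an assumption:
the directory is a made-up stand-in, and the idea that a file name lookup
inside the function comes back empty is only a guess at the mechanism.

```r
# Hypothetical directory standing in for the S:\ path in the post.
data.dir <- "C:\\some\\dir\\"

# Assumption: inside the function a file name is looked up, and the lookup
# returns nothing (for example, list.files() found no match).
files <- character(0)
fname <- files[1]          # indexing an empty vector gives NA

# paste() silently turns NA into the string "NA".
full.path <- paste(data.dir, fname, sep = "")
print(full.path)

# A guard like this makes the real cause visible before file() is called.
path_is_usable <- function(fname) !is.na(fname) && nzchar(fname)
print(path_is_usable(fname))
```

If something like this is what happens, the path in the warning would end in
NA exactly as reported, and the thing to check is where the file name comes
from rather than the directory string.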

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Truetype and Opentype font in pdf device

2011-01-17 Thread Prof Brian Ripley

On Sat, 15 Jan 2011, Kohske Takahashi wrote:


Dear all,

I want to know if truetype or opentype fonts are available in pdf
device (i.e., pdf() or dev.copy2pdf()), and if so, how to do it?


They are not in general available in PDF, the language.  The 
cairo-based devices embed individual glyph information from such fonts 
(perhaps as vectors and perhaps as bitmaps).



At present I can do the following:

1. convert ttf to afm using ttf2afm, e.g.: $ ttf2afm Impact.ttf > Impact.afm
2. put the afm file in $R_HOME/library/grDevices/afm
3. register a new type1 font: pdfFonts(Impact=Type1Font("Impact",
rep("Impact.afm", 4), encoding = "TeXtext.enc"))
4. specify the fontfamily in gpar: grid.text('hello grid world',
gp=gpar(fontfamily="Impact"))

but obviously it would be better if truetype or opentype fonts were
directly available without conversion to a type1 font.


But you would still need to make the font available to your PDF 
viewer/printer, and in general that does need conversion (to a font 
type supported in PDF or to bitmaps/vectors).



Also, I found that the Cairo package can handle truetype or opentype fonts.
However, the package does not seem to support fontfamily, hence I cannot
use it through gpar.


Have you not considered the built-in and fully featured cairo_pdf 
device?  (Not Windows, but you didn't tell us your OS.)
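
A minimal sketch of the cairo_pdf route, assuming R was built with cairo
support and that a font family called "Impact" is installed on the system
(if it is not, cairo will substitute another font). The output file name is
just a temporary file chosen for illustration.

```r
# Temporary output file, chosen only for this illustration.
out <- tempfile(fileext = ".pdf")

# cairo_pdf() is only usable when R was built with cairo support.
if (capabilities("cairo")) {
  # 'family' names a system font; "Impact" is the font from the question.
  cairo_pdf(out, width = 5, height = 4, family = "Impact")
  plot(1:10, main = "hello cairo world")
  dev.off()
}
```

No afm conversion is involved here, because the device looks the font up
through the system's font machinery instead of R's Type 1 font database.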



Does anyone know about this topic?

Thank you in advance.

--
Kohske Takahashi takahashi.koh...@gmail.com

Research Center for Advanced Science and Technology,
The University of  Tokyo, Japan.
http://www.fennel.rcast.u-tokyo.ac.jp/profilee_ktakahashi.html




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] Finding NAs in DF

2011-01-17 Thread Johannes Graumann
Hi,

What is an efficient way to take this DF

data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))

and get 
c(NA,"TWO","BOTH","ONE")

as the result, where NA corresponds to a row without NAs, TWO indicates NA 
in the second and ONE in the first column.

Thanks for any pointers.

Joh



Re: [R] Finding NAs in DF

2011-01-17 Thread Ivan Calandra

Hi,

I hope you made a mistake in c(NA,"TWO","BOTH","ONE") because if not, I 
have no idea what you're looking for...


But would that do?
df <- data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
apply(df,1, FUN=function(x) length(x[is.na(x)]))
[1] 0 1 2 1

There might be better ways to do it, but it works.
HTH,
Ivan

--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



Re: [R] median by geometric mean

2011-01-17 Thread S Ellison
Will this do?

x <- runif(20, 1, 100)

exp( median( log( x) ) ) 

S Ellison
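
A short check of why the one-liner above works, as a sketch: for an even
number of positive values, median() averages the two middle logs, and
exponentiating that average gives the geometric mean of the two middle
values. The helper name geo_median and the four test values are made up for
illustration.

```r
# Hypothetical helper wrapping the one-liner; requires strictly positive x.
geo_median <- function(x) exp(median(log(x)))

# Illustrative values: sorted they are 1, 2, 8, 32, so the middle pair is 2 and 8.
x <- c(2, 8, 1, 32)

gm <- geo_median(x)
print(gm)                          # geometric mean of 2 and 8
print(all.equal(gm, sqrt(2 * 8)))  # TRUE

# With an odd number of points the result is simply the middle value.
print(all.equal(geo_median(c(1, 5, 9)), 5))
```

Because log is monotone, the ordering of the values is unchanged, so the
same two middle points are picked as in the ordinary median.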



Skull Crossbones (witch.of.agne...@gmail.com), 15/01/2011 16:26, wrote:
Hi All,

I need to calculate the median for an even number of data points. However,
instead of calculating the arithmetic mean of the two middle values, I need
to calculate their geometric mean.

I can code this in R, possibly in a few lines, but I am wondering whether
there is already a built-in function.

Can somebody give a hint?

Thanks in advance



[R] PANEL DATA SIMULATION

2011-01-17 Thread Carlos Brás
Dear R community, and especially Giovanni Millo,

For my master's thesis I need to simulate panel data with the fixed
effects correlated with the predictor, so I ran the following code:

set.seed(1970)

### Panel data simulation with alphai correlated with xi ###

n <- 5
t <- 4
nt <- n*t

pData <- data.frame(id = rep(paste("JohnDoe", 1:n, sep = "_"), each = t),
                    time = rep(1981:1984, n))

rho <- 0.95
alphai <- rnorm(n,mean=0,sd=1)  # alphai simulation
x <- as.matrix(rnorm(nt,1))  # xi simulation
akro <- kronecker(alphai, matrix(1,t,1))  # kronecker of alphai
cormat <- matrix(c(1,rho,rho,1),nrow=2,ncol=2)  # correlation matrix
cormat.chold <- chol(cormat)  # choleski transformation of correlation matrix
akrox <- cbind(akro,x)
ax <- akrox %*% cormat.chold
ai <- as.matrix(ax[,1])
pData$alphai <- as.vector(ai)
xcorr <- as.matrix(ax[,2:(1+ncol(x))])
pData$xcorrei <- as.vector(xcorr)
pData$yi <- 5 + pData$alphai + 5*pData$xcorrei + rnorm(nt)

### panel data frame ###

library(plm)
pData <- pdata.frame(pData, c("id", "time"))
pData

 

I think the panel is correctly generated, but my doubt is about the
simulation of the correlated variables:

alphai <- rnorm(n,mean=0,sd=1)  # alphai simulation
x <- as.matrix(rnorm(nt,1))  # xi simulation
akro <- kronecker(alphai, matrix(1,t,1))  # kronecker of alphai
cormat <- matrix(c(1,rho,rho,1),nrow=2,ncol=2)  # correlation matrix
cormat.chold <- chol(cormat)  # choleski transformation of correlation matrix
akrox <- cbind(akro,x)
ax <- akrox %*% cormat.chold
ai <- as.matrix(ax[,1])
pData$alphai <- as.vector(ai)
xcorr <- as.matrix(ax[,2:(1+ncol(x))])

Is this method correct, or is there a better way to do this?

I must generate a variable xi correlated with alphai for various
values of rho, for example rho = (0, 0.5, 0.6, 0.8, 0.95, 0.99).

How do I simulate the xi associated with each value of rho and put them
all in the data frame at once? I have tried various ways without success.

Please give your opinion and suggestions to improve my simulation. Thank
you,

best regards

   Carlos Brás
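
One possible way to get a column per value of rho, as a sketch only: it
reuses the Cholesky construction from the post inside a loop and stores each
result under a made-up column name of the form x_rho_<value>. The plm step is
left out so the example runs with base R alone.

```r
set.seed(1970)
n <- 5
t <- 4
nt <- n * t

pData <- data.frame(id = rep(paste("JohnDoe", 1:n, sep = "_"), each = t),
                    time = rep(1981:1984, n))

alphai <- rnorm(n)
akro <- rep(alphai, each = t)   # same values as kronecker(alphai, matrix(1, t, 1))
x <- rnorm(nt, 1)
pData$alphai <- akro

rhos <- c(0, 0.5, 0.6, 0.8, 0.95, 0.99)
for (rho in rhos) {
  R <- chol(matrix(c(1, rho, rho, 1), nrow = 2))
  ax <- cbind(akro, x) %*% R
  # Second column is rho * akro + sqrt(1 - rho^2) * x; the first is akro itself.
  pData[[paste("x_rho", rho, sep = "_")]] <- ax[, 2]
}

print(names(pData))
```

The Cholesky step reproduces rho exactly only when the two input columns are
uncorrelated with unit variance. With n = 5 individuals the sample
correlation between alphai and each new column can therefore differ
noticeably from the target.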





Re: [R] Finding NAs in DF

2011-01-17 Thread Patrick Burns

Simpler would be:

rowSums(is.na(df))


--
Patrick Burns
pbu...@pburns.seanet.com
twitter: @portfolioprobe
http://www.portfolioprobe.com/blog
http://www.burns-stat.com
(home of 'Some hints for the R beginner'
and 'The R Inferno')



[R] Queries about the statistic methodology used by R-Intergration on the data with normal distribution

2011-01-17 Thread Cheng, Daniel WC [IAD]

Dear Sir,

Our bank purchased a product called Inforsense some years ago, and one 
department uses it to conduct data analysis with the Bestfit 
function, called General R. As we understand that General-R is 
the R-Intergration plugin developed by R-Project, we would like to ask 
some questions about a difficulty we have encountered.


When re-performing the statistics, we found that even when the data are 
considered normally distributed by General-R, the confidence interval 
it reports differs greatly from the one calculated in an EXCEL 
spreadsheet*. As we are not very familiar with statistics and are not 
sure how many kinds of statistical models there are for normally 
distributed data, we would like to ask whether R uses a different 
methodology from EXCEL to determine the confidence interval. Thank you 
very much.



*P.S. We simply use the Excel functions (Mean, Stdev and 
Confidence) to calculate the confidence interval for the data.
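
One possible source of a difference, sketched under assumptions: Excel's
CONFIDENCE function is taken here to return a half-width based on the normal
quantile, whereas R's t.test() uses the t distribution. What General-R
computes internally is not known from this thread, and the data values below
are arbitrary illustration numbers, not the bank's data.

```r
# Arbitrary illustrative sample.
x <- c(101, 98, 105, 97, 103, 99, 102, 100)

n <- length(x)
m <- mean(x)
s <- sd(x)          # sample standard deviation, like Excel's STDEV
alpha <- 0.05

# Normal-based half-width (assumed to match Excel's CONFIDENCE(alpha, s, n)).
half.z <- qnorm(1 - alpha / 2) * s / sqrt(n)

# t-based half-width, which is what t.test() reports.
half.t <- qt(1 - alpha / 2, df = n - 1) * s / sqrt(n)
ci.t <- t.test(x, conf.level = 1 - alpha)$conf.int

print(c(normal = half.z, t = half.t))
print(ci.t)
```

For small samples the t-based interval is wider than the normal-based one.
The two converge as n grows, so a large discrepancy on a big data set would
point to some other difference in method.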




--
Thanks & Regards,
Daniel Cheng

Internal Audit Department
Wing Lung Bank Limited
Direct Line: 2710 4101
Fax : 2783 7292
Email :danielch...@winglungbank.com




[R] What's wrong with Omegahat?

2011-01-17 Thread johannes rara
The Omegahat project page seems to be down. Has the registration of the
omegahat domain expired?

http://www.omegahat.org/

Where can I find the RCurl package?

-J



Re: [R] Selecting the first occurrence of a value after an occurrence of a different value

2011-01-17 Thread Petr Savicky
On Sun, Jan 16, 2011 at 11:09:58PM -0800, surreyj wrote:
 
 Hello, 
 
 Back again, 
 
 I thought the problem was solved but I realised that the only reason I was
 getting the correct answer was because my data set happened to only have two
 rfts to choose from, so it looked correct.
 
 I have been using:
 
 onlyfirstresponseafterrft <- which(!diff(as.numeric(factor(Stat, levels =
 c("MagDwn", "Resp")
[...]
 
 to get my results and what is being delivered is the rows at which a Resp
 occurs after a MagDwn, except I only want the first Resp after a
 MagDwn... This seems simple to figure out and I have tried a lot of things but
 it's just not happening!

Is it required to select positions with Resp, which are the
end-points of subsequences of the form MagDwn other^* Resp ?
For the sequence

  1  MagDwn
  2  other 
  3  MagDwn
  4  Resp  
  5  other 
  6  Resp  
  7  MagDwn
  8  other 
  9  Resp  

this would be positions 4 and 9.

The positions of Resp in these end-points may be computed,
for example

  Vals <- c("MagDwn", "Resp", "other")
  Stat <- Vals[c(1, 3, 1, 2, 3, 2, 1, 3, 2)]
  ind <- which(Stat %in% c("MagDwn", "Resp"))
  Reduced <- Stat[ind]
  ind[which(diff(Reduced == "Resp") == 1) + 1]
  # [1] 4 9

The positions of the corresponding MagDwn are

  ind[which(diff(Reduced == "Resp") == 1)]
  # [1] 3 7

Petr Savicky.



Re: [R] Selecting the first occurrence of a value after an occurrence of a different value

2011-01-17 Thread surreyj

Hi Freddy, 

I have a long column of event codes e.g. below (in multiple files) that I am
trying to analyse.  

OutMag
FirstResp
InMag
MagUp
OutMag
MagDwn
Resp
Resp
Resp
InMag
MagUp
OutMag
InMag
OutMag
InMag
OutMag
InMag
OutMag
InMag
MagDwn
OutMag
Resp
MagUp
InMag
MagDwn
OutMag
Resp
MagUp

With these files I have been using which() to select the appropriate event
codes so I can analyse the timing between events using the appropriate
time column.

This has worked well so far, but now I need to select the first Resp that
occurs after each MagDwn.  Sometimes there will be just one Resp after a
MagDwn and sometimes there will be many, e.g. up to 500.

This code onlyfirstresponseafterrft <- which(!diff(as.numeric(factor(Stat,
levels = c("MagDwn", "Resp")
allowed me to select the first Resp (or so I thought), but in the file I
tried it on there were only two occurrences of Resp between each MagDwn.
Now that I have tried it on files with more Resp, it is actually
selecting all but the last one that happens before the MagDwn, but I only
need the first one.

Hope that makes sense, thanks for your help. 
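
A sketch applying the approach from Petr Savicky's reply in this thread to
the event codes listed above. The wrapper name first_resp_after and its
argument names are made up for illustration; the logic is his: keep only the
MagDwn and Resp rows, then pick each Resp that directly follows a MagDwn in
that reduced sequence.

```r
# The 28 event codes from the message above, in order.
Stat <- c("OutMag", "FirstResp", "InMag", "MagUp", "OutMag", "MagDwn",
          "Resp", "Resp", "Resp", "InMag", "MagUp", "OutMag", "InMag",
          "OutMag", "InMag", "OutMag", "InMag", "OutMag", "InMag", "MagDwn",
          "OutMag", "Resp", "MagUp", "InMag", "MagDwn", "OutMag", "Resp",
          "MagUp")

# Hypothetical wrapper: row numbers of the first 'target' after each 'start'.
first_resp_after <- function(Stat, start = "MagDwn", target = "Resp") {
  ind <- which(Stat %in% c(start, target))   # rows holding either code
  reduced <- Stat[ind]
  # A FALSE -> TRUE step in (reduced == target) marks a target right after a start.
  ind[which(diff(reduced == target) == 1) + 1]
}

first.resp <- first_resp_after(Stat)
print(first.resp)
# [1]  7 22 27
```

Note that "FirstResp" is a different code from "Resp" and is ignored by the
exact match in %in%.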
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Selecting-the-first-occurrence-of-a-value-after-an-occurrence-of-a-different-value-tp3217340p3220852.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] question about svm(e1071)

2011-01-17 Thread mutohrn
Dear Prof. Ligges,

Thank you for the reply.
Does the order of calculation change when the samples are shuffled?
Does that happen because of Sequential Minimal Optimization (SMO)?

I noticed that when I set scale=FALSE, the SVs were identical.
However, the differences between the coefs are sometimes relatively large.

Best,

Hiro


### Script start ###

set.seed(50)
s <- sample(ncol(data))

m   <- svm(x=t(data), y=factor(data.cl   ), scale=FALSE, 
type="C-classification", kernel="linear")
m.s <- svm(x=t(data[,s]), y=factor(data.cl[s]), scale=FALSE, 
type="C-classification", kernel="linear")

> sum(abs(m$SV[order(rownames(m$SV)),] - m.s$SV[order(rownames(m.s$SV)),]))
[1] 0
> sum(abs(m$coefs[order(rownames(m$SV))] -m.s$coefs[order(rownames(m.s$SV))]))
[1] 0.3227749

### Script end ###

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] 
Sent: Saturday, January 15, 2011 3:10 AM
To: 武藤裕紀(創薬資源研究部1GBIOINF)
Cc: r-help@r-project.org
Subject: Re: [R] question about svm(e1071)

Looking at your results suggests that differences are probably based on 
expected minor numerical inaccuracies and the possibly alternating sign 
of the support vectors.

Best,
Uwe Ligges



On 13.01.2011 01:28, muto...@chugai-pharm.co.jp wrote:
 Dear all,

 I ran an svm calculation using the e1071 library on a microarray data set 
 (http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt).
 Then I shuffled the data samples and ran the svm calculation again.
 The results of the two calculations were different (in SV, coefs and weights).

 I have attached the script below. Could you please tell me why this happens?
 If possible, please tell me how to make them equal.

 Best regards,

 Hiro

 ### Script start ###

 library(e1071)
 data <- 
 read.table('http://www.iu.a.u-tokyo.ac.jp/~kadota/R/data_Singh_RMA_3274.txt', 
 header=TRUE, row.names=1, sep="\t", quote="")

 data.cl <- rep(NA,ncol(data))
 data.cl[grep('Normal',colnames(data))] <- 'Normal'
 data.cl[grep('Tumour',colnames(data))] <- 'Tumour'

 s <- sample(ncol(data))

 m <- svm(x=t(data), y=factor(data.cl   ), scale=TRUE, 
 type="C-classification", kernel="linear")
 m.s <- svm(x=t(data[,s]), y=factor(data.cl[s]), scale=TRUE, 
 type="C-classification", kernel="linear")

 w <- t(m  $coefs) %*% m$SV
 w.s <- t(m.s$coefs) %*% m.s$SV

 # SV and coefs are slightly different
 sum(abs(m$SV[order(rownames(m$SV)),] - m.s$SV[order(rownames(m.s$SV)),]))
 sum(abs(m$coefs[order(rownames(m$SV))] -m.s$coefs[order(rownames(m.s$SV))]))

 # ranks of the weights are not identical
 all(rank(w)==rank(w.s))

 ### Script end ###




[R] Help for R plot

2011-01-17 Thread Fabrice Tourre
Hi all,
How can I draw coordinate axes like those in my attachment? I want to trim
the axes and produce a plot like the figure in the attachment. Does anyone
have such an example?
Thanks.
attachment: Screen shot 2011-01-17 at 11.22.20 AM.png


Re: [R] What's wrong with Omegahat?

2011-01-17 Thread Vaness De Wit
Hi,

strange indeed, try this url:

http://cran.r-project.org/web/packages/RCurl/index.html

kind regards,

Vanessa



Re: [R] xyplot: modify axis tick marks

2011-01-17 Thread Kang Min
Thanks for all your great suggestions! I've learnt a lot about
graphics now..

Kang Min

On Jan 17, 1:46 am, Hugo Mildenberger hugo.mildenber...@web.de
wrote:
 Using lattice and the rainfall$Time series as proposed below by Dennis
 gives also a nice result:

 rainfall$Time <- seq(from = as.Date('1993-01-01'),
                      to   = as.Date('2007-12-01'), by = 'month')
 xyplot(rainfall~Time,data=rainfall,type=c("g","p","l","smooth"))
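
A sketch of one way to show only years on the x-axis, which was the original
question: once Time is a Date, explicit tick positions and labels can be
passed through the scales argument. Only the first 24 monthly values from the
posted data are used to keep the example short, and the object names rain,
yrs and p are made up for illustration.

```r
library(lattice)

# First 24 monthly values (Jan 1993 to Dec 1994) from the data posted in this thread.
rain <- c(176.4, 69.2, 250.5, 283.9, 129.9, 115.5, 240, 106.8, 61.7, 175.5,
          250.8, 308.5, 56.9, 133.5, 288.2, 154, 169.6, 184.7, 53.8, 45.1,
          23.7, 84.7, 322.2, 425.4)

rainfall <- data.frame(
  Time = seq(as.Date("1993-01-01"), by = "month", length.out = length(rain)),
  rainfall = rain
)

# One tick per year, labelled with the year only.
yrs <- seq(as.Date("1993-01-01"), as.Date("1995-01-01"), by = "year")

p <- xyplot(rainfall ~ Time, data = rainfall, type = c("g", "l"),
            scales = list(x = list(at = yrs, labels = format(yrs, "%Y"))))

# print(p) renders the plot on the active graphics device.
```

For the full 1993 to 2007 series the same idea applies with yrs extended to
cover the whole range.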

 On Sunday 16 January 2011 17:33:18 Dennis Murphy wrote:



  Hi:

  Try this, since your data have no missing months:

  rainfall$Time <- seq(from = as.Date('1993-01-01'), to =
  as.Date('2007-12-01'), by = 'month')
  g <- ggplot(rainfall, aes(x = Time, y = rainfall))
  g + geom_path()

  HTH,
  Dennis

  On Sun, Jan 16, 2011 at 5:01 AM, Kang Min ngokang...@gmail.com wrote:

   Hi,

   I would like to plot time against rainfall data (data is at the end)
   using xyplot.

   The basic code looks like this: xyplot(rainfall~time, type="a")
   When I do this, the graph looks ok except that the x-axis has too many
   values. I would just like to display the years and not the months on
   the x-axis. I've been fiddling around with 'scales', and read previous
   posts about axis problems but I still can't find the solution.

   Thanks in advance.

   Kang Min

   time    rainfall
   Jan1993 176.4
   Feb1993 69.2
   Mar1993 250.5
   Apr1993 283.9
   May1993 129.9
   Jun1993 115.5
   Jul1993 240
   Aug1993 106.8
   Sep1993 61.7
   Oct1993 175.5
   Nov1993 250.8
   Dec1993 308.5
   Jan1994 56.9
   Feb1994 133.5
   Mar1994 288.2
   Apr1994 154
   May1994 169.6
   Jun1994 184.7
   Jul1994 53.8
   Aug1994 45.1
   Sep1994 23.7
   Oct1994 84.7
   Nov1994 322.2
   Dec1994 425.4
   Jan1995 349.4
   Feb1995 334
   Mar1995 67.7
   Apr1995 242.3
   May1995 84.4
   Jun1995 63.7
   Jul1995 173.6
   Aug1995 211.6
   Sep1995 29.5
   Oct1995 101.1
   Nov1995 372.8
   Dec1995 302.5
   Jan1996 173.2
   Feb1996 180.2
   Mar1996 129.7
   Apr1996 178.2
   May1996 107.5
   Jun1996 265.8
   Jul1996 162.3
   Aug1996 258.4
   Sep1996 297
   Oct1996 300
   Nov1996 180.2
   Dec1996 185.5
   Jan1997 15.4
   Feb1997 105.4
   Mar1997 34.3
   Apr1997 118.4
   May1997 41.6
   Jun1997 78.9
   Jul1997 18.6
   Aug1997 86.6
   Sep1997 31.1
   Oct1997 78.4
   Nov1997 158.3
   Dec1997 351.9
   Jan1998 268.8
   Feb1998 32.5
   Mar1998 58.8
   Apr1998 187.7
   May1998 370.8
   Jun1998 198.8
   Jul1998 259.2
   Aug1998 195
   Sep1998 258.2
   Oct1998 222.7
   Nov1998 107.2
   Dec1998 463.4
   Jan1999 193.9
   Feb1999 67.4
   Mar1999 181.4
   Apr1999 88.5
   May1999 157.1
   Jun1999 103.4
   Jul1999 225.4
   Aug1999 204
   Sep1999 125.9
   Oct1999 205
   Nov1999 241.5
   Dec1999 340.5
   Jan2000 275.2
   Feb2000 237.8
   Mar2000 238.3
   Apr2000 311.6
   May2000 96.8
   Jun2000 157.5
   Jul2000 116.1
   Aug2000 113.5
   Sep2000 81.1
   Oct2000 120.9
   Nov2000 385.7
   Dec2000 236
   Jan2001 425.8
   Feb2001 86.6
   Mar2001 297.3
   Apr2001 203.3
   May2001 164.9
   Jun2001 137.1
   Jul2001 111.3
   Aug2001 158.3
   Sep2001 162
   Oct2001 252.2
   Nov2001 175.3
   Dec2001 609
   Jan2002 221.2
   Feb2002 50.8
   Mar2002 55.6
   Apr2002 116.5
   May2002 236.6
   Jun2002 83.1
   Jul2002 233.7
   Aug2002 54.2
   Sep2002 124.2
   Oct2002 10.8
   Nov2002 307.2
   Dec2002 255
   Jan2003 444.2
   Feb2003 172.9
   Mar2003 154.6
   Apr2003 159.9
   May2003 81.8
   Jun2003 50.3
   Jul2003 170.4
   Aug2003 193.6
   Sep2003 205.3
   Oct2003 351.4
   Nov2003 133.8
   Dec2003 273
   Jan2004 600.9
   Feb2004 31.9
   Mar2004 269.4
   Apr2004 57.1
   May2004 137.6
   Jun2004 127.2
   Jul2004 166.6
   Aug2004 185.2
   Sep2004 128.9
   Oct2004 125.6
   Nov2004 166.2
   Dec2004 139.8
   Jan2005 163.2
   Feb2005 8.4
   Mar2005 82.4
   Apr2005 81.7
   May2005 331.1
   Jun2005 82.3
   Jul2005 104
   Aug2005 58.5
   Sep2005 175.7
   Oct2005 314.5
   Nov2005 362.9
   Dec2005 166
   Jan2006 454.4
   Feb2006 115.5
   Mar2006 83.1
   Apr2006 239.8
   May2006 205.7
   Jun2006 236.8
   Jul2006 153.8
   Aug2006 127.3
   Sep2006 83.3
   Oct2006 102
   Nov2006 185.6
   Dec2006 765.9
   Jan2007 450.1
   Feb2007 105.5
   Mar2007 269.1
   Apr2007 240.2
   May2007 127.2
   Jun2007 139
   Jul2007 141.7
   Aug2007 190.7
   Sep2007 149
   Oct2007 237.2
   Nov2007 367.9
   Dec2007 468.6


[R] cannot allocate vector of size ... in RHLE5 PAE kernel

2011-01-17 Thread Mauricio Zambrano
Dear R community,

I'm running R 32 bits in a 64-bits machine (with 16Gb of Ram) using a
PAE kernel, as you can see here:

$ uname -a
Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010
i686 i686 i386 GNU/Linux


When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940,
ncol=9000) ), I get the following error:


 Error: cannot allocate vector of size 238.3 Mb


However, the amount of free memory in my machine seems to be much
larger than this:

system("free")
             total       used       free     shared    buffers     cached
Mem:      12466236    6354116    6112120          0      67596    2107556
-/+ buffers/cache:    4178964    8287272
Swap:     12582904          0   12582904


I tried to increase the memory limit available for R by using:

$ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M


but it didn't work.


Any hint about how I can get R to use all the memory available in the machine?


Thanks in advance,

Mauricio

-- 
===
Linux user #454569 -- Ubuntu user #17469



Re: [R] Finding NAs in DF

2011-01-17 Thread Johannes Graumann
Neither version does what I am looking for, as they do not differentiate 
where the NA is if there is just one.
My originally requested result therefore holds, but should probably be 
rewritten as
c(NA,"B","AB","A")

Joh
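
A minimal sketch that produces the rewritten target directly: for each row,
paste together the names of the columns that hold an NA, and turn the empty
string (no NA in the row) into NA. The variable name na.cols is made up for
illustration.

```r
df <- data.frame(A = c(1, 2, NA, NA), B = c(1, NA, NA, 4))

# For every row, collapse the names of the columns where is.na() is TRUE.
na.cols <- apply(is.na(df), 1, function(r) paste(names(df)[r], collapse = ""))

# Rows without any NA give "", which is mapped to NA as requested.
na.cols[na.cols == ""] <- NA

print(unname(na.cols))
# [1] NA   "B"  "AB" "A"
```

This generalises to more than two columns without listing the label
combinations by hand.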



Re: [R] Finding NAs in DF

2011-01-17 Thread Henrique Dallazuanna
Try this:

factor(sapply(apply(is.na(df), 1, which), sum), labels = c(NA, "ONE", "TWO",
"BOTH"))

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] cannot allocate vector of size ... in RHLE5 PAE kernel

2011-01-17 Thread Martin Maechler
 MZ == Mauricio Zambrano hzambran.newsgro...@gmail.com
 on Mon, 17 Jan 2011 11:46:44 +0100 writes:

MZ Dear R community,
MZ I'm running R 32 bits in a 64-bits machine (with 16Gb of Ram) using a
MZ PAE kernel, as you can see here:

MZ $ uname -a
MZ Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010
MZ i686 i686 i386 GNU/Linux


MZ When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940,
MZ ncol=9000) ), I got the following error:


 Error: cannot allocate vector of size 238.3 Mb


MZ However, the amount of free memory in my machine seems to be much
MZ larger than this:

MZ system("free")
MZ              total       used       free     shared    buffers     cached
MZ Mem:      12466236    6354116    6112120          0      67596    2107556
MZ -/+ buffers/cache:    4178964    8287272
MZ Swap:     12582904          0   12582904


MZ I tried to increase the memory limit available for R by using:

MZ $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M


MZ but it didn't work.


MZ Any hint about how can I get R using all the memory available in the 
machine ?

Install a 64-bit version of Linux, i.e., ubuntu in your case
and work from there.
I don't think there's a way around that.
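
As a rough check of where the 238.3 Mb in the error message comes from, here is a small sketch. It assumes that R stores each element of a logical matrix (which is what matrix(NA, ...) creates) in 4 bytes and reports sizes in units of 1024^2 bytes:

```r
n.cells <- 6940 * 9000          # number of elements in Q.obs
bytes   <- n.cells * 4          # assumed 4 bytes per logical element
size.mb <- bytes / 1024^2       # size in the units R uses in the message
round(size.mb, 1)               # 238.3
```

So the request is for well under 1 Gb. The failure therefore points at the address space available to a 32-bit R process (which a PAE kernel does not enlarge per process), not at the amount of free RAM shown by free.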

Martin



[R] PANEL DATA SIMULATION(sorry for my previous email with no subject)

2011-01-17 Thread carpan



Dear R community, and especially Giovanni Millo,

For my master's thesis I need to simulate panel data in which the fixed
effects are correlated with the predictor, so I ran the following code:


set.seed(1970)
### Panel data simulation with alphai correlated with xi ###

n <- 5
t <- 4
nt <- n*t
pData <- data.frame(id = rep(paste("JohnDoe", 1:n, sep = "_"), each = t),
                    time = rep(1981:1984, n))


rho <- 0.95
alphai <- rnorm(n,mean=0,sd=1)#alphai simulation
x <- as.matrix(rnorm(nt,1))#xi simulation
akro <- kronecker(alphai ,matrix(1,t,1))#kronecker of alphai
cormat <- matrix(c(1,rho,rho,1),nrow=2,ncol=2)#correlation matrix
cormat.chold <- chol(cormat)#Cholesky factor of the correlation matrix
akrox <- cbind(akro,x)
ax <- akrox%*%cormat.chold
ai <- as.matrix(ax[,1])
pData$alphai <- as.vector(ai)
xcorr <- as.matrix(ax[,2:(1+ncol(x))])
pData$xcorrei <- as.vector(xcorr)
pData$yi <- 5 + pData$alphai + 5* pData$xcorrei + rnorm(nt)
## panel data frame ##
library(plm)
pData <- pdata.frame(pData, c("id", "time"))
pData

I think the panel is generated correctly, but I have doubts about the
simulation of the correlated variables:


alphai <- rnorm(n,mean=0,sd=1)#alphai simulation
x <- as.matrix(rnorm(nt,1))#xi simulation
akro <- kronecker(alphai ,matrix(1,t,1))#kronecker of alphai
cormat <- matrix(c(1,rho,rho,1),nrow=2,ncol=2)#correlation matrix
cormat.chold <- chol(cormat)#Cholesky factor of the correlation matrix
akrox <- cbind(akro,x)
ax <- akrox%*%cormat.chold
ai <- as.matrix(ax[,1])
pData$alphai <- as.vector(ai)
xcorr <- as.matrix(ax[,2:(1+ncol(x))])

Is this method correct, or is there a better way to do this?

I must generate a variable xi correlated with alphai for various
values of rho:

For example rho=(0,0.5,0.6,0.8,0.95,0.99)
How do I simulate the xi associated with each value of rho and put them
all in the data frame at once?

I have tried various ways without success.
Please give your opinion and suggestions to improve my simulation. Thank you,
best regards


   Carlos Brás



Re: [R] R looks for a folder not specified

2011-01-17 Thread Duncan Murdoch

On 16/01/2011 9:31 PM, l.chhay wrote:


Dear R community,

I have been getting this warning message after running a function sourced
from an R script, and can't seem to work out why R is looking for a folder
that wasn't even specified (it attaches a \NA to the specified directory,
where assess_rev has not asked to do so at all. R code has been tested by
another user and that works fine).

I have tried saving the files in a new folder and onto a different drive,
but that doesn't seem to fix the problem.. I have also checked to make sure
that I've the appropriate access options.

Has anyone encountered this problem where R starts looking for this
non-existent \NA folder?


source("S:\\research\\boxcox\\Series_to_experiment_with\\Revisions_analysis\\assess_rev3.R")
assess_rev(data.dir="S:\\research\\boxcox\\Series_to_experiment_with\\Revisions_analysis\\NEPAL_JUL81_B1\\lam0\\",rev.lag=13,series="NEPAL_JUL81_B1",comp="adj")

Error in file(file, "r") : cannot open the connection
In addition: Warning message:
In file(file, "r") :
   cannot open file
'S:\research\boxcox\Series_to_experiment_with\Revisions_analysis\NEPAL_JUL81_B1\lam0\NA':
No such file or directory

(I am working from a Windows7 machine, using R 2.9.2.)


It looks as though something in the assess_rev function (which was 
presumably defined in that script you sourced) constructs a path from 
the data.dir argument and makes use of a missing value.  The missing 
value is converted to NA and is appended to the directory, and you see 
the error message.
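
A minimal sketch of that mechanism follows. The directory and the way the file name goes missing are assumptions made for illustration; the real cause lies inside assess_rev, which is not shown in this thread:

```r
data.dir <- "C:\\data\\lam0\\"      # hypothetical directory with trailing separator
fname <- character(0)[1]            # e.g. first element of an empty file listing: NA
full.path <- paste(data.dir, fname, sep = "")
full.path                           # the missing value has become the text "NA"

# A guard of this kind inside the function would make the problem visible earlier.
if (is.na(fname)) message("no input file name found; check how it is constructed")
```

Searching the sourced script for the places where a file name is looked up or indexed (and may come back empty) would be the natural next step.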


Duncan Murdoch



[R] intercept point coordinates

2011-01-17 Thread Tonja Krueger
Hi List,
Can someone help me to calculate the coordinates of the red and green points? 
In this
example I found their approximate location by trying, but as I have to analyse
many similar curves, I'd rather calculate the exact location.

data <-
c(0.008248005, 0.061242387, 0.099095516, 0.189943027, 0.227796157, 0.258078661,
0.280790538, 0.303502416, 0.386779301, 0.454914934, 0.545762445, 0.591186201,
0.682033712, 0.757739971, 0.825875604, 0.848587482, 0.803163726, 0.833446230,
0.878869985, 0.871299359, 0.878869985, 0.947005619, 1.0, 0.992429374,
0.954576245, 0.894011237, 0.765310597, 0.621468704, 0.492768064, 0.333784920,
0.258078661, 0.174801775, 0.099095516, 0.008248005)

plot(data, type="l")
abline(h=0.9)

points(21.35,.9, pch=20, col="red")
points(26,.9, pch=20, col="green")
 

Thank you, 
Tonja


[R] t-test calculation correct?

2011-01-17 Thread Sascha Vieweg
A multinomial logit model (N=192) revealed (besides others) the 
following statistics for the outcome, y, and one predictor, x:


- y = A (baseline, n=34)
- y = B (n=26), B(x)=0.7323 (SE=0.2384)
- y = C (n=132), B(x)=0.6535 (SE=0.2041)

With a t-test I want to explore whether the two predictors differ 
significantly, and I use the following calculation (according to 
Bortz, 2005, p.140):


##
dm <- 0.7323 - 0.6535
se.dm <- sqrt( (0.2384 / (34 + 26)) + (0.2041 / (34 + 132)) )
t.val <- dm / se.dm
pval <- (1 - pt(t.val, df=(34+26+132) )) * 2
##

My question is where this calculation is wrong and why.

Ref.: Bortz, J. (2005). Statistik für Human- und
Sozialwissenschaftler (6. Aufl). Berlin: Springer.



--
Sascha Vieweg, saschav...@gmail.com



[R] transform a df with a condition

2011-01-17 Thread Vijayan Padmanabhan

Hi 
Try the following...

df <- data.frame(A = c(1,1,3,2,2,3,3),
 B = c(2,1,1,2,7,8,7),
 K = c("a.1", "d.2", "f.3",
 "a.1", "k.4", "f.9", "f.5"))
df$ID <- rownames(df)
df$K <- as.character(as.character(df$K))

changefunction <- function(z)
{
tmp <- lapply(split(z, z[,4]),
 function(x) within(x, if(A==3) B <- 5))
dat2 <- tmp
df <- unsplit(dat2, df$ID)
tmp <- lapply(split(df, df[,4]),
 function(x) within(x, if(A==3) K <- chartr("f","m",K)))
dat2 <- tmp
df <- unsplit(dat2, df$ID)
return(df)
}
dfnew <- changefunction(df)
df$ID <- NULL
dfnew$ID <- NULL
dfnew



Regards
Vijayan Padmanabhan


What is expressed without proof can be denied without proof - Euclide. 




Re: [R] Finding NAs in DF

2011-01-17 Thread Ivan Calandra

Maybe something along the lines:
apply(df,1, FUN=function(x) which(is.na(x)))

It's not exactly what you want, but it might work combined with the 
other solutions


HTH,
Ivan

Le 1/17/2011 12:23, Johannes Graumann a écrit :

Neither version does what I am looking for, as they do not differentiate
where the NA is, if there is just one.
My originally wished-for result therefore holds, but should probably be
rewritten
c(NA,"B","AB","A")

Joh

On Monday 17 January 2011 14:06:30 Patrick Burns wrote:

Simpler would be:

rowSums(is.na(df))

On 17/01/2011 10:13, Ivan Calandra wrote:

Hi,

I hope you made a mistake in c(NA,TWO,BOTH,ONE) because if not, I
have no idea what you're looking for...

But would that do?
df <- data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))
apply(df,1, FUN=function(x) length(x[is.na(x)]))
[1] 0 1 2 1

There might be better ways to do it, but it works
HTH,
Ivan

Le 1/17/2011 11:01, Johannes Graumann a écrit :

Hi,

What is an efficient way to take this DF

data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))

and get
c(NA,TWO,BOTH,ONE)

as the result, where NA corresponds to a row without NAs, TWO
indicates NA
in the second and ONE in the first column.

Thanks for any pointers.

Joh



--
Ivan CALANDRA
PhD Student
University of Hamburg
Biozentrum Grindel und Zoologisches Museum
Abt. Säugetiere
Martin-Luther-King-Platz 3
D-20146 Hamburg, GERMANY
+49(0)40 42838 6231
ivan.calan...@uni-hamburg.de

**
http://www.for771.uni-bonn.de
http://webapp5.rrz.uni-hamburg.de/mammals/eng/1525_8_1.php



[R] help on strange installation process of JRI / rJava 0.9.0

2011-01-17 Thread Luedde, Mirko
Hi Nidhi,

 ... On the other hand, I also found that JRI.jar is missing from
 both of these (piodev...) installations. ...

this is resolved now in the /mnt/tools/r installation.

Background: When compiling R, one needs to provide an option
--enable-R-shlib in order that R is capable of dynamically linking
libraries.  Now the desired JRI package (Java calls into R) is part of
the rJava package.  However, only if R was compiled with the above
option does installation of the rJava package include also the JRI
package.  Technically this autodetect feature, as they call it,
might make sense, but from the user's and admin's perspective a more
visible warning from the installation of the rJava package certainly
is desirable, if JRI is not going to be provided.

Hence I recompiled and reinstalled R-2-12-1 and rJava 0.9.0
accordingly.

Best, Mirko




Re: [R] Finding NAs in DF

2011-01-17 Thread jim holtman
building on the previous responses, does this give you what you want:

> x
   A  B
1  1  1
2  2 NA
3 NA NA
4 NA  4
> # determine where the NAs are
> row.na <- apply(x, 1, is.na)
> # now convert to list of columns with NAs
> apply(row.na, 2, function(a) paste(colnames(x)[a], collapse = ','))
[1] ""    "B"   "A,B" "A"
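
To get from there to the form c(NA,"B","AB","A") asked for later in the thread, one possible sketch is shown below. It assumes that rows without any NA should be coded as NA and that column names are simply pasted together without a separator; the name na.cols is made up here:

```r
x <- data.frame(A = c(1, 2, NA, NA), B = c(1, NA, NA, 4))

# For each row, paste together the names of the columns that are NA.
na.cols <- apply(is.na(x), 1, function(a) paste(colnames(x)[a], collapse = ""))

# Rows with no missing value give "", which is recoded as NA.
na.cols[na.cols == ""] <- NA
na.cols
```

Calling is.na() on the whole data frame first avoids the transposed intermediate matrix, but the idea is the same as above.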




On Mon, Jan 17, 2011 at 5:01 AM, Johannes Graumann
johannes_graum...@web.de wrote:
 Hi,

 What is an efficient way to take this DF

        data.frame(A=c(1,2,NA,NA),B=c(1,NA,NA,4))

 and get
        c(NA,TWO,BOTH,ONE)

 as the result, where NA corresponds to a row without NAs, TWO indicates NA
 in the second and ONE in the first column.

 Thanks for any pointers.

 Joh





-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?



Re: [R] Using summaryBy with weighted data

2011-01-17 Thread David Freedman

You might use the plyr package to get group-wise weighted means

library(plyr)
ddply(mydata,~group,summarise, b=mean(weights),
c=weighted.mean(response,weights))
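
If plyr is not available, the same group-wise summaries can be sketched in base R. This swaps in split() and sapply() for ddply(); the data frame below is made up for illustration, and only the column names group, response and weights follow the call above:

```r
mydata <- data.frame(group    = c("a", "a", "b", "b"),
                     response = c(1, 3, 10, 20),
                     weights  = c(1, 3, 1, 1))

by.group <- split(mydata, mydata$group)      # one data frame per group
res <- data.frame(
  group = names(by.group),
  b = sapply(by.group, function(d) mean(d$weights)),
  c = sapply(by.group, function(d) weighted.mean(d$response, d$weights))
)
res
```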

hth
david freedman

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Using-summaryBy-with-weighted-data-tp3220761p3221212.html


Re: [R] data frame column name change

2011-01-17 Thread Pete Brecknock

or 

d = data.frame(Col1=c(1,2,3),Col2=c(2,3,4),Col3=c(3,4,5))
names(d)

names(d)[1] = "NewName1"
names(d)

HTH

Pete
-- 
View this message in context: 
http://r.789695.n4.nabble.com/data-frame-column-name-change-tp3220684p3221214.html


Re: [R] Survfit: why different survival curves but same parameter estimates?

2011-01-17 Thread Terry Therneau

begin included message 
I'm trying to estimate a Cox proportional hazard model with time-varying
covariates using coxph. The parameter estimates are fine but there is
something wrong with the survival curves I get with survfit (results are
not plausible).

-- end inclusion 

This sounds wrong to me also.  Could you give more information so that I
can verify the issue?  (See the posting guide).
  What version of R and of the survival package?
  Is it possible to send me a copy of the example that fails?

Terry Therneau



Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread Pete Brecknock

If I have understood your question correctly, how about the following ...

m = matrix(c(7,11,15,17,10,19,4,18,18), nrow = 3, ncol=3)

sum_m = sum(m)

new_m = summ-m

HTH

Pete
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221215.html


Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread Pete Brecknock

typo ...

should have been

m = matrix(c(7,11,15,17,10,19,4,18,18), nrow = 3, ncol=3)

sum_m = sum(m)

new_m = sum_m-m
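
A quick check of this against the matrix from the original question, as a sketch: the scalar sum(m) is recycled over every cell, so no loop is needed.

```r
m <- matrix(c(7, 17, 4, 11, 10, 18, 15, 19, 18), nrow = 3, byrow = TRUE)
new_m <- sum(m) - m     # sum(m) is 119; each cell becomes 119 minus its own value
new_m
```

The result has 112 in position [1,1] and 101 in position [3,3], matching the matrix the poster asked for.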
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221216.html


Re: [R] Help for R plot

2011-01-17 Thread Peter Ehlers

On 2011-01-17 02:26, Fabrice Tourre wrote:

Hi all,
How to plot as the coordinate  as in my attachment? I want to trim the
coordinate  and one of plot as the figure in attachment. Does any one
have such example?
Thanks.


Maybe you're looking for something like axis.break
or gap.plot in the plotrix package?

Peter Ehlers



Re: [R] intercept point coordinates

2011-01-17 Thread Peter Ehlers

On 2011-01-17 04:14, Tonja Krueger wrote:

Hi List,
Can someone help me to calculate the coordinates of the red and green points? 
In this
example I found their approximate location by trying, but as I have to analyse
many similar curves, I'd rather calculate the exact location.

data <-
c(0.008248005, 0.061242387, 0.099095516, 0.189943027, 0.227796157, 0.258078661,
0.280790538, 0.303502416, 0.386779301, 0.454914934, 0.545762445, 0.591186201,
0.682033712, 0.757739971, 0.825875604, 0.848587482, 0.803163726, 0.833446230,
0.878869985, 0.871299359, 0.878869985, 0.947005619, 1.0, 0.992429374,
0.954576245, 0.894011237, 0.765310597, 0.621468704, 0.492768064, 0.333784920,
0.258078661, 0.174801775, 0.099095516, 0.008248005)

plot(data, type="l")
abline(h=0.9)

points(21.35,.9, pch=20, col="red")
points(26,.9, pch=20, col="green")


Thank you,
Tonja


You can do this with graphics, using the locator() function:

 plot(data, type="l")
 abline(h=0.9)
 v <- locator(2)

Now click on the intersection points and then extract v$x.
It's best to blow up the relevant region of the plot
(perhaps for each point separately) with appropriate
xlim/ylim settings and to maximize your plot window.

Alternatively, you can identify that the points lie
in data[21:22] and in data[25:26]. Then use the
approx() function:

 approx(data[21:22], 21:22, xout = 0.9)
 approx(data[25:26], 25:26, xout = 0.9)

Your points were close - just a bit high: approx gives
21.31012 and 25.90112.
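
Since many similar curves have to be analysed, the interpolation can also be automated. The following is only a sketch: it assumes the curve is piecewise linear between the plotted points, and the helper name crossings() is hypothetical, not a function from any package.

```r
# Hypothetical helper: x positions at which a piecewise-linear curve y
# (plotted against 1, 2, ..., length(y)) crosses a horizontal level.
crossings <- function(y, level) {
  d <- y - level
  i <- which(d[-length(d)] * d[-1] < 0)    # segments whose ends lie on opposite sides
  i + (level - y[i]) / (y[i + 1] - y[i])   # linear interpolation within each segment
}

data <- c(0.008248005, 0.061242387, 0.099095516, 0.189943027, 0.227796157,
          0.258078661, 0.280790538, 0.303502416, 0.386779301, 0.454914934,
          0.545762445, 0.591186201, 0.682033712, 0.757739971, 0.825875604,
          0.848587482, 0.803163726, 0.833446230, 0.878869985, 0.871299359,
          0.878869985, 0.947005619, 1.0, 0.992429374, 0.954576245, 0.894011237,
          0.765310597, 0.621468704, 0.492768064, 0.333784920, 0.258078661,
          0.174801775, 0.099095516, 0.008248005)

crossings(data, 0.9)
```

Points that lie exactly on the level are ignored by the strict inequality, so such a case would need separate handling.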

Peter Ehlers



Re: [R] effects packages for mixed model?

2011-01-17 Thread John Fox
Dear John,

I've wanted to extend the effects package to mixed-effects models for some
time now. The basics are quite simple and you should be able to do the
computations yourself using the estimated fixed effects and their covariance
matrix. 

The tricky computations are for models that have data-dependent bases, such
as those including regression spline or orthogonal polynomial terms. In the
limited time I've had to look at the problem, I haven't figured out how to
get so-called safe predictions for mixed models. Simply using predict()
isn't sufficient, since the effect() function has to manipulate the model
matrix directly.
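
To illustrate the basic computation only, here is a sketch using an ordinary linear model on the built-in warpbreaks data. The assumption is that for a mixed model the same arithmetic would be applied to the estimated fixed effects and their covariance matrix (for example what fixef() and vcov() return for an lme4 fit); the choice of model and of averaging weights is made up for illustration:

```r
fit  <- lm(breaks ~ wool + tension, data = warpbreaks)
beta <- coef(fit)      # estimated coefficients
V    <- vcov(fit)      # their covariance matrix

# One row per tension level (L, M, H), with wool held at its
# balanced proportion 0.5 instead of at a particular level.
X0 <- cbind("(Intercept)" = 1, woolB = 0.5,
            tensionM = c(0, 1, 0), tensionH = c(0, 0, 1))

est <- drop(X0 %*% beta)                  # adjusted means
se  <- sqrt(diag(X0 %*% V %*% t(X0)))     # their standard errors
cbind(estimate = est, se = se)
```

This hand-built model matrix is exactly the step that becomes awkward with data-dependent bases such as splines or orthogonal polynomials, as described above.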

Regards,
 John


John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of array chip
 Sent: January-17-11 1:08 AM
 To: r-help@r-project.org
 Subject: [R] effects packages for mixed model?
 
 Hi, I am wondering if there is a similar effects package for mixed
 models, just like what effects package does for linear, generalized
 linear models?
 Specifically I am looking for a way to calculate SAS's so-called least
 squares means (LS means) in mixed models (I understand there is a
 substantial debate on whether such adjusted means should be computed in
 the first place).
 
 Thank you,
 
 John
 
 
 


[R] Problem about for loop

2011-01-17 Thread ufuk beyaztas

Hi everyone, my function looks like this:

e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
x0 <- c(rep(1,50))
x1 <- rnorm(n=50,mean=2,sd=1)
x2 <- rnorm(n=50,mean=2,sd=1)
x3 <- rnorm(n=50,mean=2,sd=1)
x4 <- rnorm(n=50,mean=2,sd=1)
y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
x2[1] = 10 #influential observation
y[1] = 10  #influential observation

data.x <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
data.y <- matrix(y,ncol=1)
data.k <- cbind(data.x,data.y)
dataX <- data.k[,1:5]
dataY <- data.k[,6]

theta <- function(data) {
B.cap <- solve(crossprod(dataX)) %*% crossprod(dataX,dataY)
P <- dataX %*% solve(crossprod(dataX)) %*% t(dataX)
Y.cap <- P %*% dataY
e <- dataY - Y.cap
dX <- nrow(dataX) - ncol(dataX)
var.cap <- crossprod(e) / (dX)
ei <- as.vector(dataY - dataX %*% B.cap)
pi <- diag(P)
var.cap.i <- (((dX) * var.cap) / (dX - 1)) -
(ei^2 / ((dX-1) * (1 - pi)))
ti <- ei / sqrt(var.cap * (1 - pi))
Ci <- (ti^2 / (ncol(dataX))) * (pi / (1 - pi))
output.i <- mean(Ci)}


result <- list()

for ( i in 1:5){

data <- replicate(1, data.k[sample(50,50,replace=T),], simplify = FALSE)

output.j <- theta(data)

result <- c(result,(list(output.j))) }
table <- do.call(rbind.data.frame,result)
names(table)=c("cooks")
table

This function gives the same result each time: the data change on every
iteration, but the mean(Ci) values are always the same.
Does anyone have an idea how to fix this? Thanks for any ideas.

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Problem-about-for-loop-tp3221210p3221210.html


Re: [R] cannot allocate vector of size ... in RHLE5 PAE kernel

2011-01-17 Thread Mauricio Zambrano
Thanks for your answer Martin, but -unfortunately- the decision about
installing a 32 bits OS in the 64 bits machine, was taken by the IT
guys of my work and not by me.

By the way, due to strong limitations about software installation in
my work place, this problem didn't happen in Ubuntu, but in Red Hat
Enterprise 5. At home I have Ubuntu 10.10 32 bits, but I can not run
the code I need in that machine.


Cheers,

Mauricio

-- 
===
Linux user #454569 -- Ubuntu user #17469
===


2011/1/17 Martin Maechler maech...@stat.math.ethz.ch:
 MZ == Mauricio Zambrano hzambran.newsgro...@gmail.com
     on Mon, 17 Jan 2011 11:46:44 +0100 writes:

    MZ Dear R community,
    MZ I'm running R 32 bits in a 64-bits machine (with 16Gb of Ram) using a
    MZ PAE kernel, as you can see here:

    MZ $ uname -a
    MZ Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010
    MZ i686 i686 i386 GNU/Linux


    MZ When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940,
    MZ ncol=9000) ), I got the following error:


     Error: cannot allocate vector of size 238.3 Mb


    MZ However, the amount of free memory in my machine seems to be much
    MZ larger than this:

    MZ system("free")
    MZ              total       used       free     shared    buffers     cached
    MZ Mem:      12466236    6354116    6112120          0      67596    2107556
    MZ -/+ buffers/cache:    4178964    8287272
    MZ Swap:     12582904          0   12582904


    MZ I tried to increase the memory limit available for R by using:

    MZ $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M


    MZ but it didn't work.


    MZ Any hint about how can I get R using all the memory available in the 
 machine ?

 Install a 64-bit version of Linux, i.e., ubuntu in your case
 and work from there.
 I don't think there's a way around that.

 Martin




[R] How to doulbe all the value on a matrix

2011-01-17 Thread ADias

Hi,

Is there an expression to double the values of a matrix - without using a
loop?
 
What I need is this:

Suppose we have this matrix

> m
     [,1] [,2] [,3]
[1,]    7   17    4
[2,]   11   10   18
[3,]   15   19   18

and I want this matrix

 [,1] [,2] [,3]
[1,]  112  102  115
[2,]  108  109  101
[3,]  104  100  101

where for instance, m[1,1] was obtained by adding
(7+17+4+11+10+18+15+19+18)-7

With this loop I am able to get the result I need, but I wanted to know if
there is a more R-like way of doing this.

> a<-matrix(c(7,17,4,11,10,18,15,19,18),3,3,T)
> m=a
> for(i in 1:9){
+ m[c(i)]<-sum(a)-a[c(i)]
+ }
> m


thanks
AD
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221213.html


Re: [R] Problem about for loop

2011-01-17 Thread Martyn Byng
Hi,

Looks like the function theta takes a variable data, but that
variable is not being used in the body of the function (you are using
the global dataX and dataY, which will be the same each time the
function is called).
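
One way the function could be rewritten so that it works on its argument is sketched below. This is an illustration under stated assumptions, not the poster's final code: the column layout (five design columns followed by the response) is taken from the post, the simulated data are new, and the resampled matrix is passed directly instead of being wrapped in a list.

```r
# Mean Cook's distance computed from the matrix passed in, not from globals.
theta <- function(data) {
  X  <- data[, 1:5]                        # design matrix from the argument
  y  <- data[, 6]                          # response from the argument
  H  <- X %*% solve(crossprod(X)) %*% t(X) # hat matrix
  h  <- diag(H)                            # leverages
  e  <- as.vector(y - H %*% y)             # residuals
  p  <- ncol(X)
  s2 <- sum(e^2) / (nrow(X) - p)           # residual variance
  mean(e^2 / (p * s2) * h / (1 - h)^2)
}

set.seed(1)
n <- 50
X <- cbind(1, matrix(rnorm(4 * n, mean = 2), ncol = 4))
y <- as.vector(X %*% c(1, 2, 4, 3, 2) + rnorm(n, sd = 0.75))
data.k <- cbind(X, y)

# Each bootstrap resample now produces its own value.
res <- sapply(1:5, function(i) theta(data.k[sample(n, n, replace = TRUE), ]))
res
```

On the full data set the value can be compared with mean(cooks.distance(lm(y ~ X - 1))) as a sanity check.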

Martyn

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of ufuk beyaztas
Sent: 17 January 2011 13:21
To: r-help@r-project.org
Subject: [R] Problem about for loop


Hi everyone, my function looks like this:

e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
x0 <- c(rep(1,50))
x1 <- rnorm(n=50,mean=2,sd=1)
x2 <- rnorm(n=50,mean=2,sd=1)
x3 <- rnorm(n=50,mean=2,sd=1)
x4 <- rnorm(n=50,mean=2,sd=1)
y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
x2[1] = 10 #influential observation
y[1] = 10  #influential observation

data.x <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
data.y <- matrix(y,ncol=1)
data.k <- cbind(data.x,data.y)
dataX <- data.k[,1:5]
dataY <- data.k[,6]

theta <- function(data) {
B.cap <- solve(crossprod(dataX)) %*% crossprod(dataX,dataY)
P <- dataX %*% solve(crossprod(dataX)) %*% t(dataX)
Y.cap <- P %*% dataY
e <- dataY - Y.cap
dX <- nrow(dataX) - ncol(dataX)
var.cap <- crossprod(e) / (dX)
ei <- as.vector(dataY - dataX %*% B.cap)
pi <- diag(P)
var.cap.i <- (((dX) * var.cap) / (dX - 1)) -
(ei^2 / ((dX-1) * (1 - pi)))
ti <- ei / sqrt(var.cap * (1 - pi))
Ci <- (ti^2 / (ncol(dataX))) * (pi / (1 - pi))
output.i <- mean(Ci)}


result <- list()

for ( i in 1:5){

data <- replicate(1, data.k[sample(50,50,replace=T),], simplify = FALSE)

output.j <- theta(data)

result <- c(result,(list(output.j))) }
table <- do.call(rbind.data.frame,result)
names(table)=c("cooks")
table

This function gives the same result each time: the data change on every
iteration, but the mean(Ci) values are always the same.
Does anyone have an idea how to fix this? Thanks for any ideas.

-- 
View this message in context:
http://r.789695.n4.nabble.com/Problem-about-for-loop-tp3221210p3221210.html


Re: [R] rootogram for normal distributions

2011-01-17 Thread S Ellison
I was distracted enough by the possibility of hijacking hist() for this
to give it a go. 

The following code implements a basic hanging rootogram based on a
normal density with hist() breaks used as bins and bin midpoints used as
the hanging location (not exact, I suspect, but perhaps good enough).
Extensions to other distributions are reasonably obvious.

S Ellison


rootonorm <- function(x, breaks="Sturges", col="lightgrey", gap=0.2,
...) {
h <- hist(x, breaks=breaks)
nbins <- length(h$counts)
mu <- mean(x)
s <- sd(x)
normdens <- dnorm(h$mids, mu, s)

plot.range <- range(pretty(h$breaks))

plot(z <- seq(plot.range[1], plot.range[2], length.out=200),
dens <- dnorm(z, mu,s), type="n", ...)

d.gap <- min(diff(h$breaks)) * gap /2

for(i in 1:nbins) {
rect(h$breaks[i]+d.gap, normdens[i]-h$density[i],
h$breaks[i+1]-d.gap, normdens[i], col=col)

}

lines(z, dens, lwd=2)

points(h$mids, normdens)

}

set.seed(17*13)
y <- rnorm(500, 10,3)
rootonorm(y)
 

 Deepayan Sarkar deepayan.sar...@gmail.com 17/01/2011 05:06:54

On Sun, Jan 16, 2011 at 11:58 AM, Hugo Mildenberger
hugo.mildenber...@web.de wrote:
 Thank you very much for your qualified answers, and also for the
 link to the Tukey paper. I appreciate Tukey's writings very much.

Yes, thanks to Hadley for the nice reference, I hadn't seen it before.

 Looking at the lattice code (below), a possible implementation might
 involve  binning, not so?

 I see a problematic part here:

   xx <- sort(unique(x))

 Unique certainly works well with Poisson distributed data, but is
 essentially a no-op when confronted with continuous floating-point
 numbers.

True, but as Achim said, rootogram() is intended to work with data
arising from discrete distributions, not continuous ones. I see now
that this is not as explicit as it could be in the help page (although
frequency distribution gives a hint), which I will try to improve.

I don't think automatic handling of continuous distributions is simple
(because it is not clear how you would specify the reference
distribution). However, a little preliminary work will get you close
with the current implementation:

xnorm <- rnorm(1000)

## 'discretize' by binning and replacing data by bin midpoints
h <- hist(xnorm, plot = FALSE) # add arguments for more control
xdisc <- with(h, rep(mids, counts))

## Option 1: Assume bin probabilities proportional to dnorm()
norm.factor <- sum(dnorm(h$mids, mean(xnorm), sd(xnorm)))

rootogram(~ xdisc,
  dfun = function(x) {
  dnorm(x, mean(xnorm), sd(xnorm)) / norm.factor
  })

## Option 2: Compute probabilities explicitly using pnorm()

## pdisc <- diff(pnorm(h$breaks)) ## or estimated:
pdisc <- diff(pnorm(h$breaks, mean = mean(xnorm), sd = sd(xnorm)))
pdisc <- pdisc / sum(pdisc)

rootogram(~ xdisc,
  dfun = function(x) {
  f <- factor(x, levels = h$mids)
  pdisc[f]
  })

-Deepayan



Re: [R] median by geometric mean

2011-01-17 Thread Peter Ehlers

On 2011-01-17 02:19, S Ellison wrote:

Will this do?

x <- runif(20, 1, 100)

exp( median( log( x) ) )

S Ellison



That's what Hadley proposed, too. It's fine for
your example, but there is potentially a small
problem with this method: the data must be positive.
Since it's not unusual to see data with some zeros,
the log() would fail.

Depending on what type of data I was going to use
this modification of the median for, I would consider
modifying the (quite short) median.default function,
with appropriate additional data checks.

Peter Ehlers
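
A minimal sketch of the kind of modified median described above, offered as one possible reading rather than a definitive implementation. The name geo_median is hypothetical (it is not a built-in R function), and returning NaN when one of the two middle values is negative is an assumption about the desired behaviour. For an odd number of points it returns the middle value; for an even number it returns the geometric mean of the two middle values directly, so zeros never pass through log().

```r
# Hypothetical helper: median that uses the geometric mean of the two
# middle values when the number of data points is even.
geo_median <- function(x, na.rm = FALSE) {
  if (na.rm) {
    x <- x[!is.na(x)]
  } else if (any(is.na(x))) {
    return(NA_real_)
  }
  n <- length(x)
  if (n == 0L) return(NA_real_)
  xs <- sort(x)
  if (n %% 2L == 1L) {
    return(xs[(n + 1L) / 2L])          # odd n: the single middle value
  }
  mid <- xs[n / 2L + 0:1]              # even n: the two middle values
  if (any(mid < 0)) return(NaN)        # assumption: undefined for negatives
  sqrt(mid[1]) * sqrt(mid[2])          # geometric mean of the pair
}

geo_median(c(1, 4, 9, 100))            # middle values 4 and 9 give 6
geo_median(c(0, 0, 5, 6))              # middle values 0 and 5 give 0
```

For strictly positive data this should agree with exp(median(log(x))) up to rounding, since the log is monotone and the mean of two logs is the log of the geometric mean.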




Skull Crossboneswitch.of.agne...@gmail.com  15/01/2011 16:26

Hi All,

I need to calculate the median for an even number of data points. However,
instead of calculating the arithmetic mean of the two middle values, I need
to calculate their geometric mean.

Though I can code this in R, possibly in a few lines, I am wondering
whether there is already some built-in function.

Can somebody give a hint?

Thanks in advance



Re: [R] t-test calculation correct?

2011-01-17 Thread Bert Gunter
As this is apparently a post hoc test, this is wrong. The results are
biased. You have provided a nice example of how to do irreproducible
science.

Consult a local statistician for what this means if you do not know.

-- Bert Gunter

On Mon, Jan 17, 2011 at 4:35 AM, Sascha Vieweg saschav...@gmail.com wrote:
 A multinomial logit model (N=192) revealed (besides others) the following
 statistics for the outcome, y, and one predictor, x:

 - y = A (baseline, n=34)
 - y = B (n=26), B(x)=0.7323 (SE=0.2384)
 - y = C (n=132), B(x)=0.6535 (SE=0.2041)

 With a t-test I want to explore whether the two predictors differ
 significantly, and I use the following calculation (according to Bortz,
 2005, p.140):

 ##
 dm <- 0.7323 - 0.6535
 se.dm <- sqrt( (0.2384 / (34 + 26)) + (0.2041 / (34 + 132)) )
 t.val <- dm / se.dm
 pval <- (1 - pt(t.val, df=(34+26+132) )) * 2
 ##

 My question is where this calculation is wrong and why.

 Ref.: Bortz, J. (2005). Statistik für Human- und Sozialwissenschaftler (6.
 Aufl). Berlin: Springer.


 --
 Sascha Vieweg, saschav...@gmail.com

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Bert Gunter
Genentech Nonclinical Biostatistics
467-7374
http://devo.gene.com/groups/devo/depts/ncb/home.shtml



Re: [R] median by geometric mean

2011-01-17 Thread Peter Ehlers

I've been reminded by Prof. Brian Ripley that R's
log() function will indeed handle zeros appropriately.

Apologies to S Ellison and Hadley Wickham.

Peter Ehlers

On 2011-01-17 06:55, Peter Ehlers wrote:

On 2011-01-17 02:19, S Ellison wrote:

Will this do?

x <- runif(20, 1, 100)

exp( median( log( x) ) )

S Ellison



That's what Hadley proposed, too. It's fine for
your example, but there is potentially a small
problem with this method: the data must be positive.
Since it's not unusual to see data with some zeros,
the log() would fail.

Depending on what type of data I was going to use
this modification of the median for, I would consider
modifying the (quite short) median.default function,
with appropriate additional data checks.

Peter Ehlers




Skull Crossboneswitch.of.agne...@gmail.com   15/01/2011 16:26

Hi All,

I need to calculate the median for even number of data points.However
instead of calculating
the arithmetic mean of the two middle values,I need to calculate their
geometric mean.

Though I can code this in R, possibly in a few lines, but wondering if
there
is
already some built in function.

Can somebody give a hint?

Thanks in advance





Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread Dieter Menne


ADias wrote:
 
 Is there an expression to double the values of a matrix - without using a
 loop?
  
 
Why so complicated?
Dieter

> m = matrix(rep(1,20),nrow=4)
> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    1    1
[2,]    1    1    1    1    1
[3,]    1    1    1    1    1
[4,]    1    1    1    1    1
> m*3
     [,1] [,2] [,3] [,4] [,5]
[1,]    3    3    3    3    3
[2,]    3    3    3    3    3
[3,]    3    3    3    3    3
[4,]    3    3    3    3    3


-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221231.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Help for R plot

2011-01-17 Thread Dieter Menne


Fabrice Tourre wrote:
 
 How to plot as the coordinate  as in my attachment? I want to trim the
 coordinate  and one of plot as the figure in attachment. Does any one
 have such example?
 

http://markmail.org/message/3jn2sqoep36ckswb
(for a lattice-lookalike)

and package 

plotrix

Dieter

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Help-for-R-plot-tp3221035p3221232.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problems with TeachingDemos package

2011-01-17 Thread Greg Snow
What happens if you just load the R2wd package then run wdGet() yourself?

Also what OS, version of R, version of TeachingDemos, and version of R2wd are 
you using?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of gaiarrido
 Sent: Saturday, January 15, 2011 3:30 AM
 To: r-help@r-project.org
 Subject: Re: [R] Problems with TeachingDemos package
 
 
 R2wd is working but i received an alarm:
 
  wdtxtStart()
 Error en R2wd::wdGet() : tentativa de aplicar una no-función
  The translation is "attempt to apply non-function"
 
 
 
 -
 Mario Garrido Escudero
 PhD student
 Dpto. de Biología Animal, Ecología, Parasitología, Edafología y Qca.
 Agrícola
 Universidad de Salamanca
 --
 View this message in context: http://r.789695.n4.nabble.com/Problems-with-TeachingDemos-package-tp3218266p3218935.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread Pete Brecknock

try ...

new_m = m[c(2,7,8),c(1,4,6,7)]

HTH

Pete
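
To illustrate the indexing used above on something easy to check, here is a small sketch with a made-up 10 x 10 matrix (not the poster's data). A vector of row numbers and a vector of column numbers inside [ , ] keep exactly those rows and columns; negative indices drop them instead.

```r
# Made-up matrix: entry [i, j] equals i + 10 * (j - 1)
m <- matrix(1:100, nrow = 10)

rows <- c(2, 7, 8)
cols <- c(1, 4, 6, 7)

new_m <- m[rows, cols]      # keep rows 2, 7, 8 and columns 1, 4, 6, 7
new_m

dropped <- m[-rows, -cols]  # everything except those rows and columns
dim(dropped)
```

When only one row or column is selected, adding drop = FALSE (for example m[2, cols, drop = FALSE]) keeps the result a matrix rather than a plain vector.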
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221234.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] median by geometric mean

2011-01-17 Thread Keith Jewell
Just in case some of x are negative (the desired median still exists, as 
long as the two middle values are non -ve), how about:

x <- runif(20, -1, 100)
exp(median(log(pmax(0,x))))

It'll give -Inf if the two middle values are negative, when I guess we 
should get NaN, but I can't see a 1-line way to handle that!

Keith J

Peter Ehlers ehl...@ucalgary.ca wrote in message 
news:4d3468ef.5010...@ucalgary.ca...
 I've been reminded by Prof. Brian Ripley that R's
 log() function will indeed handle zeros appropriately.

 Apologies to S Ellison and Hadley Wickham.

 Peter Ehlers

 On 2011-01-17 06:55, Peter Ehlers wrote:
 On 2011-01-17 02:19, S Ellison wrote:
 Will this do?

 x <- runif(20, 1, 100)

 exp( median( log( x) ) )

 S Ellison


 That's what Hadley proposed, too. It's fine for
 your example, but there is potentially a small
 problem with this method: the data must be positive.
 Since it's not unusual to see data with some zeros,
 the log() would fail.

 Depending on what type of data I was going to use
 this modification of the median for, I would consider
 modifying the (quite short) median.default function,
 with appropriate additional data checks.

 Peter Ehlers


 Skull Crossboneswitch.of.agne...@gmail.com   15/01/2011 16:26
 Hi All,

 I need to calculate the median for even number of data points.However
 instead of calculating
 the arithmetic mean of the two middle values,I need to calculate their
 geometric mean.

 Though I can code this in R, possibly in a few lines, but wondering if
 there
 is
 already some built in function.

 Can somebody give a hint?

 Thanks in advance





Re: [R] median by geometric mean -- are we missing what's important?

2011-01-17 Thread Bert Gunter
Folks:

I know this may be overreaching, but are we missing what's important?
WHY do the zeros occur? Are they values less than a known or unknown
LOD? -- and/or is there positive mass on zero? In either case, using
logs to calculate a geometric mean may not make sense. Paraphrasing
Greg Snow, what is the scientific question? What is the model?

Cheers,
Bert



On Mon, Jan 17, 2011 at 9:13 AM, Keith Jewell k.jew...@campden.co.uk wrote:
 Just in case some of x are negative (the desired median still exists, as
 long as the two middle values are non -ve), how about:

 x <- runif(20, -1, 100)
 exp(median(log(pmax(0,x))))

 It'll give -Inf if the two middle values are negative, when I guess we
 should get NaN, but I can't see a 1-line way to handle that!

 Keith J

 Peter Ehlers ehl...@ucalgary.ca wrote in message
 news:4d3468ef.5010...@ucalgary.ca...
 I've been reminded by Prof. Brian Ripley that R's
 log() function will indeed handle zeros appropriately.

 Apologies to S Ellison and Hadley Wickham.

 Peter Ehlers

 On 2011-01-17 06:55, Peter Ehlers wrote:
 On 2011-01-17 02:19, S Ellison wrote:
 Will this do?

 x <- runif(20, 1, 100)

 exp( median( log( x) ) )

 S Ellison


 That's what Hadley proposed, too. It's fine for
 your example, but there is potentially a small
 problem with this method: the data must be positive.
 Since it's not unusual to see data with some zeros,
 the log() would fail.

 Depending on what type of data I was going to use
 this modification of the median for, I would consider
 modifying the (quite short) median.default function,
 with appropriate additional data checks.

 Peter Ehlers


 Skull Crossboneswitch.of.agne...@gmail.com   15/01/2011 16:26
 Hi All,

 I need to calculate the median for even number of data points.However
 instead of calculating
 the arithmetic mean of the two middle values,I need to calculate their
 geometric mean.

 Though I can code this in R, possibly in a few lines, but wondering if
 there
 is
 already some built in function.

 Can somebody give a hint?

 Thanks in advance



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Bert Gunter
Genentech Nonclinical Biostatistics



Re: [R] median by geometric mean -- are we missing what's important?

2011-01-17 Thread Joshua Wiley
On Mon, Jan 17, 2011 at 9:23 AM, Bert Gunter gunter.ber...@gene.com wrote:
 Folks:

 I know this may be overreaching, but are we missing what's important?
 WHY do the zeros occur? Are they values less than a known or unknown
 LOD? -- and/or is there positive mass on zero? In either case, using
 logs to calculate a geometric mean may not make sense. Paraphrasing

Isn't this a bit of a general problem with the geometric mean: if there
are 0s or an odd number of negative numbers, it becomes 0 or imaginary
(please do correct me if I'm wrong)?

sqrt(prod(c(2, 0, 54)))
sqrt(prod(c(-2, 2)))


 Greg Snow, what is the scientific question? What is the model?

 Cheers,
 Bert



 On Mon, Jan 17, 2011 at 9:13 AM, Keith Jewell k.jew...@campden.co.uk wrote:
 Just in case some of x are negative (the desired median still exists, as
 long as the two middle values are non -ve), how about:

 x <- runif(20, -1, 100)
 exp(median(log(pmax(0,x))))

 It'll give -Inf if the two middle values are negative, when I guess we
 should get NaN, but I can't see a 1-line way to handle that!

 Keith J

 Peter Ehlers ehl...@ucalgary.ca wrote in message
 news:4d3468ef.5010...@ucalgary.ca...
 I've been reminded by Prof. Brian Ripley that R's
 log() function will indeed handle zeros appropriately.

 Apologies to S Ellison and Hadley Wickham.

 Peter Ehlers

 On 2011-01-17 06:55, Peter Ehlers wrote:
 On 2011-01-17 02:19, S Ellison wrote:
 Will this do?

 x <- runif(20, 1, 100)

 exp( median( log( x) ) )

 S Ellison


 That's what Hadley proposed, too. It's fine for
 your example, but there is potentially a small
 problem with this method: the data must be positive.
 Since it's not unusual to see data with some zeros,
 the log() would fail.

 Depending on what type of data I was going to use
 this modification of the median for, I would consider
 modifying the (quite short) median.default function,
 with appropriate additional data checks.

 Peter Ehlers


 Skull Crossboneswitch.of.agne...@gmail.com   15/01/2011 16:26
 Hi All,

 I need to calculate the median for even number of data points.However
 instead of calculating
 the arithmetic mean of the two middle values,I need to calculate their
 geometric mean.

 Though I can code this in R, possibly in a few lines, but wondering if
 there
 is
 already some built in function.

 Can somebody give a hint?

 Thanks in advance



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Bert Gunter
 Genentech Nonclinical Biostatistics

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] CSV value not being read as it appears

2011-01-17 Thread Karl Ove Hufthammer
David Scott wrote:

 As a further note, this is a reminder that whenever you get data via a
 spreadsheet the first thing to do is examine it and clean up any
 problems. A basic requirement is to tabulate any categorical variable.

I like using the ‘describe’ function in the ‘Hmisc’ package for this. If you 
run the result through the ‘latex’ function, you get an even nicer output, 
with small histograms for each numerical variable.

-- 
Karl Ove Hufthammer



Re: [R] CSV value not being read as it appears

2011-01-17 Thread Karl Ove Hufthammer
Peter Ehlers wrote:

 It is hardly R's fault that Excel users routinely commit
 crimes against data.

A ‘fortune’ candidate?

-- 
Karl Ove Hufthammer



[R] Sampling question

2011-01-17 Thread Chris Mcowen
Dear list, I have a sampling question.

I have a dataframe of 1500 species and 13 life history traits. 

small example code:

traits <- data.frame(letters[1:9],
                     sample(letters, 9),
                     sample(letters, 9),
                     sample(letters, 9),
                     sample(letters, 9),
                     sample(letters, 9),
                     sample(letters, 9),
                     sample(letters, 9),
                     sample(letters, 9))
colnames(traits) <- c("species", 1:8)

What i want to do is:

Sample a number of species from the data frame in increments of 50: 50
species, 100 species, 150, 200, ... up to 1500. When I sample them I also want
the traits associated with them to be kept intact. For each sample size I would
like 1000 repetitions: 50 species with their life history traits randomly
sampled 1000 times, then 100 species with their life history traits sampled
1000 times, and so on. I appreciate that at the higher numbers (i.e. 1500
species) every sample drawn without replacement is the same set, so I will
need to use replace = TRUE.

Then i have a function i want to run on the sample so for the 50 species  i 
want to run a function which requires the name of the sample

GFD(50species_sample1)
GFD(50species_sample2) etc to 
GFD(50species_sample1000)

Then 

GFD(100species_sample1)

etc.
With the results put into a data frame.

I am relatively new to R, i could probably hack together a code but i am unsure 
how to join it up so i sample, retain the data and then use it in a function?

Any help would be greatly appreciated.

I appreciate this is a lot to ask so any help would be greatly appreciated.

Thanks in advance,

Chris
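
A rough sketch of how the steps in the question might be joined up, under stated assumptions: resample_stat and GFD_demo are hypothetical names, GFD_demo is only a stand-in for the poster's GFD() (whose real interface is not shown), and the function passed in is assumed to return a single number per sample. Rows are sampled by index, so each species keeps its traits, and the results come back as one data frame with a row per (size, repetition) pair. The sizes and repetition count are scaled down here so the example runs quickly.

```r
# Draw n_rep random subsets of each size and apply fun() to every subset.
# fun must take a data frame and return a single number.
resample_stat <- function(traits, sizes, n_rep, fun, replace = FALSE) {
  grid <- expand.grid(rep = seq_len(n_rep), size = sizes)
  grid$value <- mapply(function(size, i) {
    rows <- sample(nrow(traits), size, replace = replace)
    fun(traits[rows, , drop = FALSE])   # whole rows, so traits stay attached
  }, grid$size, grid$rep)
  grid
}

# Toy data standing in for the 1500 x 13 data frame in the post
set.seed(42)
traits <- data.frame(species = paste0("sp", 1:200),
                     trait1  = sample(letters, 200, replace = TRUE),
                     trait2  = sample(letters, 200, replace = TRUE))

# Hypothetical stand-in for GFD(): number of distinct species in the sample
GFD_demo <- function(d) length(unique(d$species))

out <- resample_stat(traits, sizes = seq(50, 200, by = 50), n_rep = 20,
                     fun = GFD_demo)
head(out)
```

For the real data the call would use sizes = seq(50, 1500, by = 50) and n_rep = 1000. Passing the sampled data frame straight to the function avoids having to create and name thousands of separate objects; replace = TRUE can be set if sampling with replacement is wanted at the largest sizes.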



[R] to append a column to a data frame, has I use loop/if in my case?

2011-01-17 Thread Daniel Wu
days=Sys.Date()-1:70
price=abs(rnorm(70))
regular=rep(c(0,0,0,0,1,0,1,0,0,1),c(7,7,7,7,7,7,7,7,7,7))
y=data.frame(cbind(days,price,regular))


y is like
   days      price regular
1 14990 0.16149463       0
2 14989 1.69519358       0
3 14988 1.57821998       0
4 14987 0.47614311       0
5 14986 0.87016180       0
6 14985 2.55679229       0
7 14984 0.89753533       0


the output I want:
have another column appended to y, whose value is the max price in the recent 2 
**regular** weeks.
So if the current row is today, then get the max price of the past 14 days 
(including today) if the last 2 week are regular weeks,  if one of the last 2 
weeks is not regular week, then I need to go back further to find the max 
price, as I need the max price for the last 2 **regular** weeks. How can I do 
that? Or I have to use loop/if to do it?




BTW, why the days is like 14990,14989, after cbind(days,price,regular)? before 
the cbind, days is like the format 2010-12-23.
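
On the BTW question, a short sketch of what is probably happening: cbind() builds a plain numeric matrix, which cannot carry the Date class, so the dates are reduced to their underlying day counts since 1970-01-01. Passing the vectors to data.frame() directly keeps each column's own class. The names y_bad and y_ok below are only for illustration.

```r
days    <- Sys.Date() - 1:70
price   <- abs(rnorm(70))
regular <- rep(c(0, 0, 0, 0, 1, 0, 1, 0, 0, 1), c(7, 7, 7, 7, 7, 7, 7, 7, 7, 7))

# cbind() first makes a numeric matrix, so the Date class is lost
y_bad <- data.frame(cbind(days, price, regular))
class(y_bad$days)

# data.frame() on the vectors themselves keeps days as a Date column
y_ok <- data.frame(days, price, regular)
class(y_ok$days)

# An already converted column can be turned back into dates
recovered <- as.Date(y_bad$days, origin = "1970-01-01")
```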


[R] Log difference in a dataframe column

2011-01-17 Thread eric

What am I doing wrong here ? And what's the right way to calculate the log
differences in a column in a df ?

# first 3 rows of 5000 rows
y[1:3,]

        Date  Open  High   Low Close
1 1983-03-30 29.96 30.51 29.96 30.35
2 1983-03-31 30.35 30.55 30.20 30.24
3 1983-04-04 30.25 30.65 30.24 30.39

#equation in question ...why is this giving zeros ?
y1 <- 100*log(y[,5]/(lag(y[,5],1)))

# first 10 values from the equation...all zeros
head(y1,10)
 [1] 0 0 0 0 0 0 0 0 0 0
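
A likely explanation, sketched with the three Close values shown above and assuming lag() here is the one from the stats package: on a plain vector, stats::lag() only shifts the time attribute and leaves the values where they are, so every ratio is 1 and every log is 0. Using diff(log(x)) gives the log differences directly.

```r
close <- c(30.35, 30.24, 30.39)      # Close column from the three rows shown

# stats::lag() on a plain vector keeps the same values in the same positions
shifted <- stats::lag(close, 1)
all(as.vector(shifted) == close)     # TRUE, so log(x / lag(x)) is log(1) = 0

# Log differences in percent; NA pads the first row, which has no predecessor
y1 <- c(NA, 100 * diff(log(close)))
y1
```

For the data frame in the question this would become something like y$logret <- c(NA, 100 * diff(log(y$Close))), where logret is just an illustrative column name.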
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Log-difference-in-a-dataframe-column-tp3221225p3221225.html
Sent from the R help mailing list archive at Nabble.com.



[R] Using anova() with glmmPQL()

2011-01-17 Thread Toby Marthews
Dear R HELP,

ABOUT glmmPQL and the anova command. Here is an example of a repeated-measures 
ANOVA focussing on the way starling masses vary according to (i) roost 
situation and (ii) time (two time points only).

library(nlme);library(MASS)
stmass=c(78,88,87,88,83,82,81,80,80,89,78,78,85,81,78,81,81,82,76,74,79,73,79,75,77,78,80,78,83,84,77,68,75,70,74,84,80,75,76,75,85,88,86,95,100,87,98,86,89,94,84,88,91,96,86,87,93,87,94,96,91,90,87,84,86,88,92,96,83,85,90,87,85,81,84,86,82,80,90,77)
roostsitu=factor(rep(rep(c("tree","nest-box","inside","other"),each=10),times=2),levels=c("tree","nest-box","inside","other"))
mnth=factor(c(rep("Nov",times=40),rep("Jan",times=40)),levels=c("Nov","Jan"))
subjectnum=c(1:10,1:10,1:10,1:10,1:10,1:10,1:10,1:10)
subject=factor(paste(roostsitu,subjectnum,sep=""))
dataf=data.frame(mnth,roostsitu,subjectnum,subject,stmass)
lmeres=lme(fixed=stmass~mnth*roostsitu,random=~1|subject/mnth,na.action=na.exclude)
anova(object=lmeres,test="Chisq")

               numDF denDF   F-value p-value
(Intercept)        1    36 31143.552  <.0001
mnth               1    36    95.458  <.0001
roostsitu          3    36    10.614  <.0001
mnth:roostsitu     3    36     0.657  0.5838

I can conclude from this that variation with both roost situation and month are 
significant, but with no interaction term. So far so good. However, say I were 
interested only in whether or not those starlings were heavier or lighter than 
78g: seemingly, I could change my response variable and analyse like this -

stmassheavy=ifelse(stmass>78,1,0)
lmeres1=lme(fixed=stmassheavy~mnth*roostsitu,random=~1|subject/mnth,na.action=na.exclude,family=binomial)
anova(object=lmeres1,test="Chisq")

but I get errors doing that. After a certain amount of web searching, I find 
that I'm supposed to use glmmPQL for this so I tried:

lmeres2=glmmPQL(fixed=stmassheavy~mnth*roostsitu,random=~1|subject/mnth,na.action=na.exclude,family=binomial)
anova(object=lmeres2,test="Chisq")

The glmmPQL command runs, but I get "Error in anova.glmmPQL(object = lmeres, 
test = "Chisq") : 'anova' is not available for PQL fits". Looking into this, I 
find that I am not supposed to use the anova command in conjunction with 
glmmPQL (several posts from Brian Ripley 
http://r.789695.n4.nabble.com/R-glmmPQL-in-2-3-1-td808574.html and 
http://www.biostat.wustl.edu/archives/html/s-news/2002-06/msg00055.html and 
http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg46894.html ) even 
though it appears that the earlier versions of glmmPQL did allow the anova 
command to work (before ~2004).
   However, I couldn't find any other way to run a repeated-measures ANOVA with 
family=binomial. After a while longer on Google, I found a 'workaround' from 
Spencer Graves (on 
http://markmail.org/message/jddj6aq66wdidrog#query:how%20to%20use%20anova%20with%20glmmPQL+page:1+mid:jddj6aq66wdidrog+state:results
 ):

class(lmeres2)="lme"
anova(object=lmeres2,test="Chisq")

               numDF denDF   F-value p-value
(Intercept)        1    36 182.84356  <.0001
mnth               1    36 164.57288  <.0001
roostsitu          3    36  17.79263  <.0001
mnth:roostsitu     3    36   3.26912  0.0322

Which does give me a result and tells me that the interaction term is 
significant here. HOWEVER, on that link Douglas Bates told Spencer Graves that 
this wasn't an appropriate method.

I haven't found any other workarounds for this except some general advice that 
I should move onto using the lmer command (which I can't do because I need to 
get p-values for my fits and according to 
https://stat.ethz.ch/pipermail/r-help/2006-May/094765.html I won't get those 
from lmer).

My questions are: (1) Is lmer the only way to do a binomial repeated-measures 
ANOVA in R? (which means that there's no way to do such an ANOVA in R without 
losing the p-values) and (2) if I am supposed to be using glmmPQL for this 
simple situation, what am I doing wrong?

Thanks very much for any help anyone can give me.

best,
Toby Marthews



Re: [R] cannot allocate vector of size ... in RHLE5 PAE kernel

2011-01-17 Thread Mauricio Zambrano
Following the advice of a colleague, I put the gc() and gcinfo(TRUE)
commands just before the line where I got the problem, and their output
was:

          used (Mb) gc trigger  (Mb)  max used   (Mb)
Ncells  471485 12.6    1704095  45.6   7920371  211.5
Vcells 6408885 48.9  113919753 869.2 347651599 2652.4
Garbage collection 538 = 323+101+114 (level 2) ...
13.0 Mbytes of cons cells used (29%)
49.0 Mbytes of vectors used (7%)

Error: cannot allocate vector of size 238.1 Mb


If I understood correctly, I should have enough memory for allocating
the new matrix (Q.obs <- matrix(NA, nrow=6940, ncol=9000))

Thanks in advance for any help,

Mauricio


-- 
===
Linux user #454569 -- Ubuntu user #17469
===

2011/1/17 Martin Maechler maech...@stat.math.ethz.ch:
 MZ == Mauricio Zambrano hzambran.newsgro...@gmail.com
     on Mon, 17 Jan 2011 11:46:44 +0100 writes:

    MZ Dear R community,
    MZ I'm running R 32 bits in a 64-bits machine (with 16Gb of Ram) using a
    MZ PAE kernel, as you can see here:

    MZ $ uname -a
    MZ Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010
    MZ i686 i686 i386 GNU/Linux


    MZ When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940,
    MZ ncol=9000) ), I got the following error:


     Error: cannot allocate vector of size 238.3 Mb


    MZ However, the amount of free memory in my machine seems to be much
    MZ larger than this:

    MZ system("free")
    MZ              total       used       free     shared    buffers     cached
    MZ Mem:      12466236    6354116    6112120          0      67596    2107556
    MZ -/+ buffers/cache:    4178964    8287272
    MZ Swap:     12582904          0   12582904


    MZ I tried to increase the memory limit available for R by using:

    MZ $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M


    MZ but it didn't work.


    MZ Any hint about how can I get R using all the memory available in the machine ?

 Install a 64-bit version of Linux, i.e., ubuntu in your case
 and work from there.
 I don't think there's a way around that.

 Martin
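
As a back-of-the-envelope check on the numbers in this thread (a sketch, not a diagnosis): the size in the error message is just the single block R was trying to allocate for the new matrix. matrix(NA, ...) is a logical matrix at 4 bytes per cell, which matches the 238.3 Mb reported; a numeric matrix of the same shape would need twice that. As far as I understand it, a PAE kernel lets the system use more physical RAM, but a 32-bit R process is still limited to a 32-bit address space and needs one contiguous free block of that size, which is consistent with the advice above to move to a 64-bit system.

```r
n_cells <- 6940 * 9000               # number of cells in Q.obs

# matrix(NA, ...) is logical: 4 bytes per cell
logical_mb <- n_cells * 4 / 1024^2
# a numeric (double) matrix needs 8 bytes per cell
double_mb  <- n_cells * 8 / 1024^2

round(logical_mb, 1)                 # 238.3, the size in the error message
round(double_mb, 1)                  # 476.5
```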




Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread ADias

Hi,

yes it works perfectly.

I have another question:

Is there a way of selecting, with a vector, the values I wish to take out of a
matrix?

Example:

I have this matrix and I want to take out the numbers in bold and get the
second matrix below

m
  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]   17165   192   191   15 8
 [2,]77   2032   1699   1913
 [3,]243   11   18   11   14   133 1
 [4,]3757   17   18   106515
 [5,]8   20   13   108   12   20   19116
 [6,]9   141   12   12   12   17   18   1017
 [7,]3   10   112   129   186   19 9
 [8,]   132   17   16   1889   14916
 [9,]94   1141   1797   2012
[10,]91488   19   198   1718

 [,1] [,2] [,3] [,4]
[1,]73   169
[2,]329   18
[3,]   13   1689

thanks
AD



[R] Importing multiple text files with lapply.

2011-01-17 Thread Simon Kiss
Hello,
I'm trying to read in 50 text files with dates as content to create a list of 
tables.  

a is the list of filenames that need to be read in.

The following command returns the following error:
mylist <- lapply(a, read.table(header=TRUE, sep="\n"))

Error in read.table(header = TRUE, sep = "\n") : 
  element 1 is empty;
   the part of the args list of 'is.character' being evaluated was:
   (file)

Does anyone have any suggestions?
Yours, Simon Kiss
*
Simon J. Kiss, PhD
Assistant Professor, Wilfrid Laurier University
73 George Street
Brantford, Ontario, Canada
N3T 2C9
Cell: +1 519 761 7606



Re: [R] R-help Digest, Vol 95, Issue 17

2011-01-17 Thread Prof. John C Nash
For those issues with optimization methods (optim, optimx, and others) I see, a
good percentage are because the objective function (or gradient if
user-supplied) is mis-coded. However, an almost equal number are due to
functions getting into overflow or underflow territory and yielding quantities
that the optimization tools cannot handle (NA or Inf etc.)

Two general approaches I find helpful:
1) even if there are no actual bounds on parameters, put in reasonable limits.
They don't need to be too tight, just enough to keep the parameters from
giving a silly objective function
2) do some evaluations of the objective to make sure it is really being
properly calculated. Never hurts to have some known outcomes.

Beyond this, we get into reparametrizations. Great idea, but far too much work
for most of us, even if we work in the field.
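
The two suggestions above can be sketched roughly as follows. Everything here
is an assumption made for illustration: the exponential sample, the objective
negloglik, and the particular bounds are hypothetical and do not come from the
thread being answered. Only optim() and its "L-BFGS-B" method are standard R.

```r
# Hypothetical data: a sample from an exponential distribution
set.seed(1)
x <- rexp(200, rate = 2)

# Hypothetical objective: negative log-likelihood as a function of the rate
negloglik <- function(rate) {
  -sum(dexp(x, rate = rate, log = TRUE))
}

# Approach 2: evaluate the objective where the answer is known.
# At rate = 1 the negative log-likelihood reduces to sum(x).
check_ok <- isTRUE(all.equal(negloglik(1), sum(x)))

# Approach 1: loose but finite bounds keep the optimizer away from
# parameter values that would give NA or Inf
fit <- optim(par = 1, fn = negloglik, method = "L-BFGS-B",
             lower = 1e-6, upper = 100)

# For this model the maximum-likelihood estimate is 1 / mean(x)
print(c(optim = fit$par, closed_form = 1 / mean(x)))
```

Having a closed-form answer to compare against is the "known outcome" in this
sketch; for a real objective a few hand-computed evaluations serve the same
purpose.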

Best,

JN


On 01/17/2011 06:00 AM, r-help-requ...@r-project.org wrote:
 From: Uwe Ligges lig...@statistik.tu-dortmund.de
 To: Jinrui Xu jinru...@umich.edu
 Cc: r-help@r-project.org
 Subject: Re: [R] fgev_error_matrix_singular



[R] Fw: Re: help in calculating ar on ranked vector

2011-01-17 Thread Raymond Wong


--- On Mon, 1/17/11, Raymond Wong raywong...@yahoo.ca wrote:


From: Raymond Wong raywong...@yahoo.ca
Subject: Re: [R] help in calculating ar on ranked vector
To: Uwe Ligges lig...@statistik.tu-dortmund.de
Received: Monday, January 17, 2011, 11:56 AM







Thanks Uwe:
 
Here is my code. the first set of print statements work, but not the second.
 
#
z <- as.vector(na.omit(z)) #remove na
nz <- length(z)
rz <- rank(z, ties.method="average")
#
print(ar(z, order.max=1, method="burg"))
print(ar(z, order.max=1, method="ols"))
print(ar(z, order.max=1, method="mle"))
print(ar(z, order.max=1, method="yule-walker"))
#
# **
#
print(ar(rz, order.max=1, method="burg"))
print(ar(rz, order.max=1, method="ols"))
print(ar(rz, order.max=1, method="mle"))
print(ar(rz, order.max=1, method="yule-walker"))
#
 
What did I miss?
Thanks a million.
 
Raymond

--- On Fri, 1/14/11, Uwe Ligges lig...@statistik.tu-dortmund.de wrote:


From: Uwe Ligges lig...@statistik.tu-dortmund.de
Subject: Re: [R] help in calculating ar on ranked vector
To: Raymond Wong raywong...@yahoo.ca
Cc: R-help@r-project.org
Received: Friday, January 14, 2011, 12:42 PM


Works for my examples. But you have not specified what you actual call 
to ar() was.

Uwe Ligges




On 12.01.2011 21:17, Raymond Wong wrote:
 I was using ar(stats) to calculate autoregressive coefficient. It works on 
 vector z, but it will not work on vector rz <- rank(z, 
 ties.method="average").  What did I miss?
 Any info will be greatly appreciated.  TIA








Re: [R] Using summaryBy with weighted data

2011-01-17 Thread Solomon Messing
Thanks Josh.  I built on your example and ended up with the code below. If you
or anyone sees any issues please let me know.  It would be great if there were
a slicker way to get these kinds of summary stats in R, but this gets the job
done.

# takes a data frame z with columns `response` and `weights`;
# returns weighted mean, weighted SE, and N
# (wtd.var() comes from the Hmisc package loaded below)
msenw = function(z){
  N = length(na.omit(z)$response)
  i = which(!is.na(z$response))
  return(
    c( W.M = weighted.mean(z$response, z$weights, na.rm=TRUE),
       W.SE = sqrt(wtd.var(z$response, weights =
         z$weights))/sqrt(sum(z$weights[i])),
       N = N ) )
}

library(doBy)
library(Hmisc)
## make up some data (easier)
mydata <- data.frame(response = rnorm(100),
  group = rep(1:5, each = 20), weights = runif(100, 0, 1))

xy <- by(mydata, mydata$group, msenw)
data.frame( group = names(c(xy)), do.call(rbind, xy) )

## can be extended to other data using:
xy <- by(data.frame(response = mydata$response, weights = mydata$weights),
  mydata$group, msenw)
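
As a cross-check that needs neither doBy nor Hmisc, here is a minimal base-R
sketch of the per-group weighted mean only (not the weighted SE). The made-up
data mirror the example above; the names wm and manual are illustrative. It
compares weighted.mean() applied to each group with the manual ratio
sum(w * x) / sum(w).

```r
set.seed(42)
# Made-up data with the same layout as in the example above
mydata <- data.frame(response = rnorm(100),
                     group    = rep(1:5, each = 20),
                     weights  = runif(100, 0, 1))

# Split the data frame by group, so that each piece carries its own weights
wm <- sapply(split(mydata, mydata$group),
             function(z) weighted.mean(z$response, z$weights))

# Manual computation: sum(w * x) / sum(w) within each group
manual <- with(mydata,
               tapply(response * weights, group, sum) /
               tapply(weights, group, sum))

print(data.frame(group = names(wm), weighted.mean = as.numeric(wm)))
```

Because split() hands the function a complete data frame per group, the
response and the weights stay aligned, which is the point made about by() in
this thread.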


Solomon Messing
www.stanford.edu/~messing



On Jan 16, 2011, at 11:16 PM, Joshua Wiley wrote:

 Dear Solomon,
 
 On Sun, Jan 16, 2011 at 10:27 PM, Solomon Messing
 solomon.mess...@gmail.com wrote:
 Dear Soren and R users:
 
 I am trying to use the summaryBy function with weights.  Is this possible?  
 An example that illustrates what I am trying to do follows:
 
 library(doBy)
 ## make up some data
 response = rnorm(100)
 group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20))
 weights = runif(100, 0, 1)
 mydata = data.frame(response,group,weights)
 
 ## run summaryBy without weights:
 summaryBy(response~group, data = mydata, FUN = mean)
 
 ## attempt to run summaryBy with weights, throws error
 summaryBy(x~group, data = mydata, FUN = weighted.mean, w=weights )
 
 ## throws the error:
 # Error in tapply(lh.data[, lh.var[vv]], rh.string.factor, function(x) { :
 #   arguments must have same length
 
 My guess is that summaryBy is not giving weighted.mean() each group of 
 weights, but instead is passing all of the weights in the data set each time 
 it calls weighted.mean().
 
 Yes, of course.  It has no way of knowing that the weights should also
 be broken down by group; they are not in the formula.
 
  Do you know if there is some way to get summaryBy to pass weights to 
 weighted.mean() only for each group?
 
 Ideally there would be a way to pass more than one variable to a
 function (e.g., response and weights) or just an entire object
 (mydata) broken down by group.  Then you would just make a wrapper
 function to pass the right values to the x and w arguments of
 weighted.mean.  Instead here is a somewhat hacked version:
 
 library(doBy)
 ## make up some data (easier)
 mydata <- data.frame(response = rnorm(100),
 group = rep(1:5, each = 20), weights = runif(100, 0, 1))
 
 ## manually compute weighted mean
 tmp <- summaryBy(response*weights ~ group, data = mydata, FUN = sum)
 tmp[,2] <- tmp[,2]/with(mydata, tapply(weights, group, sum))
 tmp ## weighted means
 
 ## here's the 'problem', if you will, even with  +, they are passed
 ## one at a time
 summaryBy(response + weights ~ group, data = mydata, FUN = str)
 summaryBy(mydata ~ group, data = mydata, FUN = str)
 
 ## here is an option using by():
 xy <- by(mydata, mydata$group, function(z) weighted.mean(z$response, 
 z$weights))
 xy
 ## if you don't like the formatting
 data.frame(group = names(c(xy)), weighted.mean = c(xy))
 
 HTH,
 
 Josh
 
 
 I suspect this functionality would be a tremendous benefit to R users who 
 regularly work with weighted data, such as myself.
 
 Thanks,
 
 Solomon Messing
 www.stanford.edu/~messing
 
 PS I know this basic example can be done using lapply(split(...)) approach 
 referenced here:
 
 http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg12349.html
 
 but for more complex tasks the lapply approach will mean writing a lot of 
 extra code to run everything and then to get things formatted as nicely as 
 summaryBy() was designed to do.
 
 
 
 
 
 
 -- 
 Joshua Wiley
 Ph.D. Student, Health Psychology
 University of California, Los Angeles
 http://www.joshuawiley.com/




[R] [Fwd: Re: R-help Digest, Vol 95, Issue 17]

2011-01-17 Thread nashjc
Apologies if this is posted twice. The r-help mailing system gave an error
(reported to moderator) on first try, but it may have gone through.

 Original Message 
Subject: Re: R-help Digest, Vol 95, Issue 17
From:Prof. John C Nash nas...@uottawa.ca
Date:Mon, 17 January, 2011 1:04 pm
To:  r-help@r-project.org
Cc:  lig...@statistik.tu-dortmund.de
 jinru...@umich.edu
--



[R] Summing data frame columns on identical data

2011-01-17 Thread Steve Murray

Dear all,

I have 9 data frames, and I'm simply trying to sum the values of column 3 (on a 
row-by-row basis). However, there are a slightly different number of rows in 
each data frame, so I'm receiving the following error: Error in 
Ops.data.frame(mrunoff_207101[3], mrunoff_207102[3]) : 
  + only defined for equally-sized data frames.

Here is what I'm attempting to do:

 arunoff_2071 <- cbind(mrunoff_207101[1:2], (mrunoff_207101[3] + 
 mrunoff_207102[3] + mrunoff_207103[3] + mrunoff_207104[3] + mrunoff_207105[3] 
 + mrunoff_207106[3] + mrunoff_207107[3] + mrunoff_207108[3] + 
 mrunoff_207109[3]))


Is there an easy way of summing based on congruent values in columns 1 and 2? 
The only way I can think of would be to use merge, but this would involve doing 
this for every pair of data frames.

The data for each data frame look like this:

 head(mrunoff_207101)
  Latitude Longitude          FPC
1     5.75      0.25 0.0112384744
2     6.25      0.25 0.0019959067
3     6.75      0.25 0.0003245941
4     7.25      0.25 0.0011973676
5     7.75      0.25 0.0001062602
6     8.25      0.25 0.0451578423


Any suggestions on how to achieve this easily will be very welcome.

Many thanks,

Steve
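
One possible approach, sketched under assumptions: stack the data frames with
rbind() and let aggregate() sum FPC for each Latitude/Longitude pair. The
three small frames d1 to d3 are hypothetical stand-ins for the nine real
ones. Note that a cell missing from some frames is simply summed over the
frames where it does occur, which differs from what repeated merge() calls
with default settings would return.

```r
# Hypothetical stand-ins for the monthly data frames (unequal row counts)
d1 <- data.frame(Latitude = c(5.75, 6.25, 6.75), Longitude = 0.25,
                 FPC = c(1, 2, 3))
d2 <- data.frame(Latitude = c(5.75, 6.25), Longitude = 0.25,
                 FPC = c(10, 20))
d3 <- data.frame(Latitude = c(6.25, 6.75, 7.25), Longitude = 0.25,
                 FPC = c(100, 200, 300))

# Stack all frames, then sum FPC within each coordinate pair
stacked <- do.call(rbind, list(d1, d2, d3))
total   <- aggregate(FPC ~ Latitude + Longitude, data = stacked, FUN = sum)

print(total)
```

With the real data, the list passed to do.call() could be built with
something like mget(), assuming the frames share the same three column names.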
  


Re: [R] Importing multiple text files with lapply.

2011-01-17 Thread jim holtman
try:

mylist <- lapply(a, read.table, header = TRUE, sep = '\n')

Also, is the separator really '\n', meaning a newline?  What exactly
does the data look like?
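
A self-contained sketch of the pattern, for anyone who wants to try it without
the original files. The temporary files and their contents are made up, and
the default separator is used here instead of '\n'. The point is only that
lapply() takes the function itself (read.table, not a call to it) and passes
any further arguments along to each call.

```r
# Hypothetical setup: three small one-column files in a temporary folder
tmp <- tempfile("dates_")
dir.create(tmp)
for (i in 1:3) {
  writeLines(c("date", "2011-01-15", "2011-01-16"),
             file.path(tmp, sprintf("file%d.txt", i)))
}

# 'a' plays the role of the vector of file names
a <- list.files(tmp, pattern = "\\.txt$", full.names = TRUE)

# Pass the function and its extra arguments separately
mylist <- lapply(a, read.table, header = TRUE, stringsAsFactors = FALSE)
names(mylist) <- basename(a)

# Equivalent form with an explicit anonymous function
mylist2 <- lapply(a, function(f) read.table(f, header = TRUE,
                                            stringsAsFactors = FALSE))

str(mylist[[1]])
```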


-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?



Re: [R] Log difference in a dataframe column

2011-01-17 Thread Peter Ehlers

On 2011-01-17 07:44, eric wrote:


What am I doing wrong here ? And what's the right way to calculate the log
differences in a column in a df ?

# first 3 rows of 5000 rows
y[1:3,]

        Date  Open  High   Low Close
1 1983-03-30 29.96 30.51 29.96 30.35
2 1983-03-31 30.35 30.55 30.20 30.24
3 1983-04-04 30.25 30.65 30.24 30.39

#equation in question ...why is this giving zeros ?
y1 <- 100*log(y[,5]/(lag(y[,5],1)))

# first 10 values from the equation...all zeros
head(y1,10)
  [1] 0 0 0 0 0 0 0 0 0 0


Well, take a look at the output of lag().
Try it with as.ts(y[, 5]) replacing y[, 5].
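
A minimal sketch of why the original expression returns zeros, assuming base
R's stats::lag (and not a lag() supplied by another package). On a plain
vector, lag() only shifts the time attribute and leaves the values where they
are, so every value ends up divided by itself. The three closing prices are
taken from the rows shown above; diff(log(x)) is a swapped-in alternative for
plain vectors, not the as.ts() route suggested here.

```r
# Closing prices from the three rows shown in the question
close_px <- c(30.35, 30.24, 30.39)

# stats::lag() on a plain vector: same values, only a 'tsp' attribute changes
lagged <- stats::lag(close_px, 1)
same_values <- identical(as.numeric(lagged), close_px)

# Hence the ratio is 1 everywhere and its log is 0
y1 <- as.numeric(100 * log(close_px / lagged))

# Alternative for a plain vector: difference the logs directly
log_diff <- 100 * diff(log(close_px))

print(y1)
print(log_diff)
```

Note that diff() returns one value fewer than the input, so a leading NA has
to be added if the result is to be stored as a column of the data frame.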

Peter Ehlers



Re: [R] The Percentile of a User-Defined pdf

2011-01-17 Thread Nissim Kaufmann
I got it to work:
# To get a percentile of a single-variable function:
# Step 1: Integrate over the domain to get the normalization constant:
Z <- integrate(function(x) sqrt(1+x^-1), 1, 2)$value
Z
# Step 2: Find the .975 percentile
x975 <- uniroot(function(t) integrate(function(x) sqrt(1+x^-1), 1,
   t)$value/Z - .975,
   lower=1, upper=2, tol=5e-4)$root
x975

# To get a percentile of a marginal of a bivariate function of (x,y):
# Define and compute the marginal x distribution:
fx <- function(x) {
  sapply(x, function(x)
    integrate(function(y) sqrt(sin(x)+1/y), 0, 10)$value
  )
}
# Then proceed as above in the single-variable case.
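
The two steps can be wrapped into one helper, sketched here under
assumptions: quantile_from_density is a hypothetical name, the density must
be vectorized and positive on a finite interval, and the test density
f(x) = 2x on [0, 1] is chosen only because its quantiles are known in closed
form (the CDF is x^2, so the 0.25 quantile is 0.5).

```r
# Hypothetical helper: quantile p of an unnormalized density f on [lower, upper]
quantile_from_density <- function(f, lower, upper, p, tol = 1e-8) {
  # Step 1: normalization constant
  Z <- integrate(f, lower, upper)$value
  # Step 2: root of CDF(t) - p
  cdf_minus_p <- function(t) {
    if (t <= lower) return(-p)   # CDF is 0 at the left end point
    integrate(f, lower, t)$value / Z - p
  }
  uniroot(cdf_minus_p, lower = lower, upper = upper, tol = tol)$root
}

# Check against a case with a known answer
q25 <- quantile_from_density(function(x) 2 * x, 0, 1, p = 0.25)
print(q25)

# The single-variable example from this message
x975 <- quantile_from_density(function(x) sqrt(1 + x^-1), 1, 2, p = 0.975)
print(x975)
```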


Thank you Dieter and David.

Nissim Kaufmann
Dept. of Mathematics and Statistics
University at Albany

In reply to
http://www.mail-archive.com/r-help@r-project.org/msg121420.html



[R] matrix manipulations

2011-01-17 Thread Monica Pisica

Hi,

I am having some difficulties with matrix operations. It is a little hard to 
explain, so please bear with me. I have a very large data set, large enough 
that it needs to be split into parts in order to deal with it. I can work on 
these parts, but the problem lies in adding the parts together for the final 
answer. 

That being said, let's say that I split the data into 2 parts, 1 and 2. Each 
part has data belonging to 6 different categories, and each category has 2 
different classes, these classes being the same for each category. The classes 
are called "land" and "water" and the categories are labeled "cat1" to "cat6". 
I am using the function table() to tabulate each class for each category, 
but since I split the data into 2 parts, one part has only some of the 6 
categories, and the other part has some others of the 6 categories (and not 
necessarily exclusively).

So let's build some results as they look after I used the table function.

m1 <- matrix(c(32, 35, 36, 12, 15, 16), nrow = 2, ncol = 3, byrow = TRUE, 
dimnames = list(c("land", "water"), c("cat2", "cat5", "cat6")))

> m1
      cat2 cat5 cat6
land    32   35   36
water   12   15   16

m2 <- matrix(c(45, 46, 47, 48, 21, 22, 23, 24), nrow = 2, ncol = 4, byrow = 
TRUE, dimnames = list(c("land", "water"), c("cat1", "cat2", "cat3", "cat4")))

> m2
      cat1 cat2 cat3 cat4
land    45   46   47   48
water   21   22   23   24

So my desired end result should be a matrix (or a data frame) that has 6 
columns called cat1 to cat6 and 2 rows labeled land and water, and for the 
category that appears in both m1 and m2 the end result will be a sum.

The result will be m3:

      cat1 cat2 cat3 cat4 cat5 cat6
land    45   78   47   48   35   36
water   21   34   23   24   15   16

To do this I thought of making an empty matrix for each of m1 and m2 (called 
m01 and m02 respectively) with 6 columns and 2 rows, and writing a long if-else 
statement in which I match the name of the first column in m1 with the name of 
the first column in m01 and, if they match, get the data from m1; if not, leave 
it 0, and so on. Same thing for m2 and m02. This is long and extremely clunky, 
but afterwards I can add m01 and m02 and get my desired result m3. Is there any 
way I can do this more elegantly? My real data is split into 4 parts, but the 
problem is the same.

Thanks for all your input, and sorry for this long email, but I didn't know 
how else I could explain what I wanted to do.
 
Monica


Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread Pete Brecknock

I believe you want to select a subset of rows and subset of columns of your
original matrix m.

If you had wanted only the first row of m, you could have used m[1,] 

Alternatively, if you had wanted only the second column of m then you could
have used m[,2] 

m[1,2] would give you the element at row 1, column 2.

You are requesting rows 2,7, and 8 and columns 1,4,6 and 7.

The syntax is m[required rows, required columns]

c() allows you to specify multiple rows/columns at the same time.
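
A small sketch of that syntax. The 10 x 10 matrix below is made up (filled
with 1 to 100 column by column), since the values of the original m are not
reproduced here; only the row and column numbers come from the request above.

```r
# Hypothetical 10 x 10 matrix; element [i, j] equals i + 10 * (j - 1)
m <- matrix(1:100, nrow = 10, ncol = 10)

# Index vectors for the wanted rows and columns
wanted_rows <- c(2, 7, 8)
wanted_cols <- c(1, 4, 6, 7)

# m[required rows, required columns]
m_sub <- m[wanted_rows, wanted_cols]
print(m_sub)

# Single row, single column, single element
first_row  <- m[1, ]
second_col <- m[, 2]
one_value  <- m[1, 2]
```

The index vectors can also be stored under a name first, as done here, which
is one way of "selecting with a vector".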

HTH

Pete


Re: [R] matrix manipulations

2011-01-17 Thread Phil Spector

Monica -
   Perhaps this small example can demonstrate how factors can
solve your problem:


d1 = data.frame(cat=sample(c('cat2','cat5','cat6'),100,replace=TRUE),
                group=sample(c('land','water'),100,replace=TRUE))
d2 = data.frame(cat=sample(c('cat1','cat3','cat4'),100,replace=TRUE),
                group=sample(c('land','water'),100,replace=TRUE))
d1$cat = factor(d1$cat,levels=c('cat1','cat2','cat3','cat4','cat5','cat6'))
d2$cat = factor(d2$cat,levels=c('cat1','cat2','cat3','cat4','cat5','cat6'))
table(d1$group,d1$cat) + table(d2$group,d2$cat)

        cat1 cat2 cat3 cat4 cat5 cat6
  land    14   17   18   22   19   23
  water   19   15   16   11   10   16

This works because when you include all possible levels in a factor, R will 
automatically put zeroes in the right places when you use table():

table(d1$group,d1$cat)

        cat1 cat2 cat3 cat4 cat5 cat6
  land     0   17    0    0   19   23
  water    0   15    0    0   10   16

table(d2$group,d2$cat)

        cat1 cat2 cat3 cat4 cat5 cat6
  land    14    0   18   22    0    0
  water   19    0   16   11    0    0

Hope this helps.
- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu
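
If only the already-tabulated matrices are at hand, the same zero-filling idea
can be applied to them directly. This is a sketch under assumptions: pad is a
hypothetical helper, and the full list of category names is taken to be known
in advance. The matrices m1 and m2 are the ones from the question.

```r
m1 <- matrix(c(32, 35, 36, 12, 15, 16), nrow = 2, byrow = TRUE,
             dimnames = list(c("land", "water"), c("cat2", "cat5", "cat6")))
m2 <- matrix(c(45, 46, 47, 48, 21, 22, 23, 24), nrow = 2, byrow = TRUE,
             dimnames = list(c("land", "water"),
                             c("cat1", "cat2", "cat3", "cat4")))

all_cats <- paste0("cat", 1:6)

# Hypothetical helper: copy m into a zero matrix that has every category
pad <- function(m, cols) {
  out <- matrix(0, nrow = nrow(m), ncol = length(cols),
                dimnames = list(rownames(m), cols))
  out[, colnames(m)] <- m   # fill in the columns that m actually has
  out
}

# Pad every part, then add them up; works for any number of parts
m3 <- Reduce(`+`, lapply(list(m1, m2), pad, cols = all_cats))
print(m3)
```

Reduce() makes the extension to four parts a matter of putting four matrices
in the list.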





Re: [R] matrix manipulations

2011-01-17 Thread Henrique Dallazuanna
Try this:

library(reshape)
xtabs(rowSums(cbind(value.x, value.y), na.rm = TRUE) ~ X1 + X2,
merge(melt(m1), melt(m2), by = c('X1', 'X2'), all = TRUE), exclude = FALSE)



-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



[R] Replacing rows in a data frame

2011-01-17 Thread Brant Inman
R-helpers,

Below is a simple example of some output that I am getting while trying to work 
with a data frame in R 2.12.1 for Mac.

-
> testdat <- data.frame(matrix(ncol=10, nrow=10))
> colnames(testdat) <- c('a','b','c','d','e','f','g','h','i','j')
> testdat[seq(1,10,3),] <- c(1,0,0,0,0,0,0,0,0,0)

 testdat
a  b  c  d  e  f  g  h  i  j
1   1  0  0  0  0  1  0  0  0  0
2  NA NA NA NA NA NA NA NA NA NA
3  NA NA NA NA NA NA NA NA NA NA
4   0  0  0  0  0  0  0  0  0  0
5  NA NA NA NA NA NA NA NA NA NA
6  NA NA NA NA NA NA NA NA NA NA
7   0  0  1  0  0  0  0  1  0  0
8  NA NA NA NA NA NA NA NA NA NA
9  NA NA NA NA NA NA NA NA NA NA
10  0  0  0  0  0  0  0  0  0  0

-

The output is not what I would have anticipated.  Since seq(1,10,3) gives the 
vector [1 4 7 10], I expected rows 1, 4, 7 and 10 of the data.frame testdat 
to contain the same data: a 1 for variable 'a' and zeros for all other 
variables.  I guess I assumed the assignment would proceed by rows, but it 
appears from the resulting output to be proceeding by columns. Can someone 
point out how I can modify this simple code so that the assignments proceed by 
rows?

Thank you.

Brant



Re: [R] Replacing rows in a data frame

2011-01-17 Thread Henrique Dallazuanna
Try this:

 testdat[seq(1,10,3),] <- t(replicate(4, c(1,0,0,0,0,0,0,0,0,0)))
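
Some background, with an equivalent sketch. A plain vector on the right-hand
side is recycled down the columns of the selected block, which explains the
stray 1s in the original output. Supplying a matrix that already has one copy
of the vector per row avoids this. The variant below builds that matrix with
byrow = TRUE instead of t(replicate(...)); the names rows and new_row are
illustrative.

```r
testdat <- data.frame(matrix(ncol = 10, nrow = 10))
colnames(testdat) <- c('a','b','c','d','e','f','g','h','i','j')

rows    <- seq(1, 10, 3)                     # 1 4 7 10
new_row <- c(1, 0, 0, 0, 0, 0, 0, 0, 0, 0)

# One copy of new_row per target row, laid out row by row
value <- matrix(new_row, nrow = length(rows), ncol = length(new_row),
                byrow = TRUE)
testdat[rows, ] <- value

print(testdat)
```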


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] R scheduling request

2011-01-17 Thread Greg Snow
You could write a batch file and then have your OS scheduler run R on the
batch file whenever you want (see Rscript for one approach to running the
batch).

Inside R you can use Sys.sleep to wait a certain amount of time before
running the next command.  If you load the tcltk2 package then you can use the
tclTaskSchedule function.
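
A minimal sketch of the Sys.sleep approach. The function name run_repeatedly
and the stand-in task are hypothetical; in practice the task would fetch the
data from the server and redraw the chart, and the waiting time would be
minutes or hours rather than a fraction of a second.

```r
# Hypothetical helper: run 'task' n_runs times, pausing between runs
run_repeatedly <- function(task, n_runs, wait_seconds) {
  results <- vector("list", n_runs)
  for (i in seq_len(n_runs)) {
    results[[i]] <- task()
    if (i < n_runs) Sys.sleep(wait_seconds)   # no pause after the last run
  }
  results
}

# Stand-in task: just record the time of each run
stamps <- run_repeatedly(function() Sys.time(), n_runs = 3,
                         wait_seconds = 0.1)
print(stamps)
```

The drawback of this design is that the R session stays busy while it waits,
which is why an OS-level scheduler calling Rscript is often the more
convenient choice for long intervals.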

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Alessandro Oggioni
 Sent: Saturday, January 15, 2011 6:19 AM
 To: r-help
 Subject: [R] R scheduling request
 
 Dear all,
 I have used R.rps to produce a Google API chart (googleVis) with a
 data request to another server.
 But I don't understand how to schedule a data request to the server
 and then update the charts.
 Thanks in advance.
 Alessandro Oggioni
 



[R] How to still processing despite bug errors?

2011-01-17 Thread Altay
Hi, everybody.

I am processing EEG data from 1000 patients. I have specific code to
perform the spectral analysis and a loop to analyse all subjects.
Each subject's data are in a separate folder (P1, P2, P3...).

My question is: in some cases, errors can appear for one subject. I want
to know whether it is possible to skip to the next subject and run the same
code, displaying an error like:

Working on P1
Error X in P1

Working on P2...

The idea is to let the computer process all the subjects continuously and
at the end see only the problems for the subjects that R could not analyse.
Each subject takes 20 minutes to analyse, and I don't want to spend days in
front of the PC waiting for the next error in order to restart the code with
the next subject.

Any ideas?
Thanks in Advance.

Altay Lino de Souza.



Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread David Winsemius


On Jan 17, 2011, at 11:16 AM, ADias wrote:



Hi,

yes it works perfectly.

I have another question:

Is there way of selecting with a vector the values I wish to take  
out from a

matrix.

Example:

I have this matrix and I want to take out the numbers in bold and  
get the

second matrix below


This is a plain text mailing list (despite what the Nabble mirror may  
(mis-)lead you into believing) ... no bold.


--
david.



m

 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]   17165   192   191   15 8
[2,]77   2032   1699   1913
[3,]243   11   18   11   14   133 1
[4,]3757   17   18   106515
[5,]8   20   13   108   12   20   19116
[6,]9   141   12   12   12   17   18   1017
[7,]3   10   112   129   186   19 9
[8,]   132   17   16   1889   14916
[9,]94   1141   1797   2012
[10,]91488   19   198   1718

     [,1] [,2] [,3] [,4]
[1,]    7    3   16    9
[2,]    3    2    9   18
[3,]   13   16    8    9

thanks
AD

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221230.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] Using summaryBy with weighted data

2011-01-17 Thread Dennis Murphy
Hi:

Does this do what you need?

wstats <- function(d) {
 require(Hmisc)
 N <- length(d$response[!is.na(d$response)])
 c(WM = wtd.mean(d$response, d$weights),
   WSE = sqrt(wtd.var(d$response, d$weights)),
   N = N)
}
library(plyr)
ddply(mydata, .(group), wstats)
  group            WM       WSE  N
1     1  0.1302255752 1.1911298 20
2     2 -0.2814664362 0.8582928 20
3     3 -0.3640550516 1.2618343 20
4     4  0.0002852392 1.1463205 20
5     5 -0.0070283053 1.2315683 20

The trick to writing this function for input into plyr is that the argument
is a data frame. When called in ddply(), the function wstats() will be
applied to each sub-frame corresponding to the grouping factor(s). Inside
it, the variables of interest are extracted relative to the input data frame
and the three quantities are computed. I used wtd.mean() and wtd.var() from
Hmisc, as both will remove NAs by default. In the ddply call, the function
name is simply cited since a sub-data frame is the sole argument of the
function.

I couldn't figure out how to get doBy to get this to work, as it seems best
suited to functions of one argument (a single response), but here's an
alternative using the data.table package:

library(data.table)
# Assumes Hmisc is already loaded...
myDT <- data.table(mydata, key = 'group')
myDT[, list(N = length(response[!is.na(response)]),
            wtdMean = wtd.mean(response, weights),
            wtdSE = sqrt(wtd.var(response, weights))), by = 'group']
     group  N       wtdMean     wtdSE
[1,]     1 20  0.1302255752 1.1911298
[2,]     2 20 -0.2814664362 0.8582928
[3,]     3 20 -0.3640550516 1.2618343
[4,]     4 20  0.0002852392 1.1463205
[5,]     5 20 -0.0070283053 1.2315683

data.table uses a different model of data organization from data frames. A
simplistic description is that you can think of a data.table as analogous
to a table in a DBMS. Notice that the 'function call' is indexed inside the
data table: the first 'subscript', i, selects rows (analogous to a 'where'
clause in SQL) and is left empty here; the second 'subscript', j, holds the
expressions to compute (analogous to a 'select' statement); and the third
argument is the by group(s), or sub-data tables, to which (in this case) the
j expressions apply.

For functions that take multiple arguments and that are meant to be applied
in a groupwise fashion, I find plyr and data.table to be very good options.
There are also base package alternatives (e.g., some combination of
lapply(), mapply() and do.call()) and several other packages, but plyr and
data.table are generally pretty good at handling most of the niggling
details. Having said that, both have learning curves - data.table, in
particular, will be much easier to pick up if you have some background in
SQL, since its syntax borrows core SQL principles.
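
For comparison, here is a minimal sketch of the base-package route mentioned above, using split(), lapply() and do.call(). It only computes the weighted mean and N (weighted.mean() is in the stats package, so Hmisc is not needed); the made-up data mirror the example elsewhere in this thread and the object names are just illustrative.

```r
set.seed(1)
# Made-up data in the same shape as the thread's example
mydata <- data.frame(response = rnorm(100),
                     group = rep(1:5, each = 20),
                     weights = runif(100, 0, 1))

# Split into one sub-data frame per group, summarise each, then bind the rows
pieces <- split(mydata, mydata$group)
stats <- lapply(pieces, function(d) {
  c(WM = weighted.mean(d$response, d$weights, na.rm = TRUE),
    N = sum(!is.na(d$response)))
})
out <- data.frame(group = names(stats), do.call(rbind, stats))
out
```

This is more typing than the ddply() or data.table versions, which is largely the point of using those packages.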

data.table has a vignette and FAQ, along with an independent help list - for
details, see its page on R-forge:
http://r-forge.r-project.org/projects/datatable/
For plyr's documentation, see
http://had.co.nz/plyr/
A link to its mailing list is found on that page as well.

HTH,
Dennis

On Mon, Jan 17, 2011 at 10:24 AM, Solomon Messing solomon.mess...@gmail.com
 wrote:

 Thanks Josh.  I built on your example and ended up with the code below--if
 you or anyone sees any issues please let me know.  It would be great if
 there were a slicker way to get these kinds of summary stats in R, but this
 gets the job done.

 # takes data frame z with weights w and data x, returns weighted mean,
 # weighted SE, and N
 msenw = function(z){
   N = length(na.omit(z)$response)
   i = which(!is.na(z$response))
   return(
     c( W.M = weighted.mean(z$response, z$weights, na.rm=T),
        W.SE = sqrt(wtd.var(z$response, weights = z$weights))/sqrt(sum(z$weights[i])),
        N=N ) )
 }

 library(doBy)
 library(Hmisc)
 ## make up some data (easier)
 mydata <- data.frame(response = rnorm(100),
   group = rep(1:5, each = 20), weights = runif(100, 0, 1))

 xy <- by(mydata, mydata$group, msenw)
 data.frame( group = names(c(xy)), do.call(rbind, xy) )

 ## can be extended to other data using:
 xy <- by(data.frame(response = mydata$response, weights = mydata$weights),
   mydata$group, msenw)


 Solomon Messing
 www.stanford.edu/~messing http://www.stanford.edu/%7Emessing



 On Jan 16, 2011, at 11:16 PM, Joshua Wiley wrote:

  Dear Solomon,
 
  On Sun, Jan 16, 2011 at 10:27 PM, Solomon Messing
  solomon.mess...@gmail.com wrote:
  Dear Soren and R users:
 
  I am trying to use the summaryBy function with weights.  Is this
 possible?  An example that illustrates what I am trying to do follows:
 
  library(doBy)
  ## make up some data
  response = rnorm(100)
  group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20))
  weights = runif(100, 0, 1)
  mydata = data.frame(response,group,weights)
 
  ## run summaryBy without weights:

[R] Dealing with Latex output in Openoffice

2011-01-17 Thread Rob James
I am making considerable use of Harrell's rms package, but I do not use 
Latex for writing.  (I have enough trouble convincing my co-authors to 
use Openoffice!).   rms makes copious use of Latex output for various 
mixed graphical and text outputs, amongst other things.


Does someone have a convenient strategy for dealing with Latex output 
and openoffice, either within or outside of a OdfSweave environment?


Thanks,

Rob



Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread André Dias
OK!!

So, the idea is to get the 2nd matrix from the 1st matrix by using a
vector.

Is it possible?

In the example I have a 10x10 matrix, and from it I get a second 3x4
matrix selected by a vector.

thanks
ADias


2011/1/17 David Winsemius dwinsem...@comcast.net


 On Jan 17, 2011, at 11:16 AM, ADias wrote:


 Hi,

 yes it works perfectly.

 I have another question:

 Is there way of selecting with a vector the values I wish to take out from
 a
 matrix.

 Example:

 I have this matrix and I want to take out the numbers in bold and get the
 second matrix below


 This is a plain text mailing list (despite what the Nabble mirror may
 (mis-)lead you into believing) ... no bold.

 --
 david.


  m

 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]   17165   192   191   15 8
 [2,]77   2032   1699   1913
 [3,]243   11   18   11   14   133 1
 [4,]3757   17   18   106515
 [5,]8   20   13   108   12   20   19116
 [6,]9   141   12   12   12   17   18   1017
 [7,]3   10   112   129   186   19 9
 [8,]   132   17   16   1889   14916
 [9,]94   1141   1797   2012
 [10,]91488   19   198   1718

      [,1] [,2] [,3] [,4]
 [1,]    7    3   16    9
 [2,]    3    2    9   18
 [3,]   13   16    8    9

 thanks
 AD

 --
 View this message in context:
 http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221230.html

 Sent from the R help mailing list archive at Nabble.com.



 David Winsemius, MD
 West Hartford, CT





Re: [R] Using summaryBy with weighted data

2011-01-17 Thread Søren Højsgaard
It is currently not possible to pass weights in summaryBy.
Regards
Søren 


From: Joshua Wiley [jwiley.ps...@gmail.com]
Sent: 17 January 2011 08:16
To: Solomon Messing
Cc: r-help@r-project.org; Søren Højsgaard
Subject: Re: [R] Using summaryBy with weighted data

Dear Solomon,

On Sun, Jan 16, 2011 at 10:27 PM, Solomon Messing
solomon.mess...@gmail.com wrote:
 Dear Soren and R users:

 I am trying to use the summaryBy function with weights.  Is this possible?  
 An example that illustrates what I am trying to do follows:

 library(doBy)
 ## make up some data
 response = rnorm(100)
 group = c(rep(1,20), rep(2,20), rep(3,20), rep(4,20), rep(5,20))
 weights = runif(100, 0, 1)
 mydata = data.frame(response,group,weights)

 ## run summaryBy without weights:
 summaryBy(response~group, data = mydata, FUN = mean)

 ## attempt to run summaryBy with weights, throws error
 summaryBy(x~group, data = mydata, FUN = weighted.mean, w=weights )

 ## throws the error:
 # Error in tapply(lh.data[, lh.var[vv]], rh.string.factor, function(x) { :
 #   arguments must have same length

 My guess is that summaryBy is not giving weighted.mean() each group of 
 weights, but instead is passing all of the weights in the data set each time 
 it calls weighted.mean().

Yes, of course.  It has no way of knowing that the weights should also
be broken down by group; they are not in the formula.

  Do you know if there is some way to get summaryBy to pass weights to 
 weighted.mean() only for each group?

Ideally there would be a way to pass more than one variable to a
function (e.g., response and weights) or just an entire object
(mydata) broken down by group.  Then you would just make a wrapper
function to pass the right values to the x and w arguments of
weighted.mean.  Instead here is a somewhat hacked version:

library(doBy)
## make up some data (easier)
mydata <- data.frame(response = rnorm(100),
  group = rep(1:5, each = 20), weights = runif(100, 0, 1))

## manually compute weighted mean
tmp <- summaryBy(response*weights ~ group, data = mydata, FUN = sum)
tmp[,2] <- tmp[,2]/with(mydata, tapply(weights, group, sum))
tmp ## weighted means

## here's the 'problem', if you will, even with  +, they are passed
## one at a time
summaryBy(response + weights ~ group, data = mydata, FUN = str)
summaryBy(mydata ~ group, data = mydata, FUN = str)

## here is an option using by():
xy <- by(mydata, mydata$group, function(z) weighted.mean(z$response, z$weights))
xy
## if you don't like the formatting
data.frame(group = names(c(xy)), weighted.mean = c(xy))

HTH,

Josh


 I suspect this functionality would be a tremendous benefit to R users who 
 regularly work with weighted data, such as myself.

 Thanks,

 Solomon Messing
 www.stanford.edu/~messing

 PS I know this basic example can be done using lapply(split(...)) approach 
 referenced here:

 http://www.mail-archive.com/r-help@stat.math.ethz.ch/msg12349.html

 but for more complex tasks the lapply approach will mean writing a lot of 
 extra code to run everything and then to get things formatted as nicely as 
 summaryBy() was designed to do.






--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


Re: [R] cannot allocate vector of size ... in RHLE5 PAE kernel

2011-01-17 Thread Hugo Mildenberger
Mauricio,

I tried your matrix allocation on Gentoo-hardened 32 and 
64 bit systems. Both work ok, using R-2.11.1 and R-2.12.2 respectively,
and both use a recent 2.6.36 kernel revision.

This is from the 32 bit system with 512 MB physical memory:

> system("free")
             total       used       free     shared    buffers     cached
Mem:        469356      61884     407472          0       1368      21592
-/+ buffers/cache:      38924     430432
Swap:      1927796      36096    1891700

> gc()
         used (Mb) gc trigger (Mb) max used (Mb)
Ncells 120116  3.3     350000  9.4   350000  9.4
Vcells  78413  0.6     786432  6.0   391299  3.0

> bs <- matrix(NA, nrow=6940,ncol=9000)

> gc()
           used  (Mb) gc trigger  (Mb) max used  (Mb)
Ncells   120123   3.3     350000   9.4   350000   9.4
Vcells 31308414 238.9   34854943 266.0 31308428 238.9

> system("free")
             total       used       free     shared    buffers     cached
Mem:        469356     307528     161828          0       1404      22508
-/+ buffers/cache:     283616     185740
Swap:      1927796      36084    1891712


MZ I tried to increase the memory limit available for R by using:
MZ $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k --max-nsize=5000M

Hmm, I wonder whether specifying 5000M is a good idea in a 32-bit environment.
Depending on R's internal implementation, that value could overflow and
tacitly wrap around in a 32-bit integer (5000M > 2^32 - 1). You may try
specifying 1000M instead. But I think it is more probable that the system or
VM configuration has set up a memory usage limit per user or per process. How
to view or change this on Red Hat I don't know. But you may try to compile a
small C program using malloc() and see what happens if you request, say,
1 gigabyte:

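#include <stdlib.h>
#include <stdio.h>

int main(void) {
 /* request roughly 1 gigabyte */
 const size_t size = 1000000000LU;
 void* p = malloc(size);
 if ( p ) {
  fprintf(stderr,"successfully allocated %lu bytes\n",(unsigned long)size);
 }else {
  /* %m is a glibc extension that prints strerror(errno) */
  fprintf(stderr,"allocation of %lu bytes failed:%m\n",(unsigned long)size);
 }
 free(p);
 return 0;
}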

put this into a file named, say, tmalloc.c and compile it using
   
  gcc tmalloc.c -o tmalloc

Hugo
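
P.S. The size in the error message can be checked by hand. A small sketch, assuming the matrix is filled with logical NA (which is what matrix(NA, ...) produces) and that R stores a logical at 4 bytes per element; the variable names are mine:

```r
# Size of a 6940 x 9000 logical matrix, ignoring the small object header
n_cells <- 6940 * 9000
bytes <- n_cells * 4      # logical values take 4 bytes each
mb <- bytes / 1024^2      # R reports sizes in units of 1024^2 bytes
round(mb, 1)
# [1] 238.3
```

So the request itself is modest; the failure is about finding one contiguous block of address space of that size within the 32-bit process, not about the total RAM in the machine.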







On Monday 17 January 2011 16:42:43 Mauricio Zambrano wrote:
 Following the advice of a colleague, I put the gc() and gcinfo(TRUE)
 commands just before the line where I got the problem, and their output
 was:
 
           used (Mb) gc trigger  (Mb)  max used   (Mb)
 Ncells  471485 12.6    1704095  45.6   7920371  211.5
 Vcells 6408885 48.9  113919753 869.2 347651599 2652.4
 
 Garbage collection 538 = 323+101+114 (level 2) ...
 13.0 Mbytes of cons cells used (29%)
 49.0 Mbytes of vectors used (7%)
 
 Error: cannot allocate vector of size 238.1 Mb
 
 
 If I understood correctly, I should have enough memory for allocating
 the new matrix (Q.obs <- matrix(NA, nrow=6940, ncol=9000))
 
 Thanks in advance for any help,
 
 Mauricio
 
 
  MZ == Mauricio Zambrano hzambran.newsgro...@gmail.com
  on Mon, 17 Jan 2011 11:46:44 +0100 writes:
 
 MZ Dear R community,
 MZ I'm running R 32 bits in a 64-bits machine (with 16Gb of Ram) using a
 MZ PAE kernel, as you can see here:
 
 MZ $ uname -a
 MZ Linux mymachine 2.6.18-238.el5PAE #1 SMP Sun Dec 19 14:42:44 EST 2010
 MZ i686 i686 i386 GNU/Linux
 
 
 MZ When I try to create a large matrix ( Q.obs <- matrix(NA, nrow=6940,
 MZ ncol=9000) ), I got the following error:
 
 
  Error: cannot allocate vector of size 238.3 Mb
 
 
 MZ However, the amount of free memory in my machine seems to be much
 MZ larger than this:
 
 MZ system("free")
 MZ              total       used       free     shared    buffers     cached
 MZ Mem:      12466236    6354116    6112120          0      67596    2107556
 MZ -/+ buffers/cache:    4178964    8287272
 MZ Swap:     12582904          0   12582904
 
 
 MZ I tried to increase the memory limit available for R by using:
 
 MZ $ R --min-vsize=10M --max-vsize=5000M --min-nsize=500k 
  --max-nsize=5000M
 
 
 MZ but it didn't work.
 
 
 MZ Any hint about how can I get R using all the memory available in the 
  machine ?
 
  Install a 64-bit version of Linux, i.e., ubuntu in your case
  and work from there.
  I don't think there's a way around that.
 
  Martin
 
 




[R] how to cut a multidimensional array along a chosen dimension and store each piece into a list

2011-01-17 Thread Sean Zhang
Dear R-Helpers,

I wonder whether there is a function which cuts a multidimensional array
along a chosen dimension and then stores each piece (still an array, with one
dimension less) in a list.
For example,

arr <- array(seq(1*2*3*4),dim=c(1,2,3,4))  # I made a point of setting the
# length of the first dimension to 1 to test whether I need to worry about
# the drop=F option.

brkArrIntoListAlong <- function(arr,alongWhichDim){

return(outlist)
}

I have tried splitter_a in the plyr package but did not get what I want.

library(plyr)
plyr:::splitter_a(arr,3)

I understand that I can write a for loop to make it happen but I am
searching for a better solution.
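
One possible way to fill in the function body, offered as a sketch rather than the definitive answer: build the index for `[` as a list, extract with drop = FALSE, and then remove only the chosen dimension so that other length-1 dimensions survive. The function and argument names are taken from the stub above; everything else is my own assumption about what is wanted.

```r
brkArrIntoListAlong <- function(arr, alongWhichDim) {
  d <- dim(arr)
  lapply(seq_len(d[alongWhichDim]), function(i) {
    # TRUE selects everything along a dimension; i picks one slice
    idx <- rep(list(TRUE), length(d))
    idx[[alongWhichDim]] <- i
    piece <- do.call(`[`, c(list(arr), idx, list(drop = FALSE)))
    # Drop only the dimension we cut along, keeping other length-1 dims
    array(piece, dim = d[-alongWhichDim])
  })
}

arr <- array(seq(1*2*3*4), dim = c(1, 2, 3, 4))
pieces <- brkArrIntoListAlong(arr, 3)
length(pieces)       # one piece per level of dimension 3
dim(pieces[[1]])     # the first dimension of length 1 is kept
```

Dimnames are not carried over in this sketch; they would need to be passed to array() if they matter.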

Thanks in advance.

-Sean



[R] Manipulation

2011-01-17 Thread michael.hopgood

Dear R family,
I am a relative newbie and have been dabbling with R for a little while. 
Simple things really, but my employers are beginning to see the benefits of
using R instead of excel. We have a remote monitoring station measuring
groundwater levels.  We download the data as a .csv file and up until now,
we have been using excel to analyse the data.  It’s been a hassle trying to
wrestle with that damn program as my boss wants to do things that excel was
never meant to do,  so I’ve convinced my boss to give R a chance.  It’s been
a steep learning curve, but I’m fairly confident I can reduce the amount of
labour involved in producing and improving the graphs we show our clients.

The groundwater levels are measured by pressure sensors lowered into the
monitoring wells.   After a certain time, the sensors were lowered further
into the well, thus creating a disparity in the measurements.  

The data frame I import into R looks something like this:
Date      Waterhead (mm)  Parameter 1  Parameter 2, etc.
10-01-01  100
10-01-02  105
10-01-03  101
10-01-04   99
10-01-05   85
10-01-06  200  # <- Sensor lowered
10-01-07  199
10-01-08  195
10-01-09  185
10-01-10  170

For example, on 10-01-06 the sensor was lowered by 115 mm.
When I download the csv file, I download the data from the beginning of the
measurement period. I then need to adjust the height by 115 mm to account
for the lowering of the sensor.  My question to you is: how do I do that
in R?
I am after a formula or a manipulation that selects the first five
measurements of a column in the data frame and adds a fixed amount.  This
has to be applied every time I download the csv file and import it into
R so that when I display my data, it is based on the following data frame:

Date      Waterhead (mm)
10-01-01  215
10-01-02  220
10-01-03  216
10-01-04  214
10-01-05  200
10-01-06  200
10-01-07  199
10-01-08  195
10-01-09  185
10-01-10  170

In short, I want to select a fixed number of rows from my data frame, add a
constant to the rows of one of the columns, and insert the new values into
their respective rows without affecting the subsequent rows.  I hope I have
produced a reproducible example, I have been searching high and low for a
solution, but have come up against a brick wall. I feel I have read
something that tackles this some time in the past, but can’t find it again.
Thanks in advance!

Sincerely,
Michael Hopgood
MRM Konsult AB
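
A minimal sketch of the row-index approach described above, using the numbers from the example. The data frame name wells and the column name Waterhead are assumptions for illustration; a real import via read.csv() would give whatever names the file has.

```r
# Example data as posted (column names are illustrative)
wells <- data.frame(
  Date = sprintf("10-01-%02d", 1:10),
  Waterhead = c(100, 105, 101, 99, 85, 200, 199, 195, 185, 170)
)

offset <- 115   # how far the sensor was lowered, in mm
rows <- 1:5     # measurements taken before the sensor was lowered

# Subset assignment: only the selected rows of the column are changed
wells$Waterhead[rows] <- wells$Waterhead[rows] + offset
wells
```

Because the assignment only touches the indexed elements, the later rows are left exactly as they were.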


-- 
View this message in context: 
http://r.789695.n4.nabble.com/Manipulation-tp3221260p3221260.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread ADias


Pete Brecknock wrote:
 
 try ...
 
 new_m = m[c(2,7,8),c(1,4,6,7)]
 
 HTH
 
 Pete
 

Hi Pete,

I haven't understood what you wanted to say here. Can you explain please?

thanks
ADias
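
For readers following the thread, Pete's line uses two index vectors inside the square brackets: the first picks rows, the second picks columns. A small sketch with a made-up matrix (the random matrix and the names rows and cols are illustrative, not the poster's data):

```r
set.seed(123)
# A made-up 10 x 10 matrix of values between 1 and 20
m <- matrix(sample(1:20, 100, replace = TRUE), nrow = 10)

rows <- c(2, 7, 8)      # which rows to keep
cols <- c(1, 4, 6, 7)   # which columns to keep

# m[rows, cols] returns the sub-matrix where those rows and columns cross
new_m <- m[rows, cols]
dim(new_m)
# [1] 3 4
```

So new_m[1, 1] is m[2, 1], new_m[2, 3] is m[7, 6], and so on.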
-- 
View this message in context: 
http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221252.html
Sent from the R help mailing list archive at Nabble.com.



[R] Difficult with round() function

2011-01-17 Thread Aaron Polhamus
Dear list,

I'm writing a function to re-grid a data set from finer to coarser
resolutions in R as follows (I use this function with sapply/apply):

gridResize <- function(startVec = stop("What's your input vector"),
    to = stop("Missing 'to': How long do you want the final vector to be?")){
  from <- length(startVec)
  shortVec <- numeric()
  tics <- from*to
  for(j in 1:to){
    interval <- ((j/to)*tics - (1/to)*tics + 1):((j/to)*tics)
    benchmarks <- interval/to
    #FIRST RUN ASSUMES FINAL BENCHMARK/TO IS AN INTEGER...
    positions <- which(round(benchmarks) == benchmarks)
    indeces <- benchmarks[positions]
    fracs <- numeric()
    #SINCE MUCH OF THE TIME THIS WILL NOT BE THE CASE, THIS SCRIPT DEALS WITH
    #THE REMAINDER...
    for(i in 1:length(positions)){
      if(i == 1) fracs[i] <- positions[i]/length(benchmarks) else{
        fracs[i] <- (positions[i] - sum(positions[1:(i-1)]))/length(benchmarks)
      }
    }
    #AND UPDATES STARTVEC INDECES AND FRACTION MULTIPLIERS
    if(max(positions) != length(benchmarks)) indeces <- c(indeces, max(indeces) + 1)
    if(sum(fracs) != 1) fracs <- c(fracs, 1 - sum(fracs))
    fromVals <- startVec[indeces]
    if(any(is.na(fromVals))){
      NAindex <- which(is.na(fromVals))
      if(sum(fracs[-NAindex]) >= 0.5) shortVec[j] <- sum(fromVals*fracs,
        na.rm=TRUE) else shortVec[j] <- NA
    }else{shortVec[j] <- sum(fromVals*fracs)}
  }
  return(shortVec)
}


For the simple test case test <- gridResize(startVec =
c(2,4,6,8,10,8,6,4,2), to = 7) the function works fine. For larger vectors,
however, it breaks down. E.g.: test <- gridResize(startVec = rnorm(300, 9,
20), to = 200)

This returns the error:

Error in positions[1:(i - 1)] :
  only 0's may be mixed with negative subscripts

and the problem seems to be in the line positions <- which(round(benchmarks)
== benchmarks). In this particular example the code breaks at j = 27.
When I set j = 27 and run the calculation manually I find the following:

> benchmarks[200]
[1] 40
> benchmarks[200] == 40
[1] FALSE
> round(benchmarks[200]) == 40
[1] TRUE

Even though my benchmark calculation seems to be returning clean integers
to serve as inputs for the creation of the 'positions' variable, for
whatever reason R doesn't read it that way. I would be very grateful for any
advice on how I can either alter my approach entirely (I am sure there is a
far more elegant way to regrid data in R) or a simple fix for this rounding
error.
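
The behaviour shown above is the usual floating-point issue (R FAQ 7.31): a value that prints as 40 can differ from 40 in the last bits, so == returns FALSE. A minimal sketch of a tolerance-based test that could replace round(benchmarks) == benchmarks; the helper name is_whole and the default tolerance are my own choices, not something from the original code.

```r
# TRUE where x is within 'tol' of a whole number
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

x <- sqrt(2)^2    # mathematically 2, but not exactly 2 in floating point
x == 2            # FALSE
is_whole(x)       # TRUE

# Used in place of which(round(benchmarks) == benchmarks):
which(is_whole(c(1, 1.5, sqrt(2)^2)))
# [1] 1 3
```

Any value found this way should probably be passed through round() before it is used as an index, since subscripts are truncated rather than rounded.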

Many thanks in advance,
Aaron

-- 
Aaron Polhamus aaronpolha...@gmail.com
Statistical consultant, Revolution Analytics
MSc Applied Statistics, The University of Oxford, 2009
838a NW 52nd St, Seattle, WA 98107
Cell: +1 (206) 380.3948



[R] Extraction and replacement of data in a data frame

2011-01-17 Thread michael.hopgood

Dear R family,
I am a relative newbie and have been dabbling with R for a little while. 
Simple things really, but my employers are beginning to see the benefits of
using R instead of excel. We have a remote monitoring station measuring
groundwater levels.  We download the data as a .csv file and up until now,
we have been using excel to analyse the data.  It’s been a hassle trying to
wrestle with that damn program as my boss wants to do things that excel was
never meant to do,  so I’ve convinced my boss to give R a chance.  It’s been
a steep learning curve, but I’m fairly confident I can reduce the amount of
labour involved in producing and improving the graphs we show our clients.

The groundwater levels are measured by pressure sensors lowered into the
monitoring wells.   After a certain time, the sensors were lowered further
into the well, thus creating a disparity in the measurements.  

The data frame I import into R looks something like this:
Date      Waterhead (mm)
10-01-01  100
10-01-02  105
10-01-03  101
10-01-04   99
10-01-05   85
10-01-06  200
10-01-07  199
10-01-08  195
10-01-09  185
10-01-10  170

For example, on 10-01-06 the sensor was lowered by 115 mm.
When I download the csv file, I download the data from the beginning of the
measurement period. I then need to adjust the height by 115 mm to account
for the lowering of the sensor.  My question to you is: how do I do that
in R?
I am after a formula or a manipulation that selects the first five
measurements and adds a fixed amount.  This has to be applied
every time I download the csv file and import it into R so that when I
display my data, it is based on the following data frame:

Date      Waterhead (mm)
10-01-01  215
10-01-02  220
10-01-03  216
10-01-04  214
10-01-05  200
10-01-06  200
10-01-07  199
10-01-08  195
10-01-09  185
10-01-10  170

In short, I want to select a fixed number of rows of a column from my data
frame, add a constant to these, and insert the new values into their
respective rows without affecting the subsequent rows.  I hope I have
produced a reproducible example.  I have been searching high and low for a
solution, but have come up against a brick wall. I feel I have read
something that tackles this some time in the past, but can’t find it again.
Thanks in advance!
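
Since the file is re-downloaded from the start each time, selecting by date rather than by row number may be more robust. A minimal sketch under stated assumptions: the dates are read as yy-mm-dd (so 10-01-06 is taken to be 6 January 2010), and the names wells, Waterhead and lowered_on are illustrative only.

```r
# Example data as posted, with the dates parsed as yy-mm-dd (an assumption)
wells <- data.frame(
  Date = as.Date(sprintf("10-01-%02d", 1:10), format = "%y-%m-%d"),
  Waterhead = c(100, 105, 101, 99, 85, 200, 199, 195, 185, 170)
)

lowered_on <- as.Date("2010-01-06")   # day the sensor was lowered
offset <- 115                         # mm

# Logical index: TRUE for every measurement taken before the sensor moved
before <- wells$Date < lowered_on
wells$Waterhead[before] <- wells$Waterhead[before] + offset
wells
```

With this form the same script keeps working as new rows are appended to the file, because the correction is tied to the date and not to the row count.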

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Extraction-and-replacement-of-data-in-a-data-frame-tp3221261p3221261.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to still processing despite bug errors?

2011-01-17 Thread Hugo Mildenberger

Altay, simply run your tests under control of an exception handler:

  help(try)
  help(tryCatch)
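
A minimal sketch of how that could look for a loop over subjects. The function analyse_subject is a hypothetical stand-in for the real spectral analysis (here it fails on purpose for P2); the other names are illustrative as well.

```r
# Hypothetical stand-in for the real per-subject analysis
analyse_subject <- function(id) {
  message("Working on ", id)
  if (id == "P2") stop("bad recording")
  paste("spectrum for", id)
}

subjects <- c("P1", "P2", "P3")

# Run every subject; on error, keep the error object instead of stopping
outcome <- lapply(subjects, function(id) {
  tryCatch(analyse_subject(id), error = function(e) e)
})
names(outcome) <- subjects

# Afterwards, list which subjects failed and why
failed <- vapply(outcome, inherits, logical(1), what = "error")
vapply(outcome[failed], conditionMessage, character(1))
```

The loop runs to the end regardless of failures, and the final line gives one error message per problem subject.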



On Monday 17 January 2011 22:05:07 Altay wrote:
 Hi, everybody.
 
 I am processing EEG data from 1000 patients. I have specific code to
 perform the spectral analysis and a loop to analyse all subjects.
 Each subject's data are in a separate folder (P1, P2, P3...).
 
 My question is: in some cases, errors can appear for one subject. I want
 to know whether it is possible to skip to the next subject and run the same
 code, displaying an error like:
 
 Working on P1
 Error X in P1
 
 Working on P2...
 
 The idea is to let the computer process all the subjects continuously and
 at the end see only the problems for the subjects that R could not analyse.
 Each subject takes 20 minutes to analyse, and I don't want to spend days in
 front of the PC waiting for the next error in order to restart the code with
 the next subject.
 
 Any ideas?
 Thanks in Advance.
 
 Altay Lino de Souza.
 




[R] Accessing MySQL Database in R

2011-01-17 Thread schlafly

I have a local installation of MySQL on my computer.

I enter the following to access MySQL from the command line:
/Applications/MAMP/Library/bin/mysql -h localhost -u root -p
I am then prompted for a password, and I use: root
This connects me to MySQL in the command line.

I now want to access MySQL databases in R. I enter the following: 
mysql <- dbDriver("MySQL")
conn <- dbConnect(mysql,user='root',host='localhost', password='root')

I get the following error message: Error in mysqlNewConnection(drv, ...) : 
RS-DBI driver: (Failed to connect to database: Error: Access denied for user
'root'@'localhost' (using password: YES)

Does anyone know why these aren't equivalent?
-- 
View this message in context: 
http://r.789695.n4.nabble.com/Accessing-MySQL-Database-in-R-tp3221264p3221264.html
Sent from the R help mailing list archive at Nabble.com.



[R] selection statistics from function

2011-01-17 Thread ufuk beyaztas

Hi,
My code:

e <- rnorm(n=50, mean=0, sd=sqrt(0.5625))
x0 <- c(rep(1,50))
x1 <- rnorm(n=50,mean=2,sd=1)
x2 <- rnorm(n=50,mean=2,sd=1)
x3 <- rnorm(n=50,mean=2,sd=1)
x4 <- rnorm(n=50,mean=2,sd=1)
y <- 1+ 2*x1+4*x2+3*x3+2*x4+e
x2[1] = 10 #influential observation
y[1] = 10  #influential observation
data.x <- matrix(c(x0,x1,x2,x3,x4),ncol=5)
data.y <- matrix(y,ncol=1)
data.k <- cbind(data.x,data.y)

result <- list()

for( i in 1: 3100) {
data <- data.k[sample(50,50,replace=TRUE),]
dataX <- data[,1:5]
dataY <- data[,6]
B.cap <- solve(crossprod(dataX)) %*% crossprod(dataX,dataY)
P <- dataX %*% solve(crossprod(dataX)) %*% t(dataX)
Y.cap <- P %*% dataY
e <- dataY - Y.cap
dX <- nrow(dataX) - ncol(dataX)
var.cap <- crossprod(e) / (dX)
ei <- as.vector(dataY - dataX %*% B.cap)
pi <- diag(P)
var.cap.i <- (((dX) * var.cap) / (dX - 1)) - (ei^2 / ((dX-1) * (1 - pi)))
ti <- ei / sqrt(var.cap * (1 - pi))
Ci <- (ti^2 / (ncol(dataX))) * (pi / (1 - pi))
result <- c(result,list(mean(Ci)))}

table<-do.call(rbind.data.frame,result)
names(table)=c("Cook's Distance")
table

I want to find the statistics (mean(Ci)) for the resampled data sets that do
not contain the influential observation, that is, that do not contain the
value 10. Can someone help me?
Thanks for any advice!
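
One possible approach, sketched here under assumptions: keep the row indices drawn in each bootstrap replicate and record whether row 1 (the planted influential observation) was among them. For brevity this sketch uses lm() and cooks.distance() from base R in place of the hand-written matrix formulas above, and only 200 replicates; the seed and the variable names are illustrative.

```r
set.seed(1)                                     # assumption: fixed seed for reproducibility
n <- 50
x <- matrix(rnorm(n * 4, mean = 2), ncol = 4)
y <- drop(1 + x %*% c(2, 4, 3, 2) + rnorm(n, sd = 0.75))
x[1, 2] <- 10; y[1] <- 10                       # row 1 is the influential observation
dat <- data.frame(y = y, x)                     # predictors become columns X1..X4

n_boot   <- 200                                 # fewer replicates than the 3100 in the post
mean_cd  <- numeric(n_boot)
has_infl <- logical(n_boot)

for (b in seq_len(n_boot)) {
  idx         <- sample(n, n, replace = TRUE)   # keep the row indices, not just the rows
  has_infl[b] <- 1 %in% idx                     # was row 1 drawn in this replicate?
  fit         <- lm(y ~ ., data = dat[idx, ])
  mean_cd[b]  <- mean(cooks.distance(fit))
}

clean <- mean_cd[!has_infl]                     # replicates without the influential row
summary(clean)
```

The logical vector has_infl can be used in the same way to pull out the replicates that do contain the observation, for comparison.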
-- 
View this message in context: 
http://r.789695.n4.nabble.com/selection-statistics-from-function-tp3221267p3221267.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Summing data frame columns on identical data

2011-01-17 Thread Dennis Murphy
Hi:

Try this based on the following toy example:

### Generate a list of named data frames
# There are more efficient ways to do this with replicate, but I forgot :)
# A function to generate a data frame
dmake <- function() data.frame(A = factor(rep(1:5, each = 10)),
B = rep(rep(c(0.25, 0.5), each = 5), 5),
y = rnorm(50))
# Create an empty list
dflist <- vector('list', 5)
# populate it
for(i in 1:5) dflist[[i]] <- dmake()
# Give names to the list components:
names(dflist) <- paste('df', 1:5, sep = '')


library(plyr)
# Function to sum y by A-B combinations for a generic data frame
dsum <- function(d) ddply(d, .(A, B), summarise, sumY = sum(y))

# Apply it to each component of the list:

# Returns a list
summlist <- llply(dflist, dsum)
# Returns a data frame
summdf <- ldply(dflist, dsum)

Since you state that the individual data frames have different lengths, you
may want to add another variable to dsum to return length, perhaps something
like

dsum2 <- function(d) ddply(d, .(A, B), summarise, sumY = sum(y),
                           n = length(y))

and apply either or both of the llply/ldply calls with dsum2 substituted for
dsum.

If you want to combine certain groups together, you can create a new factor
that merges levels. The following post in the archives provides a clue:
http://r.789695.n4.nabble.com/Documentation-detail-was-Merging-factor-levels-td911547.html

Since you already have the data frames, you can do something like

# Names of existing data frames in the workspace
filelist <- c('mydf1', 'mydf2', 'anotherdf', 'what_more', 'oyvey')
# simplify = FALSE keeps one list element per data frame, named after it
dflist <- sapply(filelist, get, simplify = FALSE)

and then move on to the summarization stage.
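
If plyr is not available, a base R sketch of the same idea is to stack the data frames and sum FPC within each Latitude/Longitude combination. The two toy data frames below are made up for illustration; with the real data, the list would hold the nine mrunoff data frames instead:

```r
# Two toy data frames with a different number of rows (made-up values)
df1 <- data.frame(Latitude = c(5.75, 6.25, 6.75), Longitude = 0.25,
                  FPC = c(0.01, 0.02, 0.03))
df2 <- data.frame(Latitude = c(5.75, 6.25), Longitude = 0.25,
                  FPC = c(0.10, 0.20))

stacked <- do.call(rbind, list(df1, df2))   # stack all frames row-wise
total   <- aggregate(FPC ~ Latitude + Longitude, data = stacked, FUN = sum)
total
```

A grid cell that is missing from some of the data frames is summed over the frames in which it does appear.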

HTH,
Dennis

On Mon, Jan 17, 2011 at 10:42 AM, Steve Murray <smurray...@hotmail.com> wrote:


 Dear all,

 I have 9 data frames, and I'm simply trying to sum the values of column 3
 (on a row-by-row basis). However, there are a slightly different number of
 rows in each data frame, so I'm receiving the following error: Error in
 Ops.data.frame(mrunoff_207101[3], mrunoff_207102[3]) :
   + only defined for equally-sized data frames.

 Here is what I'm attempting to do:

  arunoff_2071 <- cbind(mrunoff_207101[1:2], (mrunoff_207101[3] +
 mrunoff_207102[3] + mrunoff_207103[3] + mrunoff_207104[3] +
 mrunoff_207105[3] + mrunoff_207106[3] + mrunoff_207107[3] +
 mrunoff_207108[3] + mrunoff_207109[3]))


 Is there an easy way of summing based on congruent values in columns 1 and
 2? The only way I can think of would be to use merge, but this would involve
 doing this for every pair of data frames.

 The data for each data frame look like this:

  head(mrunoff_207101)
   Latitude Longitude          FPC
 1     5.75      0.25 0.0112384744
 2     6.25      0.25 0.0019959067
 3     6.75      0.25 0.0003245941
 4     7.25      0.25 0.0011973676
 5     7.75      0.25 0.0001062602
 6     8.25      0.25 0.0451578423


 Any suggestions on how to achieve this easily will be very welcome.

 Many thanks,

 Steve



Re: [R] How to doulbe all the value on a matrix

2011-01-17 Thread David Winsemius
 You have the capability of using the Nabble interface to post plain  
text. I have checked. There is a little button above your composition  
frame that lets you change to plain text.



On Jan 17, 2011, at 4:38 PM, André Dias wrote:


OK!!

So, the idea is to get the 2nd matrix from the 1st matrix with the use
of a vector.


Is it possible?



Peter B already gave you an answer.

 m[c(2,7, 8), c(1,4,6, 7)]
     V1 V4 V6 V7
[1,]  7  3 16  9
[2,]  3  2  9 18
[3,] 13 16  8  9




In the example I have a 10x10 matrix and I get from that one a  
second 4x3 matrix selected from a vector.


Then you should have presented the logical conditions for selecting  
that matrix rather than expecting us to guess what they might be.  
Please (re?)read the Posting Guide about what is expected of  
questioners regarding presenting complete examples of data and code.


--
David.



thanks
ADias


2011/1/17 David Winsemius <dwinsem...@comcast.net>

On Jan 17, 2011, at 11:16 AM, ADias wrote:


Hi,

yes it works perfectly.

I have another question:

Is there a way of selecting, with a vector, the values I wish to take out
of a matrix?

Example:

I have this matrix and I want to take out the numbers in bold and get the
second matrix below

This is a plain text mailing list (despite what the Nabble mirror  
may (mis-)lead you into believing) ... no bold.


--
david.

m
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]   17    1    6    5   19    2   19    1   15     8
 [2,]    7    7   20    3    2   16    9    9   19    13
 [3,]    2    4    3   11   18   11   14   13    3     1
 [4,]    3    7    5    7   17   18   10    6    5    15
 [5,]    8   20   13   10    8   12   20   19    1    16
 [6,]    9   14    1   12   12   12   17   18   10    17
 [7,]    3   10   11    2   12    9   18    6   19     9
 [8,]   13    2   17   16   18    8    9   14    9    16
 [9,]    9    4   11    4    1   17    9    7   20    12
[10,]    9    1    4    8    8   19   19    8   17    18

     [,1] [,2] [,3] [,4]
[1,]    7    3   16    9
[2,]    3    2    9   18
[3,]   13   16    8    9
thanks
AD

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-doulbe-all-the-value-on-a-matrix-tp3221213p3221230.html

Sent from the R help mailing list archive at Nabble.com.


David Winsemius, MD
West Hartford, CT




David Winsemius, MD
West Hartford, CT



Re: [R] matrix manipulations

2011-01-17 Thread Monica Pisica

Hi,

I've got 2 very good solutions, thank you very much. One is from Henrique 
Dallazuanna, using the reshape library and one line of code, although it will 
take me quite some time to understand it. Here is what he sent:

library(reshape)
xtabs(rowSums(cbind(value.x, value.y), na.rm = TRUE) ~ X1 + X2, merge(melt(m1), 
melt(m2), by = c('X1', 'X2'), all = TRUE), exclude = FALSE)


The other is from Phil Spector (code below), which I can understand quite 
easily, although to my shame I have never really used factor levels and 
their properties, and I don't know their uses and possibilities. Until now I 
have tried to avoid them and transform them into something else (like 
character strings).

Again, thanks for all your help,
Monica



 Date: Mon, 17 Jan 2011 12:13:09 -0800
 From: spec...@stat.berkeley.edu
 To: pisican...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] matrix manipulations

 Monica -
 Perhaps this small example can demonstrate how factors can
 solve your problem:

  d1 = 
  data.frame(cat=sample(c('cat2','cat5','cat6'),100,replace=TRUE),group=sample(c('land','water'),100,replace=TRUE))
  d2 = 
  data.frame(cat=sample(c('cat1','cat3','cat4'),100,replace=TRUE),group=sample(c('land','water'),100,replace=TRUE))
  d1$cat = factor(d1$cat,levels=c('cat1','cat2','cat3','cat4','cat5','cat6'))
  d2$cat = factor(d2$cat,levels=c('cat1','cat2','cat3','cat4','cat5','cat6'))
  table(d1$group,d1$cat) + table(d2$group,d2$cat)

 cat1 cat2 cat3 cat4 cat5 cat6
 land 14 17 18 22 19 23
 water 19 15 16 11 10 16

 This works because when you include all possible levels in a factor, R will
 automatically put zeroes in the right places when you use table():

  table(d1$group,d1$cat)
 cat1 cat2 cat3 cat4 cat5 cat6
 land 0 17 0 0 19 23
 water 0 15 0 0 10 16
  table(d2$group,d2$cat)
 cat1 cat2 cat3 cat4 cat5 cat6
 land 14 0 18 22 0 0
 water 19 0 16 11 0 0
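
The same zero-filling idea can also be applied directly to count matrices such as the m1 and m2 of the original question, without going back to the raw data. A minimal base R sketch (the names parts, all_rows, all_cols and m3 are illustrative):

```r
m1 <- matrix(c(32, 35, 36, 12, 15, 16), nrow = 2, byrow = TRUE,
             dimnames = list(c("land", "water"), c("cat2", "cat5", "cat6")))
m2 <- matrix(c(45, 46, 47, 48, 21, 22, 23, 24), nrow = 2, byrow = TRUE,
             dimnames = list(c("land", "water"),
                             c("cat1", "cat2", "cat3", "cat4")))

parts    <- list(m1, m2)                     # any number of parts
all_rows <- sort(unique(unlist(lapply(parts, rownames))))
all_cols <- sort(unique(unlist(lapply(parts, colnames))))

# Start from an all-zero matrix that has every row and column label
m3 <- matrix(0, nrow = length(all_rows), ncol = length(all_cols),
             dimnames = list(all_rows, all_cols))
for (m in parts) {
  # index by name, so each part only touches its own cells
  m3[rownames(m), colnames(m)] <- m3[rownames(m), colnames(m)] + m
}
m3
```

For these inputs the cat2 column comes out as 78 and 34, matching the m3 requested in the original question, and the loop works unchanged for four parts.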

 Hope this helps.
 - Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu



 On Mon, 17 Jan 2011, Monica Pisica wrote:

 
  Hi,
 
  I am having some difficulties with matrix operations. It is a little hard 
  to explain it so please bear with me. I have a very large data set, large 
  enough that it needs to be split in parts in order to deal with. I can work 
  things on these parts but the problem lies in adding together these parts 
  for the final answer.
 
  So that been said, let's say that i split the data in 2 parts, 1 and 2. 
  Each part has data belonging to 6 different categories, and each category 
  has 2 different classes, these classes being the same for each category. 
  The classes are called land and water and each category is labeled 
  cat1 to cat6. I am using the command (function) table to tabulate each 
  class for each category, but since i split the data in 2 parts, one part 
  has only some of the 6 categories, and the other some other of the 6 
  categories (and not necessarily exclusive).
 
  So let's built some results after i used the table function.
 
  m1 <- matrix(c(32, 35, 36, 12, 15, 16), nrow = 2, ncol = 3, byrow = TRUE, 
  dimnames = list(c("land", "water"), c("cat2", "cat5", "cat6")))
 
  m1
  cat2 cat5 cat6
  land 32 35 36
  water 12 15 16
 
  m2 <- matrix(c(45, 46, 47, 48, 21, 22, 23, 24), nrow = 2, ncol = 4, byrow = 
  TRUE, dimnames = list(c("land", "water"), c("cat1", "cat2", "cat3", 
  "cat4")))
 
  m2
  cat1 cat2 cat3 cat4
  land 45 46 47 48
  water 21 22 23 24
 
  So my end desired result should be a matrix (or a data frame) that has 6 
  columns called cat1 to cat6 and 2 rows labeled land and water, and for the 
  category that appears in both m1 and m2 the end result will be a sum.
 
  results will be m3:
 
  cat1 cat2 cat3 cat4 cat5 cat6
  land 45 78 47 48 35 36
  water 21 34 23 24 15 16
 
  To do this i thought in making an empty matrix for each m1 and m2 (called 
  m01 and m02 respectively) with 6 columns and 2 rows, and do a long if else 
  statement in which i match the name of the first column in m1 with the name 
  of the first column in m01 and if they match get the data from m1, if not 
  leave it 0 and so on. Same thing for m2 and m02. This is long and extremely 
  clunky but afterwards i can add m01 with m02 and get my desired result m3. 
  Is there any way i can do this more elegantly? My real data is split in 4 
  parts, but the problem is the same.
 
  Thanks for all your inputs, and sorry for this long email, but i didn't 
  know how else i could explain what i wanted to do.
 
  Monica

Re: [R] Using summaryBy with weighted data

2011-01-17 Thread Solomon Messing
Thanks Dennis, looks like there's even less boiler plate code with plyr.  By 
the way, what I labelled W.SE is meant to represent the weighted standard 
error of the mean.  Your WSE calculations appear to be providing the weighted 
standard deviation of the variable.  Is this a matter of needing to change my 
labels to W.SEM to avoid this kind of confusion, or is there literature 
suggesting that I should be using the standard deviation of the variable to 
estimate the weighted standard error of the mean? 
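
For what it is worth, the two quantities can be put side by side in a small base R sketch. It is hand-written rather than taken from Hmisc, and it assumes frequency-type weights, so that sum(w) plays the role of the sample size; whether that is the right effective N for other kinds of weights (e.g. sampling weights) is exactly the open question. The function name wtd_summary is hypothetical:

```r
# Weighted mean, weighted SD of the variable, and one candidate weighted
# standard error of the mean.
# Assumption: frequency-type weights, so sum(w) acts as the sample size.
wtd_summary <- function(x, w) {
  ok <- !is.na(x) & !is.na(w)
  x <- x[ok]; w <- w[ok]
  wm  <- sum(w * x) / sum(w)                       # weighted mean
  wsd <- sqrt(sum(w * (x - wm)^2) / (sum(w) - 1))  # weighted SD of the variable
  c(W.M = wm, W.SD = wsd, W.SEM = wsd / sqrt(sum(w)), N = length(x))
}

x <- c(2, 4, 4, 6, NA)
wtd_summary(x, w = rep(1, 5))   # unit weights reduce to mean(), sd(), sd()/sqrt(n)
```

With unit weights the W.SD entry equals sd() of the non-missing values and W.SEM equals sd()/sqrt(n), which makes the difference between the two labels concrete.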

Thanks,

-Solomon

On Jan 17, 2011, at 1:11 PM, Dennis Murphy wrote:

 Hi:
 
 Does this do what you need?
 
 wstats <- function(d) {
   require(Hmisc)
   N <- length(d$response[!is.na(d$response)])
   c(WM = wtd.mean(d$response, d$weights),
     WSE = sqrt(wtd.var(d$response, d$weights)),
     N = N)
 }
 library(plyr)
 ddply(mydata, .(group), wstats)
   group            WM       WSE  N
 1     1  0.1302255752 1.1911298 20
 2     2 -0.2814664362 0.8582928 20
 3     3 -0.3640550516 1.2618343 20
 4     4  0.0002852392 1.1463205 20
 5     5 -0.0070283053 1.2315683 20
 
 The trick to writing this function for input into plyr is that the argument 
 is a data frame. When called in ddply(), the function wstats() will be 
 applied to each sub-frame corresponding to the grouping factor(s). Inside it, 
 the variables of interest are extracted relative to the input data frame and 
 the three quantities are computed. I used wtd.mean() and wtd.var() from 
 Hmisc, as both will remove NAs by default. In the ddply call, the function 
 name is simply cited since a sub-data frame is the sole argument of the 
 function.
 
 I couldn't figure out how to get doBy to get this to work, as it seems best 
 suited to functions of one argument (a single response), but here's an 
 alternative using the data.table package:
 
 library(data.table)
 # Assumes Hmisc is already loaded...
 myDT <- data.table(mydata, key = 'group')
 myDT[, list(N = length(response[!is.na(response)]),
             wtdMean = wtd.mean(response, weights),
             wtdSE = sqrt(wtd.var(response, weights))), by = 'group']
      group  N       wtdMean     wtdSE
 [1,]     1 20  0.1302255752 1.1911298
 [2,]     2 20 -0.2814664362 0.8582928
 [3,]     3 20 -0.3640550516 1.2618343
 [4,]     4 20  0.0002852392 1.1463205
 [5,]     5 20 -0.0070283053 1.2315683
 
 data.table uses a different model of data organization from data frames. A 
 simplistic description is that you can think of a data.table as analogous 
 to a table in a DBMS. Notice that the 'function call' is indexed inside the 
 data table: the first 'subscript' corresponds to what are called i 
 operations (row selection, analogous to 'where' clauses in SQL); the second 
 'subscript' corresponds to j operations (analogous to 'select' 
 statements), while the third argument is the by group(s), or sub-data tables, 
 to which (in this case) the j operations apply. 
 
 For functions that take multiple arguments and that are meant to be applied 
 in a groupwise fashion, I find plyr and data.table to be very good options. 
 There are also base package alternatives (e.g., some combination of lapply(), 
 mapply() and do.call()) and several other packages, but plyr and data.table 
 are generally pretty good at handling most of the niggling details. Having 
 said that, both have learning curves - data.table, in particular, will be 
 much easier to pick up if you have some background in SQL, since its syntax 
 borrows core principles of SQL. 
 
 data.table has a vignette and FAQ, along with an independent help list - for 
 details, see its page on R-forge:
 http://r-forge.r-project.org/projects/datatable/
 For plyr's documentation, see
 http://had.co.nz/plyr/
 A link to its mailing list is found on that page as well.
 
 HTH,
 Dennis
 
 On Mon, Jan 17, 2011 at 10:24 AM, Solomon Messing <solomon.mess...@gmail.com> 
 wrote:
 Thanks Josh.  I built on your example and ended up with the code below--if 
 you or anyone sees any issues please let me know.  It would be great if there 
 were a slicker way to get these kinds of summary stats in R, but this gets 
 the job done.
 
 # takes data frame z with weights w and data x, returns weighted mean,
 # weighted SE, and N
 msenw = function(z){
   N = length(na.omit(z)$response)
   i = which(!is.na(z$response))
   return(
     c( W.M = weighted.mean(z$response, z$weights, na.rm=TRUE),
        W.SE = sqrt(wtd.var(z$response, weights = z$weights))/sqrt(sum(z$weights[i])),
        N=N ) )
 }
 
 library(doBy)
 library(Hmisc)
 ## make up some data (easier)
 mydata <- data.frame(response = rnorm(100),
    group = rep(1:5, each = 20), weights = runif(100, 0, 1))
 
 xy <- by(mydata, mydata$group, msenw)
 data.frame( group = names(c(xy)), do.call(rbind, xy) )
 
 ## can be extended to other data using:
 xy <- by(data.frame(response = mydata$response, weights = 

[R] sweave.bat

2011-01-17 Thread Sebastián Daza

Hi everyone,
I am trying to run Sweave.bat (batchfiles_0.6-1) from the command line 
on Windows, but I get this error:


C:\batchfiles_0.6-1>Sweave.bat Sweave-test-1
Error: rterm.exe not found

I don't know how to set up the path if this one were the problem... I 
ran rcmd.bat and I got this... so I don't know if it is a path problem.


C:\batchfiles_0.6-1>Rcmd,bat
R_ARCH=/x64
R_ARCH0=x64
R_ARCH0=x64
cmdpath=C:\R\R-2.12.1\bin\x64\Rcmd.exe
args=,bat
'bat' is not recognized as an internal or external command,
operable program or batch file.

the path of rterm.exe in my computer is: C:\R\R-2.12.1\bin\x64
thank you in advance!

--
Sebastián Daza
sebastian.d...@gmail.com



Re: [R] Extraction and replacement of data in a data frame

2011-01-17 Thread Mike Marchywka









 Date: Mon, 17 Jan 2011 12:51:43 -0800
 From: michael.hopg...@mrm.se
 To: r-help@r-project.org
 Subject: [R] Extraction and replacement of data in a data frame


 Dear R family,
 I am a relative newbie and have been dabbling with R for a little while.
 Simple things really, but my employers are beginning to see the benefits of
 using R instead of excel. We have a remote monitoring station measuring
 groundwater levels. We download the data as a .csv file and up until now,
 we have been using excel to analyse the data. It’s been a hassle trying to
 wrestle with that damn program as my boss wants to do things that excel was
 never meant to do, so I’ve convinced my boss to give R a chance. It’s been
 a steep learning curve, but I’m fairly confident I can reduce the amount of
 labour involved in producing and improving the graphs we show our clients.

 The groundwater levels are measured by pressure sensors lowered into the
 monitoring wells. After a certain time, the sensors were lowered further
 into the well, thus creating a disparity in the measurements.

 The data frame I import into R looks something like this:
 Date Waterhead (mm)
 10-01-01 100
 10-01-02 105
 10-01-03 101
 10-01-04 99
 10-01-05 85
 10-01-06 200
 10-01-07 199
 10-01-08 195
 10-01-09 185
 10-01-10 170

 For example, on 10-01-06, the sensor was lowered by 115 mm.
 When I download the csv file, I download the data from the beginning of the
 measurement period. I then need to adjust the height by 115 mm to account
 for the lowering of the sensor. My question to you is how do I do that
 in R?
 I am after a formula or a manipulation that selects the first five
 measurements and adds a fixed amount. This is something that is added
 everytime I download the csv file and import it into R so that when I
 display my data, it is based on the following data frame:


See if this helps. I'm still learning how to write good R, but this seems
to work. Just as a personal preference, I converted your data to csv:

cat xxx.txt | awk '{print "20"$1","$2}' > xxx.csv

I've never used POSIX dates before and am just copying what I've seen here,
but it seemed to work as shown below:

x<-read.table("xxx.csv",sep=",")
str(x)
x$V1=as.POSIXct(x$V1)
str(x)
y=(x$V1>as.POSIXct("2010-01-05"))
y
x$V2[y]=x$V2[y]+1
x


output ends like 

5  2010-01-05   200
6  2010-01-06 10200
7  2010-01-07 10199
8  2010-01-08 10195
9  2010-01-09 10185
10 2010-01-10 10170




 Date Waterhead (mm)
 10-01-01 215
 10-01-02 220
 10-01-03 216
 10-01-04 214
 10-01-05 200
 10-01-06 200
 10-01-07 199
 10-01-08 195
 10-01-09 185
 10-01-10 170

 In short, I want to select a fixed number of rows of a column from my data
 frame, add a constant to these, and insert the new values into their
 respective rows without affecting the subsequent rows. I hope I have
 produced a reproducible example. I have been searching high and low for a
 solution, but have come up against a brick wall. I feel I have read
 something that tackles this some time in the past, but can’t find it again.
 Thanks in advance!

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Extraction-and-replacement-of-data-in-a-data-frame-tp3221261p3221261.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] Manipulation

2011-01-17 Thread Ista Zahn
Hi Michael,

This can be accomplished using the basic extract and assign functions:

dat <- structure(list(Date = structure(1:10, .Label = c("10-01-01",
"10-01-02", "10-01-03", "10-01-04", "10-01-05", "10-01-06", "10-01-07",
"10-01-08", "10-01-09", "10-01-10"), class = "factor"), Waterhead = c(100,
105, 101, 99, 85, 200, 199, 195, 185, 170)), .Names = c("Date",
"Waterhead"), row.names = c(NA, -10L), class = "data.frame")

# add 115 mm to the first five readings (taken before the sensor was lowered)
dat[1:5, "Waterhead"] <- dat[1:5, "Waterhead"] + 115

You may find it helpful to work through An Introduction to R
(http://cran.r-project.org/manuals.html) and/or one or more of the
fine contributed introductory tutorials
(http://cran.r-project.org/other-docs.html).
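
If the date on which the sensor was lowered is known, the rows can also be selected by date rather than by position, which keeps working as more data are appended to the csv file. A sketch, assuming the dates are converted to Date class and using the values from the example (lowered_on and offset_mm are illustrative names):

```r
dat <- data.frame(
  Date      = as.Date(sprintf("2010-01-%02d", 1:10)),
  Waterhead = c(100, 105, 101, 99, 85, 200, 199, 195, 185, 170)
)

lowered_on <- as.Date("2010-01-06")   # first day measured at the new depth
offset_mm  <- 115                     # how far the sensor was lowered

before <- dat$Date < lowered_on       # TRUE for readings taken before the change
dat$Waterhead[before] <- dat$Waterhead[before] + offset_mm
dat
```

Only the readings before the change are shifted; rows from 2010-01-06 onwards are left as they are.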

Best,
Ista

On Mon, Jan 17, 2011 at 3:49 PM, michael.hopgood <michael.hopg...@mrm.se> wrote:

 Dear R family,
 I am a relative newbie and have been dabbling with R for a little while.
 Simple things really, but my employers are beginning to see the benefits of
 using R instead of excel. We have a remote monitoring station measuring
 groundwater levels.  We download the data as a .csv file and up until now,
 we have been using excel to analyse the data.  It’s been a hassle trying to
 wrestle with that damn program as my boss wants to do things that excel was
 never meant to do,  so I’ve convinced my boss to give R a chance.  It’s been
 a steep learning curve, but I’m fairly confident I can reduce the amount of
 labour involved in producing and improving the graphs we show our clients.

 The groundwater levels are measured by pressure sensors lowered into the
 monitoring wells.   After a certain time, the sensors were lowered further
 into the well, thus creating a disparity in the measurements.

 The data frame I import into R looks something like this:
 Date            Waterhead (mm)  Parameter 1 Paramater 2, etc.
 10-01-01     100
 10-01-02     105
 10-01-03     101
 10-01-04      99
 10-01-05      85
 10-01-06    200  # <- Sensor lowered #
 10-01-07    199
 10-01-08    195
 10-01-09    185
 10-01-10    170

 For example, on 10-01-06, the sensor was lowered by 115 mm.
 When I download the csv file, I download the data from the beginning of the
 measurement period. I then need to adjust the height by 115 mm to account
 for the lowering of the sensor.  My question to you is how do I do that
 in R?
 I am after a formula or a manipulation that selects the first five
 measurements of a column in the data frame and adds a fixed amount.  This is
 something that is added everytime I download the csv file and import it into
 R so that when I display my data, it is based on the following data frame:

 Date            Waterhead (mm)
 10-01-01     215
 10-01-02     220
 10-01-03     216
 10-01-04      214
 10-01-05      200
 10-01-06    200
 10-01-07    199
 10-01-08    195
 10-01-09    185
 10-01-10    170

 In short, I want to select a fixed number of rows from my data frame, add a
 constant to the rows of one of the columns, and insert the new values into
 their respective rows without affecting the subsequent rows.  I hope I have
 produced a reproducible example, I have been searching high and low for a
 solution, but have come up against a brick wall. I feel I have read
 something that tackles this some time in the past, but can’t find it again.
 Thanks in advance!

 Sincerely,
 Michael Hopgood
 MRM Konsult AB


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Manipulation-tp3221260p3221260.html
 Sent from the R help mailing list archive at Nabble.com.





-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org



Re: [R] Accessing MySQL Database in R

2011-01-17 Thread Dennis Murphy
Hi:

Because R does not have a direct interface to MySQL?

You need to load a communication package - the two most common ones are
RODBC and RMySQL. The former requires that you register your MySQL database
table(s) with ODBC before using the RODBC package on them, whereas the
latter works with specific version combinations of MySQL and R. The RODBC
package has a very informative vignette; for information re the RMySQL
package, see
http://biostat.mc.vanderbilt.edu/wiki/Main/RMySQL

HTH,
Dennis

On Mon, Jan 17, 2011 at 1:30 PM, schlafly <andrewschla...@gmail.com> wrote:


 I have a local installation of MySQL on my computer.

 I enter the following to access MySQL from the command line:
 /Applications/MAMP/Library/bin/mysql -h localhost -u root -p
 I am then prompted for a password, and I use: root
 This connects me to MySQL in the command line.

 I now want to access MySQL databases in R. I enter the following:
 mysql <- dbDriver("MySQL")
 conn <- dbConnect(mysql,user='root',host='localhost', password='root')

 I get the following error message: Error in mysqlNewConnection(drv, ...) :
 RS-DBI driver: (Failed to connect to database: Error: Access denied for
 user
 'root'@'localhost' (using password: YES)

 Does anyone know why these aren't equivalent?
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Accessing-MySQL-Database-in-R-tp3221264p3221264.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to cut a multidimensional array along a chosen dimension and store each piece into a list

2011-01-17 Thread Hadley Wickham
On Mon, Jan 17, 2011 at 2:20 PM, Sean Zhang <seane...@gmail.com> wrote:
 Dear R-Helpers,

 I wonder whether there is a function which cuts a multidimensional array
 along a chosen dimension and then stores each piece (still an array, with
 one dimension less) in a list.
 For example,

 arr <- array(seq(1*2*3*4),dim=c(1,2,3,4))  # I made a point of setting the
 # length of the first dimension to 1 to test whether I need to worry about
 # the drop=F option.

 brkArrIntoListAlong <- function(arr,alongWhichDim){
 
 return(outlist)
 }

 I have tried splitter_a in plyr package but does not get what I want.

 library(plyr)
 plyr:::splitter_a(arr,3)

Well, you're really not supposed to call internal functions - you probably want:

alply(arr, 3)

but you don't say what is wrong with the output.
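
If a base R solution is preferred, something along these lines might do; it is only a sketch, split_along() is a made-up helper name, and it keeps dimensions of length 1 rather than dropping them:

```r
# Cut an array along dimension k and return the pieces as a list.
# Each piece keeps all remaining dimensions, including those of length 1.
split_along <- function(arr, k) {
  d <- dim(arr)
  lapply(seq_len(d[k]), function(i) {
    idx <- rep(list(TRUE), length(d))   # TRUE selects everything in a dimension
    idx[[k]] <- i                       # ... except dimension k, fixed at i
    piece <- do.call(`[`, c(list(arr), idx, list(drop = FALSE)))
    dim(piece) <- d[-k]                 # remove only dimension k
    piece
  })
}

arr <- array(seq(1*2*3*4), dim = c(1, 2, 3, 4))
pieces <- split_along(arr, 3)
length(pieces)      # 3 pieces
dim(pieces[[1]])    # 1 2 4
```

Note that resetting dim() discards any dimnames, so those would have to be carried over separately if they matter.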

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/


