[R] Imputing missing values using LSmeans (i.e., population marginal means) - advice in R?

2012-04-03 Thread Jenn Barrett
Hi folks,

I have a dataset that consists of counts over a ~30 year period at multiple 
(200) sites. Only one count is conducted at each site in each year; however, 
not all sites are surveyed in all years. I need to impute the missing values 
because I need an estimate of the total population size (i.e., sum of counts 
across all sites) in each year as input to another model. 

 head(newdat,40)
   SITE YEAR COUNT
1 1 1975 12620
2 1 1976 13499
3 1 1977 45575
4 1 1978 21919
5 1 1979 33423
...
372 1975 4
382 1978 40322
392 1979 7
402 1980 16244


It was suggested to me by a statistician to use LSmeans to do this; however, I 
do not have SAS, nor do I know anything much about SAS. I have spent DAYS 
reading about these LSmeans and while (I think) I understand what they are, I 
have absolutely no idea how to a) calculate them in R and b) how to use them to 
impute my missing values in R. Again, I've searched the mail lists, internet 
and literature and have not found any documentation to advise on how to do this 
- I'm lost.

I've looked at popMeans, but have no clue how to use this with predict() - if 
this is even the route to go. Any advice would be much appreciated. Note that 
YEAR will be treated as a factor and not a linear variable (i.e., the 
relationship between COUNT and YEAR is not linear - rather there are highs and 
lows about every 10 or so years).

One thought I did have was to just set up a loop to calculate the least-squares 
estimates as:

Yij = (IYi + JYj - Y)/[(I-1)(J-1)]
where  I = number of treatments and J = number of blocks (so I = sites and J = 
years). I found this formula in some stats lecture handouts by UC Davis on 
unbalanced data and LSMeans...but does it yield the same thing as using the 
LSmeans estimates? Does it make any sense? Thoughts?

Many thanks in advance.

Jenn

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?

2012-04-03 Thread peter dalgaard

On Apr 3, 2012, at 07:19 , Philip Rhoades wrote:

 People,
 
 On my home computer with a real plot I get what I expect - open circles as 
 the plot character - on my work computer I get qs !  So I tried a trivial 
 test plot on the work computer ie:
 

Argh. I have forgotten this before... There was a configuration issue leading 
to plotting symbols showing up as q, but I can't recall the details. Possibly 
libpango-related (do you have it installed?)

-pd


 a - c(1,2,3,4,5)
 b - c(1,2,3,4,5)
 plot(a,b)
 
 and that also gives me open circles as it should! - the plot line in my real 
 script is:
 
 plot( means5[ ,4 ], means6[ ,4 ], xlab=640loci-50reps Gen4(-1) Rst, 
 ylab=ADf )
 
 I have tried adding the pch=x command with no effect . . what is going on?  
 (Both computers have the same fonts installed although the home computer is 
 Fedora 16 and the work one is Fedora 15).  It is very frustrating . .
 
 Thanks,
 
 Phil.
 -- 
 Philip Rhoades
 
 GPO Box 3411
 Sydney NSW2001
 Australia
 E-mail:  p...@pricom.com.au
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error in gamma(delta + (complex(0, 0, 1) * (x - mu))/alpha) : unimplemented complex function

2012-04-03 Thread Hans W Borchers
 I am trying to obtain the grafic of a pdf . but this error keeps showing .
 Here is the code

Or use the complex gamma function gammaz() in package 'pracma'.
The following code works, that is produces a plot:

library(pracma)
MXN.fd - function(x,alpha,beta,mu,delta) {
A - (2*cos(beta/2))^(2*delta)
B - 2*alpha*pi*gamma(2*delta)
C - (beta*(x-mu))/alpha
D - abs(gammaz(delta + (complex(0,0,1)*(x-mu))/alpha)^2)
M - A/B*exp(C)*D
plot(x,M,type=l,lwd=2,col=red)
}

alpha - 0.02612297; beta - -0.50801886
mu - 0.00575829917; delta - 0.67395
x - seq(-0.04,0.04,length=200)
MXN.fd(x,alpha,beta,mu,delta)
grid()


 i think the problem is the gamma function, does anyone know how to compute
 gamma with imaginary numbers?

 thanks in advance

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] e1071 tune.control() random parameter

2012-04-03 Thread Jessica Streicher
I'm not sure what the parameter specifies:

random  
if an integer value is specified, random parameter vectors are drawn from the 
parameter space.
What are the parameter vectors and what is the parameter space? What means 
drawn?

greetings
Jessi
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Package seems to be present but library don't find it

2012-04-03 Thread Marc Girondot

Hi,

I try to make my first package? The HelloWorld.R file is:
 HelloWorld.R 
#' showHello est une fonction R permettant d'afficher le message
#' Hello World! sur la console.
#' @title la fonction showHello()

showHello -function(){
cat(Hello World!\n)
}

I use the following procedure to get the tar:

# set the working directory where the file is located
 setwd(...)

 package.skeleton(HelloWorld,code_files=c(HelloWorld.R))

# to generate .rd files
 library(roxygen2)
 roxygenize(HelloWorld,copy.package=FALSE)

 system(R CMD build '/Users/marcgirondot/Documents/Espace de travail 
R/Phenology/Source fit/Essai_package/HelloWorld')
* checking for file ‘/Users/marcgirondot/Documents/Espace de travail 
R/Phenology/Source fit/Essai_package/HelloWorld/DESCRIPTION’ ... OK

* preparing ‘HelloWorld’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
Removed empty directory ‘HelloWorld/inst’
* building ‘HelloWorld_1.0.tar.gz’


 install.packages(/Users/marcgirondot/Documents/Espace\ de\ travail\ 
R/Phenology/Source\ fit/Essai_package/HelloWorld_1.0.tar.gz, repos = NULL)
Installing package(s) into 
‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library’

(as ‘lib’ is unspecified)

 library(HelloWorld)
Erreur dans library(HelloWorld) :
‘HelloWorld’ n'est pas un nom correct de package installé

Whereas the Helloworld folder is available in the library folder with 
other packages

/Library/Frameworks/R.framework/Versions/2.14/Resources/library/HelloWorld


--
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?

2012-04-03 Thread Prof Brian Ripley

On Tue, 3 Apr 2012, peter dalgaard wrote:



On Apr 3, 2012, at 07:19 , Philip Rhoades wrote:


People,

On my home computer with a real plot I get what I expect - open 
circles as the plot character - on my work computer I get qs ! 
So I tried a trivial test plot on the work computer ie:




Argh. I have forgotten this before... There was a configuration 
issue leading to plotting symbols showing up as q, but I can't 
recall the details. Possibly libpango-related (do you have it 
installed?)


We were not told the device   This is well-known for pdf() and 
poppler-based PDF viewers so you could try the workaround in ?pdf.


I don't really see how this could happen on the default X11 device, as 
it draws circles rather than uses fonts.





-pd



a - c(1,2,3,4,5)
b - c(1,2,3,4,5)
plot(a,b)

and that also gives me open circles as it should! - the plot line 
in my real script is:


plot( means5[ ,4 ], means6[ ,4 ], xlab=640loci-50reps Gen4(-1) Rst, 
ylab=ADf )

I have tried adding the pch=x command with no effect . . what is 
going on?  (Both computers have the same fonts installed although 
the home computer is Fedora 16 and the work one is Fedora 15).  It 
is very frustrating . .


Thanks,

Phil.
--
Philip Rhoades

GPO Box 3411
Sydney NSW  2001
Australia
E-mail:  p...@pricom.com.au

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] output of several results from a function

2012-04-03 Thread Johannes Radinger
Hi,

I try my first time to write a summary method for my function. The result of my 
function is a list of two objects (2 arrays). In my summary both objects should 
be displayed but with a some introductory text like:

ls - list(A=A123,B=B123)

summaryout=function(x,...){
cat(This is output A:/n)
x$A
cat(This is output B:/n)
x$B
}

How can I achieve that both list-elements are displayed and the lines 
inbetween, so that it looks nice? 

As the single list elements are arrays (partly 5-dimensional) is there a nice 
way to display them i a summary or print method?

cheers,


best regards,

Johannes
-- 

Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fisher's LSD multiple comparisons in a two-way ANOVA

2012-04-03 Thread Jinsong Zhao

Hi there,

Is there a function that can do a Fisher's LSD multiple comparisons in a 
two-way ANOVA? I hope to get a result similar with TukeyHSD(). 
Especially, I hope to know the significance of comparisons between the 
interactions of two factors.


In the following example:

x - c(76, 84, 78, 80, 82, 70, 62, 72, 71, 69, 72, 74, 66, 74, 68, 66, 
69, 72, 72, 78, 74, 71, 73, 67, 86, 67, 72, 85, 87, 74, 83, 86, 66, 68, 
70, 76, 78, 76, 69, 74, 72, 72, 76, 69, 69, 82, 79, 81)

a - rep(c(A1,A2), each = 24)
b - rep(c(B1,B2,B3), each =8, times = 2)
a - factor(a)
b - factor(b)
x.aov - aov(x~a*b)

I hope to obtained the results similar with

http://www.gigawiz.com/images12/twowayrmposthoc.jpg

Any suggestions or comments will be really appreciated.

Regards,
Jinsong

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] output of several results from a function

2012-04-03 Thread Ista Zahn
Hi Johannes,

You are on the right track. Just need to use the right line break
indicators, and explicitly call print. Like this:

tmp - list(A=A123,B=B123)

summaryout=function(x,...){
   cat(This is output A:\n)
   print(x$A)
   cat(This is output B:\n)
   print(x$B)
}

summaryout(tmp)

Best,
Ista

On Tue, Apr 3, 2012 at 5:12 AM, Johannes Radinger jradin...@gmx.at wrote:
 Hi,

 I try my first time to write a summary method for my function. The result of 
 my function is a list of two objects (2 arrays). In my summary both objects 
 should be displayed but with a some introductory text like:

 ls - list(A=A123,B=B123)

 summaryout=function(x,...){
        cat(This is output A:/n)
        x$A
        cat(This is output B:/n)
        x$B
 }

 How can I achieve that both list-elements are displayed and the lines 
 inbetween, so that it looks nice?

 As the single list elements are arrays (partly 5-dimensional) is there a nice 
 way to display them i a summary or print method?

 cheers,


 best regards,

 Johannes
 --

 Jetzt informieren: http://mobile.1und1.de/?ac=OM.PW.PW003K20328T7073a

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fisher's LSD multiple comparisons in a two-way ANOVA

2012-04-03 Thread Rmh
yes.  See ?glht in the multcomp package, and the examples using glht in ?MMC in 
the HH package.

Sent from my iPhone

On Apr 3, 2012, at 6:16, Jinsong Zhao jsz...@yeah.net wrote:

 Hi there,
 
 Is there a function that can do a Fisher's LSD multiple comparisons in a 
 two-way ANOVA? I hope to get a result similar with TukeyHSD(). Especially, I 
 hope to know the significance of comparisons between the interactions of two 
 factors.
 
 In the following example:
 
 x - c(76, 84, 78, 80, 82, 70, 62, 72, 71, 69, 72, 74, 66, 74, 68, 66, 69, 
 72, 72, 78, 74, 71, 73, 67, 86, 67, 72, 85, 87, 74, 83, 86, 66, 68, 70, 76, 
 78, 76, 69, 74, 72, 72, 76, 69, 69, 82, 79, 81)
 a - rep(c(A1,A2), each = 24)
 b - rep(c(B1,B2,B3), each =8, times = 2)
 a - factor(a)
 b - factor(b)
 x.aov - aov(x~a*b)
 
 I hope to obtained the results similar with
 
 http://www.gigawiz.com/images12/twowayrmposthoc.jpg
 
 Any suggestions or comments will be really appreciated.
 
 Regards,
 Jinsong
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] e1071 tune.control() random parameter

2012-04-03 Thread Alekseiy Beloshitskiy
Hello, Jessica,

Can you please elaborate what you're trying to find with that function.
If e.g. you want to find best parameters C and gamma for RBF model (SVM), you 
can use grid.py, check here:
http://www.csie.ntu.edu.tw/~cjlin/libsvm/

Function tune.control() in package e1071 is an R interface to this algorithm.

Best,
-Alex

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of 
Jessica Streicher [j.streic...@micromata.de]
Sent: 03 April 2012 11:05
To: r-help@r-project.org
Subject: [R] e1071 tune.control() random parameter

I'm not sure what the parameter specifies:

random
if an integer value is specified, random parameter vectors are drawn from the 
parameter space.
What are the parameter vectors and what is the parameter space? What means 
drawn?

greetings
Jessi
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] TR: [e1071] Load an SVM model exported with write.svm

2012-04-03 Thread Alekseiy Beloshitskiy
Hello, Alexandre,

In R you can specify whether to use or not scaling with parameter scale, e.g.:
model-svm(class_var ~ ., data=trainset, scale=FALSE);

Don't forget to disable it if you already sclaed your data with libsvm 
svm-scale algorithm.


Best,
-Alex

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of 
Alexandre Marié [alexandre.ma...@artelys.com]
Sent: 29 March 2012 13:22
To: r-help@r-project.org
Subject: [R] TR: [e1071] Load an SVM model exported with write.svm

Dear R help mailing list,



I work on a project using the SVM implementation from e1071 R package and I
really need your help in order to use correctly the write.svm function.



I trained my SVM model in R and I would like to use this model in Java. To
do that, I would like to use the libsvm Java version (I tried to used
jlibsvm, in order to benefit from the refactoring, but it does not seem to
work and the lack of documentation push me to switch to the libsvm java
version).



Thus, I exported my model with the function write.svm and import it with the
libsvm java library but I didn’t obtain the expected result from my test
data set. I suspect that the load of the model file is not enough and it is
necessary to use the .scale file, in Java, to scale the data (I suspect that
the model object in R contains the scaling) but I don’t know how to do that.



In order to understand how it is working, I would like, in the first place,
load the file generated by write.svm in R in order to be sure the model in
the file match with my R object. But I don’t find any function to load that
kind of file.



How could I load a model file generated by write.svm in R?



Thanks a lot in advance for your help!



Best regards,



  _

Alexandre Mariι
Artelys : www.artelys.com




[[alternative HTML version deleted]]




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?

2012-04-03 Thread David Winsemius


On Apr 3, 2012, at 1:19 AM, Philip Rhoades wrote:


People,

On my home computer with a real plot I get what I expect - open  
circles as the plot character - on my work computer I get qs !  So  
I tried a trivial test plot on the work computer ie:


a - c(1,2,3,4,5)
b - c(1,2,3,4,5)
plot(a,b)

and that also gives me open circles as it should! - the plot line in  
my real script is:


plot( means5[ ,4 ], means6[ ,4 ], xlab=640loci-50reps Gen4(-1)  
Rst, ylab=ADf )


I have tried adding the pch=x command with no effect . .


Did you actually type pch=x  ?

Not pch=x ?

--
David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to use condition indexes to test multi-collinearity

2012-04-03 Thread John Fox
Dear Frank,

The idea of collinearity with the intercept never made a whole lot of sense
to me, unless one entirely assimilates the notion of collinearity with
numerical instability, rather than thinking about it as the difficulty of
separating different partial relationships. In any event, variables with
values far from 0, where the ratio of the largest to smallest value is close
to 1, will be collinear with the intercept. Thus, a way of dealing with
the problem, if you think it's a problem, is to centre the variables at
their means.

I hope this helps,
 John


John Fox
Senator William McMaster
  Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
http://socserv.mcmaster.ca/jfox



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of frank.chiang
 Sent: April-02-12 11:47 PM
 To: r-help@r-project.org
 Subject: [R] how to use condition indexes to test multi-collinearity
 
 Dear Users,
 I try to calculate condition indexes and variance decomposition
 proportions  in order to test for collinearity using colldiag() in
 perturb package, I got  a large index and two variables with large
 variance decomposition  proportions,but one of them is constant item.I
 also checked the VIF for that  variable, the value is small.The result
 is as follows:
 
  Index interceptV1V2
  163.54  0.97 0.810.16
 
  VIF(V1)=1.4
 
  V1,V2 are two variables with largest variance decomposition
 proportions.
 
  I haven't got the Belsley's book, and don't quite understand what's
 the  meaning of collinearity between intercept and one variable.If
 that's the case, how to deal with it? Discard  the V1 variable? Anyone
 can give me some suggestions?Thanks
 
 --
 View this message in context: http://r.789695.n4.nabble.com/how-to-use-
 condition-indexes-to-test-multi-collinearity-tp4527740p4527740.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] grouping

2012-04-03 Thread Val
Hi all,

Assume that I have the following 10 data points.
 x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)

sort x  and get the following
  y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)

I want to  group the sorted  data point (y)  into  equal number of
observation per group. In this case there will be three groups.  The first
two groups  will have three observation  and the third will have four
observations

group 1  = 34, 45, 46
group 2  = 66, 78, 125
group 3  = 193, 209, 242,297

Finally I want to calculate the group mean

group 1  =  42
group 2  =  87
group 3  =  234

Can anyone help me out?

In SAS I used to do it using proc rank.

thanks in advance

Val

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] temporal disaggregation

2012-04-03 Thread ronald
There is a package called 'tempdisagg' on CRAN that offers a similar
functionality.

--
View this message in context: 
http://r.789695.n4.nabble.com/temporal-disaggregation-tp3745205p4528583.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rpart error message

2012-04-03 Thread Raji
Hi R-helpers,
 I am using rpart package for decision tree using R.We are invoking R
environment through JRI from our java application.Hence, the result of R
command is returned in REXP and we use geterrMessage() to retrieve the
error.

When we execute the following command,

cnr_model-rpart(as.factor(Species)~Sepal Length+Sepal Width+Petal Length,
method=class, parms=list(split=gini,prior=c()),
control=rpart.control(minsplit=2,
na.action=na.pass,cp=0.001,usesurrogate=1,maxsurrogate=2,surrogatestyle=0,maxdepth=20,xval=10))

we get an error message* Error: unexpected symbol in
a-cnr_model-rpart(as.factor(Species)~Sepal Length*

The REXP returned in not null and the geterrMessage() call doesnt return an
error message.

But for other errors in R , say summary(cnr_model) , REXP returned is null
and we get the following error message from geterrMessage() call.

*Error in summary(cnr_model) : object 'cnr_model' not found*

Is this behavior of showing error message specific to rpart package ? Why is
this difference in the way error message is displayed in rpart package?Can
you please let us know how to retrieve error messages of rpart command
through JRI? 

Thanks in advance.

--
View this message in context: 
http://r.789695.n4.nabble.com/rpart-error-message-tp4528104p4528104.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help in ddply

2012-04-03 Thread arunkumar1111
Hi

I've records like this
df=
x   panel
4   1
93  2
21  3
83  4
75  1
87  2
87  3
78  4
50  1
76  2
86  3
65  4
84  1
40  2
39  3
26  4


i want to create histogram out of it . i want all the mid and count values
for panel wise

my code is 
histoutput = ddply(df,.(df[2]),hist)

i'm not able to get the required result.

please help me
using for loop takes a lot of time if there are more records

-
Thanks in Advance
Arun
--
View this message in context: 
http://r.789695.n4.nabble.com/help-in-ddply-tp4528064p4528064.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Histogram classwise

2012-04-03 Thread arunkumar1111
Hi 
I have a data class wise. I want to create a histogram class wise without
using for loop as it takes a long time 
my data looks like this

x   class
27  1
93  3
65  5
1   2
69  5
2   1
92  4
49  5
55  4
46  1
51  3
100 4




-
Thanks in Advance
Arun
--
View this message in context: 
http://r.789695.n4.nabble.com/Histogram-classwise-tp4528624p4528624.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] filling small gaps of N/A

2012-04-03 Thread jeff6868
Hi everybody,

I'm a new R french user. Sorry if my english is not perfect. Hope you'll
understand my problem ;)

I have to work on temperature data (35000 lines in one file) containing some
missing data (N/A). Sometimes I have only 2 or 3 N/A following each other,
but I have also sometimes 100 or 200 N/A following each other. Here's an
example of my data, when I have only small gaps of missing data (2 or 3
N/A):

09/01/2008 12:00   2   1.93   2.93   4.56   5.43
09/01/2008 12:15   2   *3.93*   3.25   4.93   5.56
09/01/2008 12:30   2NA   3.5   5.06   5.56
09/01/2008 12:45   2NA   3.68 5.25   5.68
09/01/2008 13:00   2   *4.93 *  3.87   5.56   5.93
09/01/2008 13:15   2   5.93   4.25   5.75   6.06
09/01/2008 13:30   2   3.93   4.56   5.93   6.18

My question is: how can I replace these small gaps of N/A by numeric values?
I would like a fonction which only replace the small gaps (2 or 3 N/A) in my
data, but not the big gaps (more than 5 N/A following each other).

For the moment, i'm trying to do it by working with the time gap between the
2 numeric values surrounding the N/A as following:

imputation - function(x){
met = NULL

temp - met[1] - x[1]

ind_temp - 1

tps - time(x)
   
for (i in 2:(length(x)) ){
if((tps[i]-tps[ind_temp]  1)(tps[i]-tps[ind_temp] =
4)(is.na(x[i]))){
met[i] - na.approx(x)
}
else {
temp - met[i] - x[i]
ind_temp - i
}   
}

return(met)
}

In this example, I would like to apply the function: na.approx(x) on my N/A,
but only when I have maximum 4 N/A following each other.
There's no error, but it doesn't work (it was working in the other way, when
I had to detect aberrant data and replace it by N/A, but not now). It is
maybe not the good way to solve this problem. I don't have a lot of
experience in R. Maybe there is an easier way to do it...
Does somebody have an idea about it for helping me?
Thanks a lot!


--
View this message in context: 
http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528184.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hclust and plot functions work, cutree does not

2012-04-03 Thread vinod1
Sarah,

.   clust_tree=hclust(as.dist(x),method=complete)
.   plot(clust_tree)

this produces a dendrogram, whereas 
.   clust_tree=hclust(as.dist(x),method=complete)
.cut = cutree(clust_tree,k=1:5)
.plot(cut)

produces a plot with 2 dots. The dissimilarity matrix x is 100*100 matrix.

Thanks for any feedback. Also, I am looking at various methods to retrieve
the sub-trees. Any tips on such techniques would be helpful to me.

 

--
View this message in context: 
http://r.789695.n4.nabble.com/hclust-and-plot-functions-work-cutree-does-not-tp4515010p4528282.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Choice between gwidgetsRGtk2 and gwidgetstcltk

2012-04-03 Thread mrzung
Hi, all

What I want to ask is what is better gui between gwidgetsRGtk2 and
gwidgetstcltk?

In my opinion, gwidgetsRgtk2 has more functionality for gui.
For example , in importing image from file, Rgtk2 works with most of formats
but tcltk only with gif and pnm.

However, gwidgetstcltk is more lovely than Rgtk2.

Is there a way to mix two of them or alternative?


--
View this message in context: 
http://r.789695.n4.nabble.com/Choice-between-gwidgetsRGtk2-and-gwidgetstcltk-tp4528350p4528350.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fitting data with constraints to GEV distribution

2012-04-03 Thread chong20
Hi, 

Is there a way to fit a set of data with specified constraints (ie specified
the percentile values at say 99.9%) to a GEV distribution?

Thanks


--
View this message in context: 
http://r.789695.n4.nabble.com/Fitting-data-with-constraints-to-GEV-distribution-tp4528418p4528418.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] e1071 tune.control() random parameter

2012-04-03 Thread Jessica Streicher
The function is a parameter to the tune function of library e1071. The 
other parameters of the tune.control() function specify more or less the 
sampling method (cross-validation, fixed or bootstrap+attributes for those)


tune itself iterates over valuevectors of the parameters (e.g. C and 
gamma) and calculates errors in the classification for each combination 
(using the provided sampling method in tune.control() i would guess)


also: e1071 = libsvm
and i have enough problems with R, i definetly wont start with python now ;)

The question is rather of academic interest, i can do what i want to 
do already. But it COULD be something useful, after all it is the first 
argument of the function^^


On 03.04.12 14:13, Alekseiy Beloshitskiy wrote:

Hello, Jessica,

Can you please elaborate what you're trying to find with that function.
If e.g. you want to find best parameters C and gamma for RBF model (SVM), you 
can use grid.py, check here:
http://www.csie.ntu.edu.tw/~cjlin/libsvm/

Function tune.control() in package e1071 is an R interface to this algorithm.

Best,
-Alex

From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] on behalf of 
Jessica Streicher [j.streic...@micromata.de]
Sent: 03 April 2012 11:05
To: r-help@r-project.org
Subject: [R] e1071 tune.control() random parameter

I'm not sure what the parameter specifies:

random
if an integer value is specified, random parameter vectors are drawn from the 
parameter space.
What are the parameter vectors and what is the parameter space? What means 
drawn?

greetings
Jessi
 [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread David Winsemius


On Apr 3, 2012, at 8:47 AM, Val wrote:


Hi all,

Assume that I have the following 10 data points.
x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)

sort x  and get the following
 y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)


The methods below do not require a sorting step.



I want to  group the sorted  data point (y)  into  equal number of
observation per group. In this case there will be three groups.  The  
first

two groups  will have three observation  and the third will have four
observations

group 1  = 34, 45, 46
group 2  = 66, 78, 125
group 3  = 193, 209, 242,297

Finally I want to calculate the group mean

group 1  =  42
group 2  =  87
group 3  =  234


I hope those weren't answers from SAS.



Can anyone help me out?



I usually do this with Hmisc::cut2 since it has a `g = n` parameter  
that auto-magically calls the quantile splitting criterion but this is  
done in base R.


split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) ,  
include.lowest=TRUE) )

$`[36,65.9]`
[1] 36 45 46

$`(65.9,189]`
[1]  66  78 125

$`(189,297]`
[1] 193 209 242 297


 lapply( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) ,  
include.lowest=TRUE) ), mean)

$`[36,65.9]`
[1] 42.3

$`(65.9,189]`
[1] 89.7

$`(189,297]`
[1] 235.25

Or to get a table instead of a list:
 tapply( x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) ,  
include.lowest=TRUE) , mean)

 [36,65.9] (65.9,189]  (189,297]
  42.3   89.7  235.25000


In SAS I used to do it using proc rank.


?quantile isn't equivalent to  Proc Rank but it will provide a useful  
basis for splitting or tabling functions.




thanks in advance

Val

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread R. Michael Weylandt
Ignoring the fact your desired answers are wrong, I'd split the
separating part and the group means parts into three steps:

i) quantile() can help you get the split points,
ii)  findInterval() can assign each y to a group
iii) then ave() or tapply() will do group-wise means

Something like:

y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here.
ave(y, findInterval(y, quantile(y, c(0.33, 0.66
tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)

You could also use cut2 from the Hmisc package to combine findInterval
and quantile into a single step.

Depending on your desired output.

Hope that helps,
Michael

On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote:
 Hi all,

 Assume that I have the following 10 data points.
  x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)

 sort x  and get the following
  y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)

 I want to  group the sorted  data point (y)  into  equal number of
 observation per group. In this case there will be three groups.  The first
 two groups  will have three observation  and the third will have four
 observations

 group 1  = 34, 45, 46
 group 2  = 66, 78, 125
 group 3  = 193, 209, 242,297

 Finally I want to calculate the group mean

 group 1  =  42
 group 2  =  87
 group 3  =  234

 Can anyone help me out?

 In SAS I used to do it using proc rank.

 thanks in advance

 Val

        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread Giovanni Petris
Probably something along the following lines:

 x - c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
 sorted - c(36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)
 tapply(sorted, INDEX = (seq_along(sorted) - 1) %/% 3, FUN = mean)
0 1 2 3 
 42.3  89.7 214.7 297.0 

Hope this helps,
Giovanni

On Tue, 2012-04-03 at 08:47 -0400, Val wrote:
 Hi all,
 
 Assume that I have the following 10 data points.
  x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
 
 sort x  and get the following
   y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)
 
 I want to  group the sorted  data point (y)  into  equal number of
 observation per group. In this case there will be three groups.  The first
 two groups  will have three observation  and the third will have four
 observations
 
 group 1  = 34, 45, 46
 group 2  = 66, 78, 125
 group 3  = 193, 209, 242,297
 
 Finally I want to calculate the group mean
 
 group 1  =  42
 group 2  =  87
 group 3  =  234
 
 Can anyone help me out?
 
 In SAS I used to do it using proc rank.
 
 thanks in advance
 
 Val
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 

Giovanni Petris  gpet...@uark.edu
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rpart error message

2012-04-03 Thread David Winsemius


On Apr 3, 2012, at 4:16 AM, Raji wrote:


Hi R-helpers,
I am using rpart package for decision tree using R.We are invoking R
environment through JRI from our java application.Hence, the result  
of R

command is returned in REXP and we use geterrMessage() to retrieve the
error.

When we execute the following command,

cnr_model-rpart(as.factor(Species)~Sepal Length+Sepal Width+Petal  
Length,


You have spaces in your variable names. That has nothing to do with  
JRI and everything to do with base R syntax.


(You also have no data argument so are you attaching it. That would  
seem to be an unwise choice.)



method=class, parms=list(split=gini,prior=c()),
control=rpart.control(minsplit=2,
na 
.action 
= 
na 
.pass 
,cp 
= 
0.001 
,usesurrogate=1,maxsurrogate=2,surrogatestyle=0,maxdepth=20,xval=10))


we get an error message* Error: unexpected symbol in
a-cnr_model-rpart(as.factor(Species)~Sepal Length*


The unexpected symbol was Length. Spaces are separators to the R  
parser.


The REXP returned in not null and the geterrMessage() call doesnt  
return an

error message.

But for other errors in R , say summary(cnr_model) , REXP returned  
is null

and we get the following error message from geterrMessage() call.

*Error in summary(cnr_model) : object 'cnr_model' not found*


Right. It wasn't found because the command you tried to create it  
with ... failed.




Is this behavior of showing error message specific to rpart package ?


No. It would be very generic.


Why is
this difference in the way error message is displayed in rpart  
package?Can

you please let us know how to retrieve error messages of rpart command
through JRI?

Thanks in advance.

--
View this message in context: 
http://r.789695.n4.nabble.com/rpart-error-message-tp4528104p4528104.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] gamm: tensor product and interaction

2012-04-03 Thread Simon Wood


On 04/02/2012 01:26 PM, Mabille, Geraldine wrote:

 Hi list,
 I'm working with gamm models of this sort, using Simon Wood's mgcv library:
 gm- gamm(Z~te(x,y),data=DATA,random=list(Group=~1))
 gm1-gamm(Z~te(x,y,by=Factor)+Factor,data=DATA,random=list(Group=~1))
 with a dataset of about 7 rows and 110 levels for Group

I would be inclined to use something like

gm- gam(Z~te(x,y)+s(Group,bs=re),data=DATA,method=REML)
gm1-gam(Z~te(x,y,by=Factor)+Factor+s(Group,bs=re),data=DATA,method=REML)
AIC(gm,gm1)

See ?random.effects in the mgcv help for more details.

 in order to test whether tensor product smooths vary across factor levels. I 
was wondering if comparing those two models would be enough to conclude? I saw 
a preceding post on similar issues
 http://r.789695.n4.nabble.com/Comparing-and-Interpreting-GAMMs-td2234209.html
 but that example used simple s() smooths instead of tensor product smooths. In 
this case, the person was comparing between model A and model A1 like:

 A-gamm4(Z~factor+s(x)+s(x,by=factor) ,data=DATA, random=~(1|Group))
 A1- gamm4(Z~factor+s(x) ,data=DATA, random=~(1|Group))

 I thus also tried to compare my model gm1, with another gm2 model of that sort:
 gm2-gamm(Z~te(x,y)+te(x,y,by=Factor)+Factor,data=DATA,random=list(Group=~1))
 but this type of models never converge and I obtain error messages of that 
sort: Error in MEestimate(lmeSt, grps) :   Singularity in backsolve at level 0, 
block 1

- The problem is that te(x,y) and te(x,y,by=Factor) are confounded. You
can get around this by making `Factor' into an ordered factor. See te
`by' variable section in ?gam.models.

 2 questions from that:
 1) Keeping in mind that my main question is to check whether tensor products 
smooths vary across factor levels, would the comparison of models gm and gm1 be 
sufficient for me?

- I'd have thought so.

 2) Otherwise, is there something wrong in my gm2 model that make it impossible 
to converge??

 Also, side question that I already posted a few days ago but didn't get an 
answer for:
 3)I don't manage to use vis.gam() to print 3 different 2D-contour plots for 
the three levels of factors I have in model gm1. It works well with model gm 
(when only one plot is generated) but not gm1 (3 plots should be generated)
 Here is the error message I obtain:
 vis.gam(gm1$gam,plot.type=contour,n.grid=200,color=heat,zlim=c(0,4))
 Error in predict.gam(x, newdata = newd, se.fit = TRUE, type = type) :  number 
of items to replace is not a multiple of replacement length

- hmm, possibly a bug. I'll look into it.

best,
Simon

 Thanks a lot if someone can help with that,
 Geraldine





[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help in ddply

2012-04-03 Thread R. Michael Weylandt
Don't use ddply because you aren't returning a data.frame. Hist
objects are lists so use dlply.

dats - structure(list(x = c(4L, 93L, 21L, 83L, 75L, 87L, 87L, 78L, 50L,
76L, 86L, 65L, 84L, 40L, 39L, 26L), panel = c(1L, 2L, 3L, 4L,
1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L)), .Names = c(x,
panel), class = data.frame, row.names = c(NA, -16L))

#please start using dput -- I've asked this before

dlply(dats, panel, function(d) hist(d$x))

It's up to you to interpret the results.

Michael

On Tue, Apr 3, 2012 at 4:02 AM, arunkumar akpbond...@gmail.com wrote:
 Hi

 I've records like this
 df=
 x       panel
 4       1
 93      2
 21      3
 83      4
 75      1
 87      2
 87      3
 78      4
 50      1
 76      2
 86      3
 65      4
 84      1
 40      2
 39      3
 26      4


 i want to create histogram out of it . i want all the mid and count values
 for panel wise

 my code is
 histoutput = ddply(df,.(df[2]),hist)

 i'm not able to get the required result.

 please help me
 using for loop takes a lot of time if there are more records

 -
 Thanks in Advance
        Arun
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/help-in-ddply-tp4528064p4528064.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filling small gaps of N/A

2012-04-03 Thread R. Michael Weylandt
It seems like you could benefit from using a zoo [time series] object
to hold your data -- then you have a variety of NA filling functions
which work for arbitrarily long gaps. E.g.,

library(zoo)
x - zoo(1:100, Sys.Date() + 1:100)
x[2:60] - NA

# Most of these look the same because the data is simple: will give
different results for more complicated examples
na.approx(x)
na.locf(x)
na.spline(x)
na.aggregate(x)
na.fill # Takes more arguments

Hope this helps,
Michael

On Tue, Apr 3, 2012 at 4:52 AM, jeff6868
geoffrey_kl...@etu.u-bourgogne.fr wrote:
 Hi everybody,

 I'm a new R french user. Sorry if my english is not perfect. Hope you'll
 understand my problem ;)

 I have to work on temperature data (35000 lines in one file) containing some
 missing data (N/A). Sometimes I have only 2 or 3 N/A following each other,
 but I have also sometimes 100 or 200 N/A following each other. Here's an
 example of my data, when I have only small gaps of missing data (2 or 3
 N/A):

 09/01/2008 12:00   2   1.93   2.93   4.56   5.43
 09/01/2008 12:15   2   *3.93*   3.25   4.93   5.56
 09/01/2008 12:30   2    NA   3.5   5.06   5.56
 09/01/2008 12:45   2    NA   3.68 5.25   5.68
 09/01/2008 13:00   2   *4.93 *  3.87   5.56   5.93
 09/01/2008 13:15   2   5.93   4.25   5.75   6.06
 09/01/2008 13:30   2   3.93   4.56   5.93   6.18

 My question is: how can I replace these small gaps of N/A by numeric values?
 I would like a fonction which only replace the small gaps (2 or 3 N/A) in my
 data, but not the big gaps (more than 5 N/A following each other).

 For the moment, i'm trying to do it by working with the time gap between the
 2 numeric values surrounding the N/A as following:

 imputation - function(x){
    met = NULL

    temp - met[1] - x[1]

    ind_temp - 1

    tps - time(x)

    for (i in 2:(length(x)) ){
    if((tps[i]-tps[ind_temp]  1)(tps[i]-tps[ind_temp] =
 4)(is.na(x[i]))){
    met[i] - na.approx(x)
    }
    else {
    temp - met[i] - x[i]
    ind_temp - i
    }
    }

    return(met)
    }

 In this example, I would like to apply the function: na.approx(x) on my N/A,
 but only when I have maximum 4 N/A following each other.
 There's no error, but it doesn't work (it was working in the other way, when
 I had to detect aberrant data and replace it by N/A, but not now). It is
 maybe not the good way to solve this problem. I don't have a lot of
 experience in R. Maybe there is an easier way to do it...
 Does somebody have an idea about it for helping me?
 Thanks a lot!


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528184.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filling small gaps of N/A

2012-04-03 Thread R. Michael Weylandt
Sorry -- left out a major detail: most of these functions have maxgap
arguments which allow you to leave larger gaps of NAs as NAs.

Best,
Michael

On Tue, Apr 3, 2012 at 9:24 AM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 It seems like you could benefit from using a zoo [time series] object
 to hold your data -- then you have a variety of NA filling functions
 which work for arbitrarily long gaps. E.g.,

 library(zoo)
 x - zoo(1:100, Sys.Date() + 1:100)
 x[2:60] - NA

 # Most of these look the same because the data is simple: will give
 different results for more complicated examples
 na.approx(x)
 na.locf(x)
 na.spline(x)
 na.aggregate(x)
 na.fill # Takes more arguments

 Hope this helps,
 Michael

 On Tue, Apr 3, 2012 at 4:52 AM, jeff6868
 geoffrey_kl...@etu.u-bourgogne.fr wrote:
 Hi everybody,

 I'm a new R french user. Sorry if my english is not perfect. Hope you'll
 understand my problem ;)

 I have to work on temperature data (35000 lines in one file) containing some
 missing data (N/A). Sometimes I have only 2 or 3 N/A following each other,
 but I have also sometimes 100 or 200 N/A following each other. Here's an
 example of my data, when I have only small gaps of missing data (2 or 3
 N/A):

 09/01/2008 12:00   2   1.93   2.93   4.56   5.43
 09/01/2008 12:15   2   *3.93*   3.25   4.93   5.56
 09/01/2008 12:30   2    NA   3.5   5.06   5.56
 09/01/2008 12:45   2    NA   3.68 5.25   5.68
 09/01/2008 13:00   2   *4.93 *  3.87   5.56   5.93
 09/01/2008 13:15   2   5.93   4.25   5.75   6.06
 09/01/2008 13:30   2   3.93   4.56   5.93   6.18

 My question is: how can I replace these small gaps of N/A by numeric values?
 I would like a fonction which only replace the small gaps (2 or 3 N/A) in my
 data, but not the big gaps (more than 5 N/A following each other).

 For the moment, i'm trying to do it by working with the time gap between the
 2 numeric values surrounding the N/A as following:

 imputation - function(x){
    met = NULL

    temp - met[1] - x[1]

    ind_temp - 1

    tps - time(x)

    for (i in 2:(length(x)) ){
    if((tps[i]-tps[ind_temp]  1)(tps[i]-tps[ind_temp] =
 4)(is.na(x[i]))){
    met[i] - na.approx(x)
    }
    else {
    temp - met[i] - x[i]
    ind_temp - i
    }
    }

    return(met)
    }

 In this example, I would like to apply the function: na.approx(x) on my N/A,
 but only when I have maximum 4 N/A following each other.
 There's no error, but it doesn't work (it was working in the other way, when
 I had to detect aberrant data and replace it by N/A, but not now). It is
 maybe not the good way to solve this problem. I don't have a lot of
 experience in R. Maybe there is an easier way to do it...
 Does somebody have an idea about it for helping me?
 Thanks a lot!


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528184.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread K. Elo

Hi!

Maybe not the most elegant solution, but works:

for(i in seq(1,length(data)-(length(data) %% 3), 3)) { 
ifelse((length(data)-i)3, { print(sort(data)[ c(i:(i+2)) ]); 
print(mean(sort(data)[ c(i:(i+2)) ])) }, { print(sort(data)[ 
c(i:length(data)) ]); print(mean(sort(data)[ c(i:length(data)) ])) } ) }


Produces:

[1] 36 45 46
[1] 42.3
[1]  66  78 125
[1] 89.7
[1] 193 209 242 297
[1] 235.25

HTH,
Kimmo

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread Val
Thank you all (David, Michael, Giovanni)  for your prompt response.

First there was a typo error for the group mean it was 89.6 not 87.

For a small data set and few groupings I can use  prob=c(0, .333, .66 ,1)
to group in to three groups in this case. However,  if I want to extend the
number of groupings say 10 or 15 then do I have to figure it out the
  split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))

Is there a short cut for that?


Thanks











On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt 
michael.weyla...@gmail.com wrote:

 Ignoring the fact your desired answers are wrong, I'd split the
 separating part and the group means parts into three steps:

 i) quantile() can help you get the split points,
 ii)  findInterval() can assign each y to a group
 iii) then ave() or tapply() will do group-wise means

 Something like:

 y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here.
 ave(y, findInterval(y, quantile(y, c(0.33, 0.66
 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)

 You could also use cut2 from the Hmisc package to combine findInterval
 and quantile into a single step.

 Depending on your desired output.

 Hope that helps,
 Michael

 On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote:
  Hi all,
 
  Assume that I have the following 10 data points.
   x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
 
  sort x  and get the following
   y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)
 
  I want to  group the sorted  data point (y)  into  equal number of
  observation per group. In this case there will be three groups.  The
 first
  two groups  will have three observation  and the third will have four
  observations
 
  group 1  = 34, 45, 46
  group 2  = 66, 78, 125
  group 3  = 193, 209, 242,297
 
  Finally I want to calculate the group mean
 
  group 1  =  42
  group 2  =  87
  group 3  =  234
 
  Can anyone help me out?
 
  In SAS I used to do it using proc rank.
 
  thanks in advance
 
  Val
 
 [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread R. Michael Weylandt
Use cut2 as I suggested and David demonstrated.

Michael

On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote:
 Thank you all (David, Michael, Giovanni)  for your prompt response.

 First there was a typo error for the group mean it was 89.6 not 87.

 For a small data set and few groupings I can use  prob=c(0, .333, .66 ,1) to
 group in to three groups in this case. However,  if I want to extend the
 number of groupings say 10 or 15 then do I have to figure it out the
   split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))

 Is there a short cut for that?


 Thanks











 On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
 michael.weyla...@gmail.com wrote:

 Ignoring the fact your desired answers are wrong, I'd split the
 separating part and the group means parts into three steps:

 i) quantile() can help you get the split points,
 ii)  findInterval() can assign each y to a group
 iii) then ave() or tapply() will do group-wise means

 Something like:

 y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c here.
 ave(y, findInterval(y, quantile(y, c(0.33, 0.66
 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)

 You could also use cut2 from the Hmisc package to combine findInterval
 and quantile into a single step.

 Depending on your desired output.

 Hope that helps,
 Michael

 On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote:
  Hi all,
 
  Assume that I have the following 10 data points.
   x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
 
  sort x  and get the following
   y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)
 
  I want to  group the sorted  data point (y)  into  equal number of
  observation per group. In this case there will be three groups.  The
  first
  two groups  will have three observation  and the third will have four
  observations
 
  group 1  = 34, 45, 46
  group 2  = 66, 78, 125
  group 3  = 193, 209, 242,297
 
  Finally I want to calculate the group mean
 
  group 1  =  42
  group 2  =  87
  group 3  =  234
 
  Can anyone help me out?
 
  In SAS I used to do it using proc rank.
 
  thanks in advance
 
  Val
 
         [[alternative HTML version deleted]]

 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?

2012-04-03 Thread Philip Rhoades

Brian,


On 2012-04-03 18:58, Prof Brian Ripley wrote:

On Tue, 3 Apr 2012, peter dalgaard wrote:



On Apr 3, 2012, at 07:19 , Philip Rhoades wrote:


People,

On my home computer with a real plot I get what I expect - open 
circles as the plot character - on my work computer I get qs ! So I 
tried a trivial test plot on the work computer ie:




Argh. I have forgotten this before... There was a configuration 
issue leading to plotting symbols showing up as q, but I can't 
recall the details. Possibly libpango-related (do you have it 
installed?)



At home, yes:

pango-1.29.4-1.fc16.x86_64
pangomm-2.28.3-1.fc16.x86_64

(I will need to check work).



We were not told the device   This is well-known for pdf() and
poppler-based PDF viewers so you could try the workaround in ?pdf.



Yes, I was outputting to pdf in all cases . . do you mean this 
workaround?:


useDingbats
logical. Should small circles be rendered via the Dingbats font? 
Defaults to TRUE, which produces smaller and better output. Setting this 
to FALSE can work around font display problems in broken PDF viewers.




I don't really see how this could happen on the default X11 device,
as it draws circles rather than uses fonts.



Ah . . when I was testing on the problem work machine and doing the 
trivial test . . I was outputting to the screen . . so if I repeat the 
trivial test on the work machine (I am at home now) to PDF I should get 
the same problem as I do on the real script?


Thanks,

Phil.
--
Philip Rhoades

GPO Box 3411
Sydney NSW  2001
Australia
E-mail:  p...@pricom.com.au

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A trivial plot gives open circles as the plot char but another one gives me qs . . why is this?

2012-04-03 Thread Philip Rhoades

David,


On 2012-04-03 22:29, David Winsemius wrote:

On Apr 3, 2012, at 1:19 AM, Philip Rhoades wrote:


People,

On my home computer with a real plot I get what I expect - open  
circles as the plot character - on my work computer I get qs !  So  
I tried a trivial test plot on the work computer ie:


a - c(1,2,3,4,5)
b - c(1,2,3,4,5)
plot(a,b)

and that also gives me open circles as it should! - the plot line in 
my real script is:


plot( means5[ ,4 ], means6[ ,4 ], xlab=640loci-50reps Gen4(-1)  
Rst, ylab=ADf )


I have tried adding the pch=x command with no effect . .


Did you actually type pch=x  ?

Not pch=x ?



Sorry, I meant x as in 1,2,3, . . . 18

Thanks,

Phil.
--
Philip Rhoades

GPO Box 3411
Sydney NSW  2001
Australia
E-mail:  p...@pricom.com.au

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread Petr Savicky
On Tue, Apr 03, 2012 at 09:31:29AM -0400, Val wrote:
 Thank you all (David, Michael, Giovanni)  for your prompt response.
 
 First there was a typo error for the group mean it was 89.6 not 87.
 
 For a small data set and few groupings I can use  prob=c(0, .333, .66 ,1)
 to group in to three groups in this case. However,  if I want to extend the
 number of groupings say 10 or 15 then do I have to figure it out the
   split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))
 
 Is there a short cut for that?

Hi.

There may be better ways for the whole task, but specifically
c(0, .333, .66 ,1) can be obtained as

  seq(0, 1, length=3+1)

Hope this helps.

Petr Savicky.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Histogram classwise

2012-04-03 Thread Sarah Goslee
I'm not entirely sure what you mean, but maybe split() and lapply() would help?

Sarah

On Tue, Apr 3, 2012 at 8:20 AM, arunkumar akpbond...@gmail.com wrote:
 Hi
 I have a data class wise. I want to create a histogram class wise without
 using for loop as it takes a long time
 my data looks like this

 x       class
 27      1
 93      3
 65      5
 1       2
 69      5
 2       1
 92      4
 49      5
 55      4
 46      1
 51      3
 100     4




 -
 Thanks in Advance
        Arun
 --

-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package seems to be present but library don't find it

2012-04-03 Thread Marc Girondot

In case someone has the competence to check, the file is here:

setwd(~)

download.file(http://www.ese.u-psud.fr/epc/conservation/r-scripts/HelloWorld_1.0.tar.gz;, 
HelloWorld_1.0.tar.gz)


install.packages(HelloWorld_1.0.tar.gz, repos = NULL)

Thanks a lot

Marc


Le 03/04/12 10:55, Marc Girondot a écrit :

Hi,

I try to make my first package? The HelloWorld.R file is:
 HelloWorld.R 
#' showHello est une fonction R permettant d'afficher le message
#' Hello World! sur la console.
#' @title la fonction showHello()

showHello -function(){
cat(Hello World!\n)
}

I use the following procedure to get the tar:

# set the working directory where the file is located
 setwd(...)

 package.skeleton(HelloWorld,code_files=c(HelloWorld.R))

# to generate .rd files
 library(roxygen2)
 roxygenize(HelloWorld,copy.package=FALSE)

 system(R CMD build '/Users/marcgirondot/Documents/Espace de travail 
R/Phenology/Source fit/Essai_package/HelloWorld')
* checking for file ‘/Users/marcgirondot/Documents/Espace de travail 
R/Phenology/Source fit/Essai_package/HelloWorld/DESCRIPTION’ ... OK

* preparing ‘HelloWorld’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
Removed empty directory ‘HelloWorld/inst’
* building ‘HelloWorld_1.0.tar.gz’


 install.packages(/Users/marcgirondot/Documents/Espace\ de\ travail\ 
R/Phenology/Source\ fit/Essai_package/HelloWorld_1.0.tar.gz, repos = 
NULL)
Installing package(s) into 
‘/Library/Frameworks/R.framework/Versions/2.14/Resources/library’

(as ‘lib’ is unspecified)

 library(HelloWorld)
Erreur dans library(HelloWorld) :
‘HelloWorld’ n'est pas un nom correct de package installé

Whereas the Helloworld folder is available in the library folder with 
other packages
/Library/Frameworks/R.framework/Versions/2.14/Resources/library/HelloWorld 







--
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread David Winsemius


On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote:


Use cut2 as I suggested and David demonstrated.


Agree that Hmisc::cut2 is extremely handy and I also like that fact  
that the closed ends of intervals are on the left side (which is not  
the same behavior as cut()), which has the otehr effect of setting  
include.lowest = TRUE which is not the default for cut() either (to my  
continued amazement).


But let me add the method I use when doing it by hand:

cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)), include.lowest=TRUE)

--
David.




Michael

On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote:

Thank you all (David, Michael, Giovanni)  for your prompt response.

First there was a typo error for the group mean it was 89.6 not 87.

For a small data set and few groupings I can use  prob=c(0, .333, . 
66 ,1) to
group in to three groups in this case. However,  if I want to  
extend the

number of groupings say 10 or 15 then do I have to figure it out the
  split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))

Is there a short cut for that?


Thanks











On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:


Ignoring the fact your desired answers are wrong, I'd split the
separating part and the group means parts into three steps:

i) quantile() can help you get the split points,
ii)  findInterval() can assign each y to a group
iii) then ave() or tapply() will do group-wise means

Something like:

y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a  
c here.

ave(y, findInterval(y, quantile(y, c(0.33, 0.66
tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)

You could also use cut2 from the Hmisc package to combine  
findInterval

and quantile into a single step.

Depending on your desired output.

Hope that helps,
Michael

On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote:

Hi all,

Assume that I have the following 10 data points.
 x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)

sort x  and get the following
 y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)

I want to  group the sorted  data point (y)  into  equal number of
observation per group. In this case there will be three groups.   
The

first
two groups  will have three observation  and the third will have  
four

observations

group 1  = 34, 45, 46
group 2  = 66, 78, 125
group 3  = 193, 209, 242,297

Finally I want to calculate the group mean

group 1  =  42
group 2  =  87
group 3  =  234

Can anyone help me out?

In SAS I used to do it using proc rank.

thanks in advance

Val

   [[alternative HTML version deleted]]




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread David L Carlson
Or just replace c(0, .333, .667, 1) with 

n - 10
split(x, cut(x, quantile(x, prob= c(0, 1:(n-1)/n, 1)), include.lowest=TRUE))

where n is the number of groups you want.

--
David L Carlson
Associate Professor of Anthropology
Texas AM University
College Station, TX 77843-4352



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of R. Michael Weylandt
Sent: Tuesday, April 03, 2012 8:32 AM
To: Val
Cc: r-help@r-project.org
Subject: Re: [R] grouping

Use cut2 as I suggested and David demonstrated.

Michael

On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote:
 Thank you all (David, Michael, Giovanni)  for your prompt response.

 First there was a typo error for the group mean it was 89.6 not 87.

 For a small data set and few groupings I can use  prob=c(0, .333, .66 ,1)
to
 group in to three groups in this case. However,  if I want to extend the
 number of groupings say 10 or 15 then do I have to figure it out the
   split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))

 Is there a short cut for that?


 Thanks











 On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
 michael.weyla...@gmail.com wrote:

 Ignoring the fact your desired answers are wrong, I'd split the
 separating part and the group means parts into three steps:

 i) quantile() can help you get the split points,
 ii)  findInterval() can assign each y to a group
 iii) then ave() or tapply() will do group-wise means

 Something like:

 y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c
here.
 ave(y, findInterval(y, quantile(y, c(0.33, 0.66
 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)

 You could also use cut2 from the Hmisc package to combine findInterval
 and quantile into a single step.

 Depending on your desired output.

 Hope that helps,
 Michael

 On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote:
  Hi all,
 
  Assume that I have the following 10 data points.
   x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
 
  sort x  and get the following
   y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)
 
  I want to  group the sorted  data point (y)  into  equal number of
  observation per group. In this case there will be three groups.  The
  first
  two groups  will have three observation  and the third will have four
  observations
 
  group 1  = 34, 45, 46
  group 2  = 66, 78, 125
  group 3  = 193, 209, 242,297
 
  Finally I want to calculate the group mean
 
  group 1  =  42
  group 2  =  87
  group 3  =  234
 
  Can anyone help me out?
 
  In SAS I used to do it using proc rank.
 
  thanks in advance
 
  Val
 
         [[alternative HTML version deleted]]

 
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] regression for poisson distributed data

2012-04-03 Thread Joachim Audenaert
Hello all,

I would like to get parameter estimates for different models. For one of 
them I give the code in example. I am estimating the parameters (i,j and 
k) with the nls function, which sees the error distribution as normal, I 
would however like to do the same as nls with the assumption that the 
errors are poisson distributed.

Is there a way to do this with R? Are there packages designed for this? I 
tried with the gnm package, but don't understand how to transform my 
equation to a generalised equation. Is there an option for nls to choose 
family = poisson? 

Lower in the mail the code with the model and visualisations I use to 
check my results. I also copied the test dataset from my txt file.

I am using R 2.15 and Rstudio to visualise it.


plot(FR~N0)
x - 
nls(FR~(exp(i+j*N0)/(1+exp(i+j*N0)))*(k*N0/(k+N0)),start=list(i=0.02,j=0.002,k=6))
summary(x)
hatx - predict(x)
lines(spline(N0,hatx))

N0  FR
10  2
10  3
10  2
10  4
10  2
10  2
10  1
10  2
10  2
10  2
20  2 
20  3
20  3
20  3
20  4
20  2
20  4
20  2
20  3
20  2
30  1
30  2
30  3
30  4
30  5
30  6
30  2
30  3
30  2
30  2
40  2
40  3
40  3
40  6
40  5
40  4
40  3
40  3
40  2
40  3
50  4
50  5
50  2
50  3
50  7
50  5
50  4
50  3
50  4
50  5
60  5
60  6
60  8
60  4
60  4
60  3
60  2
60  2
60  5
60  4

Kind Regards
Joachim

Don't waste paper! Think about the environment before printing this e-mail

__

Joachim Audenaert
Adviesdienst Gewasbescherming
Proefcentrum voor Sierteelt
Schaessestraat 18
B-9070 Destelbergen
Tel. +32 9 353 94 71
Fax +32 9 353 94 95
E-mail: joachim.audena...@pcsierteelt.be
www.pcsierteelt.be
__
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package seems to be present but library don't find it

2012-04-03 Thread Duncan Murdoch

On 03/04/2012 9:50 AM, Marc Girondot wrote:

In case someone has the competence to check, the file is here:

setwd(~)

download.file(http://www.ese.u-psud.fr/epc/conservation/r-scripts/HelloWorld_1.0.tar.gz;,
HelloWorld_1.0.tar.gz)

install.packages(HelloWorld_1.0.tar.gz, repos = NULL)



The problem is that you try to export the name HelloWorld in your 
NAMESPACE file, but you don't have an object of that name.  You should 
export showHello instead.


Not sure why you didn't see the error message I got:

$ R CMD INSTALL HelloWorld_1.0.tar.gz
* installing to library 'F:/cygwin/home/murdoch/R/win-library/2.15'
* installing *source* package 'HelloWorld' ...
** R
** preparing package for lazy loading
** help
Warning: 
C:/temp/RtmpimSWuh/R.INSTALLe801c296e9/HelloWorld/man/HelloWorld-packag

e.Rd:32: All text must be in a section
Warning: 
C:/temp/RtmpimSWuh/R.INSTALLe801c296e9/HelloWorld/man/HelloWorld-packag

e.Rd:33: All text must be in a section
*** installing help indices
** building package indices
** testing if installed package can be loaded
Error in namespaceExport(ns, exports) : undefined exports: HelloWorld
Error: loading failed
Execution halted
ERROR: loading failed
* removing 'F:/cygwin/home/murdoch/R/win-library/2.15/HelloWorld'

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] When lack of data is data and not n/a

2012-04-03 Thread Wayne Gray
Greetings.

Here is a problem I don't know how to handle, even by brute force. We have an 
800k line data file that includes eye fixations for subjects in a 3 x 2 
factorial design. There are several screen locations where information is 
available while Ss do their task. These locations vary by condition so there is 
no reason for people in some conditions (i.e., the 3-factor one) to look at 
some locations. So I am analyzing these conditions, two pairs at a time (i.e., 
the 2-factor one).

So the problem: My ANOVA of these data had messed up the within and between 
factors the way aov does when there are different numbers of Ss contributing 
data to some of the variables than for others. A bit of sleuthing with plyr 
revealed that one of our Ss in one of our conditions never looked at one of our 
locations. 

Given the nature of the DV, zero is a fine number. Although a little unexpected 
in this condition, it is reasonable and cannot be ignored. However, R 
recognizes, rightly, that lack of data is not the same as zero. 

I guess I could subset and aggregate the dataset to pull out a data.frame 
that contains the data aggregated at the right level for this aov and then add 
one record for this one Ss (well - actually I would add 8 records as block is 
a within-Ss factor that we have been looking at). But there must be a more 
elegant way of doing this especially as we are still in the exploratory phase 
and will be pulling out factors such as mean dwell time, total dwell time 
(dwell time per fix, summed over all fix) and other factors (e.g., tallies of 
sequential fixation chains -- e.g., obj-A, obj-B, vs obj-A, obj-C, etc, etc, 
etc). 

As always, your thoughts, comments, and code would be appreciated.

Thanks,

Wayne Gray

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread Val
David W and all,

Thank you very much for your help.

Here is the final output that I want in the form of data frame. The data
frame should contain  x, group and group_ mean in the following way

x   group   group mean
46   142.3
125 289.6
36   142.3
193 3235.25
209 3235.25
78   289.6
66   289.6
242 3235.25
297 3235.25
45   142.3

Thanks a lot








On Tue, Apr 3, 2012 at 9:51 AM, David Winsemius dwinsem...@comcast.netwrote:


 On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote:

  Use cut2 as I suggested and David demonstrated.


 Agree that Hmisc::cut2 is extremely handy and I also like that fact that
 the closed ends of intervals are on the left side (which is not the same
 behavior as cut()), which has the otehr effect of setting include.lowest =
 TRUE which is not the default for cut() either (to my continued amazement).

 But let me add the method I use when doing it by hand:

 cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)), include.lowest=TRUE)

 --
 David.




 Michael

 On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote:

 Thank you all (David, Michael, Giovanni)  for your prompt response.

 First there was a typo error for the group mean it was 89.6 not 87.

 For a small data set and few groupings I can use  prob=c(0, .333, .66
 ,1) to
 group in to three groups in this case. However,  if I want to extend the
 number of groupings say 10 or 15 then do I have to figure it out the
  split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))

 Is there a short cut for that?


 Thanks











 On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
 michael.weyla...@gmail.com wrote:


 Ignoring the fact your desired answers are wrong, I'd split the
 separating part and the group means parts into three steps:

 i) quantile() can help you get the split points,
 ii)  findInterval() can assign each y to a group
 iii) then ave() or tapply() will do group-wise means

 Something like:

 y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c
 here.
 ave(y, findInterval(y, quantile(y, c(0.33, 0.66
 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)

 You could also use cut2 from the Hmisc package to combine findInterval
 and quantile into a single step.

 Depending on your desired output.

 Hope that helps,
 Michael

 On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote:

 Hi all,

 Assume that I have the following 10 data points.
  x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)

 sort x  and get the following
  y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)

 I want to  group the sorted  data point (y)  into  equal number of
 observation per group. In this case there will be three groups.  The
 first
 two groups  will have three observation  and the third will have four
 observations

 group 1  = 34, 45, 46
 group 2  = 66, 78, 125
 group 3  = 193, 209, 242,297

 Finally I want to calculate the group mean

 group 1  =  42
 group 2  =  87
 group 3  =  234

 Can anyone help me out?

 In SAS I used to do it using proc rank.

 thanks in advance

 Val

   [[alternative HTML version deleted]]



 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/**posting-guide.htmlhttp://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 David Winsemius, MD
 West Hartford, CT



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread David Winsemius

On Apr 3, 2012, at 10:11 AM, Val wrote:

 David W and all,

 Thank you very much for your help.

 Here is the final output that I want in the form of data frame. The  
 data frame should contain  x, group and group_ mean in the following  
 way

 x   group   group mean
 46   142.3
 125 289.6
 36   142.3
 193 3235.25
 209 3235.25
 78   289.6
 66   289.6
 242 3235.25
 297 3235.25
 45   142.3

I you want group means in a vector the same length as x then instead  
of using tapply as done in earlier solutions you should use `ave`.

-- 
DW



 Thanks a lot








 On Tue, Apr 3, 2012 at 9:51 AM, David Winsemius dwinsem...@comcast.net 
  wrote:

 On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote:

 Use cut2 as I suggested and David demonstrated.

 Agree that Hmisc::cut2 is extremely handy and I also like that fact  
 that the closed ends of intervals are on the left side (which is not  
 the same behavior as cut()), which has the otehr effect of setting  
 include.lowest = TRUE which is not the default for cut() either (to  
 my continued amazement).

 But let me add the method I use when doing it by hand:

 cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)),  
 include.lowest=TRUE)

 -- 
 David.




 Michael

 On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote:
 Thank you all (David, Michael, Giovanni)  for your prompt response.

 First there was a typo error for the group mean it was 89.6 not 87.

 For a small data set and few groupings I can use  prob=c(0, .333, . 
 66 ,1) to
 group in to three groups in this case. However,  if I want to extend  
 the
 number of groupings say 10 or 15 then do I have to figure it out the
  split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))

 Is there a short cut for that?


 Thanks











 On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
 michael.weyla...@gmail.com wrote:

 Ignoring the fact your desired answers are wrong, I'd split the
 separating part and the group means parts into three steps:

 i) quantile() can help you get the split points,
 ii)  findInterval() can assign each y to a group
 iii) then ave() or tapply() will do group-wise means

 Something like:

 y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c  
 here.
 ave(y, findInterval(y, quantile(y, c(0.33, 0.66
 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)

 You could also use cut2 from the Hmisc package to combine findInterval
 and quantile into a single step.

 Depending on your desired output.

 Hope that helps,
 Michael

 On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote:
 Hi all,

 Assume that I have the following 10 data points.
  x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)

 sort x  and get the following
  y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)

 I want to  group the sorted  data point (y)  into  equal number of
 observation per group. In this case there will be three groups.  The
 first
 two groups  will have three observation  and the third will have four
 observations

 group 1  = 34, 45, 46
 group 2  = 66, 78, 125
 group 3  = 193, 209, 242,297

 Finally I want to calculate the group mean

 group 1  =  42
 group 2  =  87
 group 3  =  234

 Can anyone help me out?

 In SAS I used to do it using proc rank.

 thanks in advance

 Val

   [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 David Winsemius, MD
 West Hartford, CT



David Winsemius, MD
West Hartford, CT


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] regression for poisson distributed data

2012-04-03 Thread Gabor Grothendieck
On Tue, Apr 3, 2012 at 9:58 AM, Joachim Audenaert
joachim.audena...@pcsierteelt.be wrote:
 Hello all,

 I would like to get parameter estimates for different models. For one of
 them I give the code in example. I am estimating the parameters (i,j and
 k) with the nls function, which sees the error distribution as normal, I
 would however like to do the same as nls with the assumption that the
 errors are poisson distributed.

 Is there a way to do this with R? Are there packages designed for this? I
 tried with the gnm package, but don't understand how to transform my
 equation to a generalised equation. Is there an option for nls to choose
 family = poisson?

 Lower in the mail the code with the model and visualisations I use to
 check my results. I also copied the test dataset from my txt file.

 I am using R 2.15 and Rstudio to visualise it.


 plot(FR~N0)
 x -
 nls(FR~(exp(i+j*N0)/(1+exp(i+j*N0)))*(k*N0/(k+N0)),start=list(i=0.02,j=0.002,k=6))
 summary(x)
 hatx - predict(x)
 lines(spline(N0,hatx))


Just minimize the negative log likelihood from first principles.
Define the mean function and then define the negative log likelihood
in terms of that:

Mean - function(p) { i - p[1]; j - p[2]; k - p[3];
exp(i+j*N0)/(1+exp(i+j*N0))*(k*N0/(k+N0)) }
ll - function(p) - sum(dpois(FR, Mean(p), log = TRUE))
out - optim(c(1, 1, 1), ll); out

plot(FR ~ N0, pch = 20)
lines(Mean(out$par) ~ N0)

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] identify with mfcol=c(1,2)

2012-04-03 Thread John Sorkin
I would like to have a figure with two graphs. This is easily accomplished 
using mfcol:

oldpar - par(mfcol=c(1,2))
plot(x,y)
plot(z,x)
par(oldpar) 

I run into trouble if I try to use identify with the two plots. If, after 
identifying points on my first graph I hit the ESC key, or hitting stop menu 
bar of my R session, the system stops the identification process, but fails to 
give me my second graph. Is there a way to allow for the identification of 
points when one is plotting to graphs in a single graph window? My code follows.

plotter - function(first,second) {
  # Allow for two plots in on graph window.
  oldpar-par(mfcol=c(1,2))
  
  #Bland-Altman plot.
  plot((second+first)/2,second-first)
  abline(0,0)
  # Allow for indentification of extreme values.
  BAzap-identify((second+first)/2,second-first,labels = seq_along(data$Line))
  print(BAzap)

  # Plot second as a function of first value.
  plot(first,second,main=Limin vs. Limin,xlab=First (cm^2),ylab=Second 
(cm^3))
  # Add identity line.
  abline(0,1,lty=2,col=red)
  # Allow for identification of extreme values.
  zap-identify(first,second,labels = seq_along(data$Line))
  print(zap)
  # Add regression line.
  fit1-lm(first~second)
  print(summary(fit1))
  abline(fit1)
  print(summary(fit1)$sigma)

  # reset par to default values.  
  par(oldpar)

}
plotter(first,second)


Thanks,
John






John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] meaning of sigma from LM, is it the same as RMSE

2012-04-03 Thread John Sorkin
Is the sigma from a lm, i.e.

fit1 - lm(y~x)
summary(fit1)
summary(fit1)$sigma

the RMSE (root mean square error)

Thanks,
John

John David Sorkin M.D., Ph.D.
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)

Confidentiality Statement:
This email message, including any attachments, is for th...{{dropped:6}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] indexing in a function doesn't work?

2012-04-03 Thread Joshua Wiley
Hi Benjamin,

You seem to have the right basic ideas, but a lot of your code had
typos and some logic flaws that I guess came from trying to move from
just code to in a function.  I attached the changes I made.  What I
would strongly encourage you to do, is work through each of the little
functions I made and:

1) make sure you understand what it is doing
2) make sure each small function works properly---this means creating
a *variety* of plausible test cases and trying them out
3) once all of the pieces work, then try to wrap them up in your
overall plotter()

The original function you had was large and had many errors, but it is
difficult to debug something when there can be errors coming from
multiple places.  Easier is to break your work into small, tractable
chunks.  Plan in advance what your final goal is, and how each piece
will fit in to that.  Then write each piece and ensure that it works.
From this point, you will have a much easier time bundling all the
pieces together (even still, there may be additional work, but it will
be more doable because you can be reasonably certain all your code
works, it just does not quite work together.

A few functions I used may be new to you, so I would also suggest
reading the documentation (I know this can be tedious, but it is
valuable)

?match.arg
?switch
?on.exit

note that ... are arguments passed down from the final function
plotter to internal ones.  They must be named if they are passed to


You are off to a good start, and I think with some more work, you can
get this going fine.  Long run you may have less headaches and stress
if you take more time at the beginning to write clean code.

I hope this helps,

Josh

On Sun, Apr 1, 2012 at 4:34 PM, Benjamin Caldwell
btcaldw...@berkeley.edu wrote:
 Josh,

 Many thanks - here's a subset of the data and a couple examples:

 plotter(10,3,fram=rwb,framvec=rwb$prcnt.char.depth,obj=prcnt.char.depth,form1=
 post.f.crwn.length~shigo.av,form2=post.f.crwn.length~shigo.av-1,
 form3=leaf.area~(1/exp(shigo.av*x))*n,type=2,xlm=70,ylm=35)

 plotter(10,3,fram=rwb, framvec=rwb$prcnt.char.depth, obj=prcnt.char.depth,
 form1= post.f.crwn.length~leaf.area, form2=post.f.crwn.length~leaf.area-1,
 form3=leaf.area~(1/exp(shigo.av*x))*n,type=1, xlm=1500, ylm=35,
 sx=.01,sn=25)




 plotter-function(a,b,fram,framvec,obj,form1,form2,form3, type=1, xlm, ylm,
 sx=.01,sn=25){
 g-ceiling(a/b)
 par(mfrow=c(b,g))
 num-rep(0,a)
 sub.plotter-function(i,fram,framvec,obj,form1,form2,form3,type,
 xlm,ylm,var1,var2){
 temp.i-fram[framvec =(i*.10),] #trees in the list that have an attribute
 less than or equal to a progressively larger percentage
 plot(form1, data=temp.i, xlim=c(0,xlm), ylim=c(0,ylm), main=((i-1)*.10))
 if(type==1){
 mod-lm(form2,data=temp.i)
 r2-summary(mod)$adj.r.squared
 num[i]-r2
 legend(bottomright, legend=signif(r2), col=black)
 abline(mod)
 num}
 else{
 if(type==2){
 try(mod-nls(form3, data=temp.i, start=list(x=sx,n=sn),
 na.action=na.omit), silent=TRUE)
 try(x1-summary(mod)$coefficients[1,1], silent=TRUE)
 try(n1-summary(mod)$coefficients[2,1], silent=TRUE)
 try(lines((1/exp(c(0:70)*x1)*n1)), silent=TRUE)
 try(num[i]-AIC(mod), silent=TRUE)
 try(legend(bottomright, legend=round(num[i],3) , col=black),
 silent=TRUE)
 try((num), silent=TRUE)
   }
 }}
 for(i in 0:a+1){
  num-sub.plotter(i,fram,framvec,obj,form1,form2,form3,type,xlm,ylm)
 }
 plot.cor-function(x){
 temp-a+1
 lengthx-c(1:temp)
 plot(x~c(1:temp))
 m2-lm(x~c(1:temp))
 abline(m2)
 n-summary(m2)$adj.r.squared
 legend(bottomright, legend=signif(n), col=black)
 slope-(coef(m2)[2])# slope
 values-(num)#values for aic or adj r2
 r2ofr2-(n) #r2 of r2 or AIC
 output-data.frame(lengthx,slope,values,r2ofr2)
 }
 plot.cor(num)
 write.csv(plot.cor(num)$output,output.csv) # can't seem to use
 paste(substitute(form3),.csv,sep=) to name it at the moment
 par(mfrow=c(1,1))
 }




 On Sun, Apr 1, 2012 at 3:25 PM, Joshua Wiley jwiley.ps...@gmail.com wrote:

 Hi,

 Glancing through your code it was not immediately obvious to me why it
 does not work, but I can see a lot of things that could be simplified.
  It would really help if you could give us a reproducible example.
 Find/upload/create (in R) some data, and examples of how you would use
 the function.  Right now, I can only guess what your data etc. are
 like and based on your description plus what the code you wrote seems
 to expect to be given.  I could try to give code suggestions, but I
 have no easy way of testing them so it would be very easy to make
 typos, etc.  Then you just get back my edits to your code that still
 do not work and maybe it is because of something fundamentally wrong
 with what I have done, a simple typo, or something else still wrong in
 your code that I did not fix.

 Anyway, if you send some data and an example using your function
 (i.e., using the data you send, write our form1, form2, type, etc.), I
 will take a look at your function and see if I can make it run.

 Cheers,

 Josh

 On Sun, 

[R] How to change color in R2HTML?

2012-04-03 Thread Michael
Hi all,

I just wanted to highlight something unusual in the output from R.

Lets say we expect most results to be positive and we would like to flag
the negative outputs as unusual and hight them in bold red or yellow in
the R2HTML output...

How to do that?

I am not sophisticated enough for handling CSS stuff... and just wanted to
get something done quick...

Could you please help me?

Thanks a lot!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] hclust and plot functions work, cutree does not

2012-04-03 Thread Peter Langfelder
On Tue, Apr 3, 2012 at 2:41 AM, vinod1 vinod.hegd...@gmail.com wrote:
 Sarah,

 .       clust_tree=hclust(as.dist(x),method=complete)
 .       plot(clust_tree)

 this produces a dendrogram, whereas
 .       clust_tree=hclust(as.dist(x),method=complete)
 .        cut = cutree(clust_tree,k=1:5)
 .        plot(cut)

 produces a plot with 2 dots. The dissimilarity matrix x is 100*100 matrix.

What kind of output do you expect?

If you want to see the clustering tree and an indication of which
object belongs to which cluster, install package WGCNA and use the
function plotDendroAndColors. Continuing with Sarah's example, you can
use

hc - hclust(dist(USArrests))
labels =  cutree(hc, k=1:5)

require(WGCNA)
plotDendroAndColors(hc, labels)

You get the clustering tree plus a color indication of the cluster
each object belongs to in each cut.

You can also produce MDS plots colored by the cluster label.

If you'd like to experiment with more involved ways of identifying
branches (or subtrees) in the dendrogram, I can recommend the article
(warning, shameless plug)

Langfelder P, Zhang B, Horvath S (2007) Defining clusters from a
hierarchical cluster tree: the Dynamic Tree Cut package for R.
Bioinformatics 2008 24(5):719-720

and the package dynamicTreeCut that the short article describes. You
can see some example code at

http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/BranchCutting/

We use this approach in large gene expression data sets.

HTH,

Peter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] regression for poisson distributed data

2012-04-03 Thread Kjetil Halvorsen
¿Just write down the loglikelihood function and send it to optim?

Kjetil

On Tue, Apr 3, 2012 at 8:58 AM, Joachim Audenaert
joachim.audena...@pcsierteelt.be wrote:
 Hello all,

 I would like to get parameter estimates for different models. For one of
 them I give the code in example. I am estimating the parameters (i,j and
 k) with the nls function, which sees the error distribution as normal, I
 would however like to do the same as nls with the assumption that the
 errors are poisson distributed.

 Is there a way to do this with R? Are there packages designed for this? I
 tried with the gnm package, but don't understand how to transform my
 equation to a generalised equation. Is there an option for nls to choose
 family = poisson?

 Lower in the mail the code with the model and visualisations I use to
 check my results. I also copied the test dataset from my txt file.

 I am using R 2.15 and Rstudio to visualise it.


 plot(FR~N0)
 x -
 nls(FR~(exp(i+j*N0)/(1+exp(i+j*N0)))*(k*N0/(k+N0)),start=list(i=0.02,j=0.002,k=6))
 summary(x)
 hatx - predict(x)
 lines(spline(N0,hatx))

 N0      FR
 10      2
 10      3
 10      2
 10      4
 10      2
 10      2
 10      1
 10      2
 10      2
 10      2
 20      2
 20      3
 20      3
 20      3
 20      4
 20      2
 20      4
 20      2
 20      3
 20      2
 30      1
 30      2
 30      3
 30      4
 30      5
 30      6
 30      2
 30      3
 30      2
 30      2
 40      2
 40      3
 40      3
 40      6
 40      5
 40      4
 40      3
 40      3
 40      2
 40      3
 50      4
 50      5
 50      2
 50      3
 50      7
 50      5
 50      4
 50      3
 50      4
 50      5
 60      5
 60      6
 60      8
 60      4
 60      4
 60      3
 60      2
 60      2
 60      5
 60      4

 Kind Regards
 Joachim

 Don't waste paper! Think about the environment before printing this e-mail

 __

 Joachim Audenaert
 Adviesdienst Gewasbescherming
 Proefcentrum voor Sierteelt
 Schaessestraat 18
 B-9070 Destelbergen
 Tel. +32 9 353 94 71
 Fax +32 9 353 94 95
 E-mail: joachim.audena...@pcsierteelt.be
 www.pcsierteelt.be
 __
        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] regression for poisson distributed data

2012-04-03 Thread Gabor Grothendieck
On Tue, Apr 3, 2012 at 11:03 AM, David Winsemius dwinsem...@comcast.net wrote:

 On Apr 3, 2012, at 9:58 AM, Joachim Audenaert wrote:

 Hello all,

 I would like to get parameter estimates for different models. For one of
 them I give the code in example. I am estimating the parameters (i,j and
 k) with the nls function, which sees the error distribution as normal,


 Doesn't it really just minimize the squared residuals from a nonlinear
 objective function? I didn't think there was any assumption that there was a
 particular error structure.


 I would however like to do the same as nls with the assumption that the
 errors are poisson distributed.


 Is there a way to do this with R? Are there packages designed for this? I
 tried with the gnm package, but don't understand how to transform my
 equation to a generalised equation. Is there an option for nls to choose
 family = poisson?


 dat - read.table(text=N0      FR
 10      2
 10      3
 snipped

 60      2
 60      2
 60      5
 60      4', header=TRUE)


 I was wondering if you could just use `glm` in the base/stats package:


 plot(jitter(FR)~jitter(N0), data=dat)
 ( reg=glm(FR ~ N0, data=dat, family=poisson) )
 lines(10:60, predict(reg, newdata=list(N0=10:60), type=response))

 # It gives you:

 FR ~ exp(alpha + beta*N0) + pois-error

 # The plot looks adequate. And I would say arguably better than the one I
 get with:

  x - nls(FR~(exp(i+j*N0)/(1+exp(i+j*N0)))*(k*N0/(k+N0)), data=dat,
           control =nls.control(maxiter=150, tol=0.01),
            start=list(i=1,j=.02,k=6))
 summary(x)
 hatx - predict(x)
 lines(spline(dat$N0,hatx), col=red)

 # I also saw Gabor Grothendieck's suggestion which I'm sure is more
 # mathematically sophisticated than my hacking, but when I plot its
 # results, it seems to fit even less well than your original nls function.


The normal will tend to overfit to the right side of the plot if the
data is truly poisson since it won't take into account that the
variance of those points is larger.  Note how the normal plot rises
toward the right more than the poisson plot suggesting the normal is
letting those points unduly influence the fit.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package seems to be present but library don't find it

2012-04-03 Thread Marc Girondot
Indeed I get this error message when I install the library using R CMD 
INSTALL but not within the GUI (in MacOsX). Good to know that R CMD 
INSTALL is more verbose and permits to track bug.

Thanks a lot.

It works fine now.

Marc

Le 03/04/12 16:03, Duncan Murdoch a écrit :
 On 03/04/2012 9:50 AM, Marc Girondot wrote:
 In case someone has the competence to check, the file is here:

 setwd(~)

 download.file(http://www.ese.u-psud.fr/epc/conservation/r-scripts/HelloWorld_1.0.tar.gz;,
  

 HelloWorld_1.0.tar.gz)

 install.packages(HelloWorld_1.0.tar.gz, repos = NULL)


 The problem is that you try to export the name HelloWorld in your 
 NAMESPACE file, but you don't have an object of that name.  You should 
 export showHello instead.

 Not sure why you didn't see the error message I got:

 $ R CMD INSTALL HelloWorld_1.0.tar.gz
 * installing to library 'F:/cygwin/home/murdoch/R/win-library/2.15'
 * installing *source* package 'HelloWorld' ...
 ** R
 ** preparing package for lazy loading
 ** help
 Warning: 
 C:/temp/RtmpimSWuh/R.INSTALLe801c296e9/HelloWorld/man/HelloWorld-packag
 e.Rd:32: All text must be in a section
 Warning: 
 C:/temp/RtmpimSWuh/R.INSTALLe801c296e9/HelloWorld/man/HelloWorld-packag
 e.Rd:33: All text must be in a section
 *** installing help indices
 ** building package indices
 ** testing if installed package can be loaded
 Error in namespaceExport(ns, exports) : undefined exports: HelloWorld
 Error: loading failed
 Execution halted
 ERROR: loading failed
 * removing 'F:/cygwin/home/murdoch/R/win-library/2.15/HelloWorld'

 Duncan Murdoch 


-- 
__
Marc Girondot, Pr

Laboratoire Ecologie, Systématique et Evolution
Equipe de Conservation des Populations et des Communautés
CNRS, AgroParisTech et Université Paris-Sud 11 , UMR 8079
Bâtiment 362
91405 Orsay Cedex, France

Tel:  33 1 (0)1.69.15.72.30   Fax: 33 1 (0)1.69.15.73.53
e-mail: marc.giron...@u-psud.fr
Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html
Skype: girondot


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] time-series features skewness kurtosis periodicity trend seasonality

2012-04-03 Thread John Kohr

Hello everyone,

I found the paper A scalable method for time-series clustering and there are 
proposed several measures to characterize time-series like trend, seasonality, 
periodicity, serial correlation, skewness, kurtosis, self-similarity etc.

They say they have implemented them on R, do you have any clue if there is a 
package calculating them? or any other packages that calculate some of them so 
that i can combine them?

Thanks,
John
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to map microarray probe to gene, homology

2012-04-03 Thread email mail
Hi:

I have clustered microarray gene expression data and trying to map between
microarray probe, gene, pathway, gene ontology, and homology for a set of
(affy) microarray probes. Is there any package in R which facilitates this?
I am looking at bioconductor, but till now could not find a solution. A
link to some worked example would be appreciated.

Thanks and regards.

John

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do you distinguish between characters on a pco plot?

2012-04-03 Thread leannehaggerty
I've read everything I can find, maybe I'm just missing something but I can't
find an example which colours the points on a pco plot differently based on
user specified parameters, I can colour every point differently or all the
same, or randomly with a number of colours. 
What I need to do is specify that points a, b and c, for example will be
blue, while points d, e, f, and g will be red and points h and i will be
green. Is it possible to colour with uneven distribution, I suppose is my
question? I can't find a solution in any manual.

--
View this message in context: 
http://r.789695.n4.nabble.com/How-do-you-distinguish-between-characters-on-a-pco-plot-tp4496217p4529425.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] identify time span in date vector

2012-04-03 Thread Fischer, Felix
Hello everyone,

i try to identify the first element of a date vector, for which the following 
condition holds: at least 3 more dates within the next 365 days, but at least 
one of these must be between 3-12 month later.

dates = as.Date(sort(rnorm(10,3000,100)), origin = 2000-1-1)

Has anyone an idea how to do this economically? I'll need to apply this to a 
large dataset with date vectors of various lengths and I can think only of 
quite difficult algorithms :(

Any ideas would be appreciated,
Felix


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filling small gaps of N/A

2012-04-03 Thread jeff6868
Michael,

First of all, thank you very much for your answer.
I've read your 2 answers, but I'm not really sure that they corresponds to
my problem of NAs.
I'll try to detail you a bit more.

This problem concerns the second part of my program. In the first part, I've
already created a timeseries object with the library (timeseries). I had to
delete first all the wrong values in my data and replace it with NAs. 
So my data contains already missing data (NAs), as I have cleaned it before.

The thing is that sometimes I have small gaps of missing data (only 2 or 3
following) like in example 1 below:

example 1:

09/01/2008 12:00  1.93   
09/01/2008 12:15  3.93   
09/01/2008 12:30   NASo here you have a small gap with only
2 NAs
09/01/2008 12:45   NA   
09/01/2008 13:00  4.93  
09/01/2008 13:15  5.93

But sometimes, always in the same file, I have big gaps, such as 10 or more
NAs following each other like in example 2 below:

example 2:

09/01/2008 16:152.93
09/01/2008 16:302.93
09/01/2008 16:45NA
09/01/2008 17:00NA
09/01/2008 17:15NA
09/01/2008 17:30NA
09/01/2008 17:45NA
09/01/2008 18:00NA  So here you have a big gap with 
more than 10
NAs following each other
09/01/2008 18:15NA
09/01/2008 18:30NA
09/01/2008 18:45NA
09/01/2008 19:00NA
09/01/2008 19:15NA
09/01/2008 19:30NA
09/01/2008 19:45NA
09/01/2008 20:00NA
09/01/2008 20:157.93
09/01/2008 20:307.93

So in the whole same file, I can have sometimes big gaps (2 or 3 NAs),
sometimes big or very big gaps (10 or 100 NAs following).

The aim of my problem is to apply the function: na.approx(x) of the library
(zoo) to fill NAs ONLY for small gaps.

If I just do: apply(na.approx(x)), it will fill all the NAs of my data (big
gaps + small gaps). It's exactly what I DON'T WANT.

My problem is to say to R:  you apply the function (na.approx) to fill NAs
ONLY if you see 4 NAs maximum following each other (small gaps) (like
example 1). If you see more than 4 NAs following each other (big gaps like
in example 2), you keep these NAs and you DON'T fill this big gap.

My question is: how can I say this to R? I don't know how to do it.
Hope I've been understandable this time ^^
Thanks a lot again for all your answers!



--
View this message in context: 
http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528907.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] object of type 'S4' is not subsettable

2012-04-03 Thread phillen
hey there!

The object 'cit' contains:

 cit

# 
# Johansen-Procedure Unit Root / Cointegration Test # 
# 

The value of the test statistic is: 5.3484 9.0681 10.6433
---
I want R to save the value 5.3484 in a new object. I am used to use the
command x=cit[a] where a stands for the place where R has saved the value.
However, the johansen procedure works with S4 objects with which subsetting
does not seem to work.

Any ideas how I could get R save a certain value of the output into a new
object?

Thousand thanks,
Philipp

--
View this message in context: 
http://r.789695.n4.nabble.com/object-of-type-S4-is-not-subsettable-tp4529063p4529063.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] conversion of Modified Julian Days to YMDHMS format

2012-04-03 Thread uday
Does anybody knows how to convert Modified Julian Days(MJD) to YMDHMS format? 
The sample data set which I have is as follows 
time - c(1832.415 ,1832.415 ,1832.415 ,1832.415 ,1832.415 ,1832.415
,1832.415, 1832.415, 1832.415 ,1832.415)
which is Days since 1.1.2000 (in UT time, starting at 00:00:00)

--
View this message in context: 
http://r.789695.n4.nabble.com/conversion-of-Modified-Julian-Days-to-YMDHMS-format-tp4528975p4528975.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Import from excel button in R-command

2012-04-03 Thread nimoke
Hello

I have been searching for almost 2 hours for a certain plug-in/package, so
im making this thread as i hope you can help me find it. 

I had my first lesson in Statistics in use today, and when we worked on
the school computers, we could do this to import data from excel:

Data  Import Data  Import from excel or something else

Now i downloaded it for my laptop as i want to work with it from home, but
there is no button to import from excel in my edition. Is there anyone who
knows how to get this feature, so i dont have to type in commands or change
my excel documents everytime?

Thanks in advance
/Nick

--
View this message in context: 
http://r.789695.n4.nabble.com/Import-from-excel-button-in-R-command-tp4529023p4529023.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] object of type 'S4' is not subsettable

2012-04-03 Thread phillen
Figured it out.

given that object 'cit' is of class S4, extracting information works as
follows:

1)finding out names of slots in object 'cit'

 slotNames(cit)
 [1] x Z0Z1ZKtype  model
 [7] ecdet lag   P seasondumvarcval 
[13] teststat  lambdaVorg  V W PI   
[19] DELTA GAMMA R0RKbpspec 
[25] call  test.name

2) extracting info from wanted slot
 cit@teststat
[1]  5.348440  9.068113 10.643293



--
View this message in context: 
http://r.789695.n4.nabble.com/object-of-type-S4-is-not-subsettable-tp4529063p4529432.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Create Model Object (setClass?setMethod?)

2012-04-03 Thread casperyc
Hi all,

I have a self written likelihood as a model and functions to optimize and
get fitted values, confidence intervals ect.

I wonder if there is a way to define a 'class', or a 'model' (or a certain
object)? so that I can use 'summary' to produce a summary like it does for a
lm object.

Also, it should be able to use 'predict' and 'plot' and other various
generic functions.

I am reading bits and pieces on the internet on 'setClass', 'setMethod'.  Am
I looking for the correct thing? Is there any up to date references that I
can get help? I need some examples to get started with.

Thanks!

Casper 

-
###
PhD candidate in Statistics
School of Mathematics, Statistics and Actuarial Science, University of Kent
###

--
View this message in context: 
http://r.789695.n4.nabble.com/Create-Model-Object-setClass-setMethod-tp4529473p4529473.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do you distinguish between characters on a pco plot?

2012-04-03 Thread Sarah Goslee
You can pass a vector of colors to the col= argument,
eg
col=c(1, 1, 2, 3, 2, 1)
to match your parameters. Or, in your case,
c(rep(blue, 3), rep(red, 4), rep(2, green))
- untested because you didn't provide an example.

You should read ?par and ?plot

Unless you're using lattice graphics, but you don't tell us.

Sarah

On Tue, Apr 3, 2012 at 12:39 PM, leannehaggerty
leannehagge...@gmail.com wrote:
 I've read everything I can find, maybe I'm just missing something but I can't
 find an example which colours the points on a pco plot differently based on
 user specified parameters, I can colour every point differently or all the
 same, or randomly with a number of colours.
 What I need to do is specify that points a, b and c, for example will be
 blue, while points d, e, f, and g will be red and points h and i will be
 green. Is it possible to colour with uneven distribution, I suppose is my
 question? I can't find a solution in any manual.

 --
-- 
Sarah Goslee
http://www.functionaldiversity.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] identify time span in date vector

2012-04-03 Thread David Winsemius


On Apr 3, 2012, at 9:35 AM, Fischer, Felix wrote:


Hello everyone,

i try to identify the first element of a date vector, for which the  
following condition holds: at least 3 more dates within the next 365  
days, but at least one of these must be between 3-12 month later.


dates = as.Date(sort(rnorm(10,3000,100)), origin = 2000-1-1)

Has anyone an idea how to do this economically? I'll need to apply  
this to a large dataset with date vectors of various lengths and I  
can think only of quite difficult algorithms :(




which( dates[4:(length(dates))] -dates[1:(length(dates)-3)] 365 
   dates[3:(length(dates)-1)] -dates[1:(length(dates)-3)]  90)

[1] 2 3


Any ideas would be appreciated,
Felix


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] time-series features skewness kurtosis periodicity trend seasonality

2012-04-03 Thread Thomas Adams
John,

If the paper is the one by Xiaozhe Wang, Kate A. Smith, Rob Hyndman, and
Damminda Alahakoon, the way I read it, it does not say they created an R
package (that may be on CRAN). They may have just used R in their analyses;
just a guess...

Tom

On Tue, Apr 3, 2012 at 12:34 PM, John Kohr illuminati...@hotmail.comwrote:


 Hello everyone,

 I found the paper A scalable method for time-series clustering and there
 are proposed several measures to characterize time-series like trend,
 seasonality, periodicity, serial correlation, skewness, kurtosis,
 self-similarity etc.

 They say they have implemented them on R, do you have any clue if there is
 a package calculating them? or any other packages that calculate some of
 them so that i can combine them?

 Thanks,
 John

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 

Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177

EMAIL:  thomas.ad...@noaa.gov
VOICE:  937-383-0528
FAX:937-383-0033

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] time-series features skewness kurtosis periodicity trend seasonality

2012-04-03 Thread David Winsemius


On Apr 3, 2012, at 12:34 PM, John Kohr wrote:



Hello everyone,




I found the paper A scalable method for time-series clustering and  
there are proposed several measures to characterize time-series like  
trend, seasonality, periodicity, serial correlation, skewness,  
kurtosis, self-similarity etc.


They say they have implemented them on R, do you have any clue if  
there is a package calculating them? or any other packages that  
calculate some of them so that i can combine them?


http://cran.r-project.org/web/views/TimeSeries.html

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filling small gaps of N/A

2012-04-03 Thread jim holtman
 x - read.table(text=09/01/2008 12:00  1.93
+ 09/01/2008 12:15  3.93
+ 09/01/2008 12:30   NA
+ 09/01/2008 12:45   NA
+ 09/01/2008 13:00  4.93
+ 09/01/2008 13:15  5.93
+ 09/01/2008 16:152.93
+ 09/01/2008 16:302.93
+ 09/01/2008 16:45NA
+ 09/01/2008 17:00NA
+ 09/01/2008 17:15NA
+ 09/01/2008 17:30NA
+ 09/01/2008 17:45NA
+ 09/01/2008 18:00NA
+ 09/01/2008 18:15NA
+ 09/01/2008 18:30NA
+ 09/01/2008 18:45NA
+ 09/01/2008 19:00NA
+ 09/01/2008 19:15NA
+ 09/01/2008 19:30NA
+ 09/01/2008 19:45NA
+ 09/01/2008 20:00NA
+ 09/01/2008 20:157.93
+ 09/01/2008 20:307.93, as.is = TRUE)

 # find the NA gaps and process differently
 gaps - rle(is.na(x$V3))
 offsets - c(1, cumsum(head(gaps$lengths, -1)) + 1)
 (shortgaps - which(gaps$values  (gaps$lengths = 4)))
[1] 2
 (longgaps - which(gaps$values  (gaps$lengths  4)))
[1] 4

 # now that you have the indices of where the short/long gaps are
 # you can process each individually; left as an exercise to the reader



On Tue, Apr 3, 2012 at 10:13 AM, jeff6868
geoffrey_kl...@etu.u-bourgogne.fr wrote:
 Michael,

 First of all, thank you very much for your answer.
 I've read your 2 answers, but I'm not really sure that they corresponds to
 my problem of NAs.
 I'll try to detail you a bit more.

 This problem concerns the second part of my program. In the first part, I've
 already created a timeseries object with the library (timeseries). I had to
 delete first all the wrong values in my data and replace it with NAs.
 So my data contains already missing data (NAs), as I have cleaned it before.

 The thing is that sometimes I have small gaps of missing data (only 2 or 3
 following) like in example 1 below:

 example 1:

 09/01/2008 12:00      1.93
 09/01/2008 12:15      3.93
 09/01/2008 12:30       NA            So here you have a small gap with only
 2 NAs
 09/01/2008 12:45       NA
 09/01/2008 13:00      4.93
 09/01/2008 13:15      5.93

 But sometimes, always in the same file, I have big gaps, such as 10 or more
 NAs following each other like in example 2 below:

 example 2:

 09/01/2008 16:15                2.93
 09/01/2008 16:30                2.93
 09/01/2008 16:45                NA
 09/01/2008 17:00                NA
 09/01/2008 17:15                NA
 09/01/2008 17:30                NA
 09/01/2008 17:45                NA
 09/01/2008 18:00                NA          So here you have a big gap with 
 more than 10
 NAs following each other
 09/01/2008 18:15                NA
 09/01/2008 18:30                NA
 09/01/2008 18:45                NA
 09/01/2008 19:00                NA
 09/01/2008 19:15                NA
 09/01/2008 19:30                NA
 09/01/2008 19:45                NA
 09/01/2008 20:00                NA
 09/01/2008 20:15                7.93
 09/01/2008 20:30                7.93

 So in the whole same file, I can have sometimes big gaps (2 or 3 NAs),
 sometimes big or very big gaps (10 or 100 NAs following).

 The aim of my problem is to apply the function: na.approx(x) of the library
 (zoo) to fill NAs ONLY for small gaps.

 If I just do: apply(na.approx(x)), it will fill all the NAs of my data (big
 gaps + small gaps). It's exactly what I DON'T WANT.

 My problem is to say to R:  you apply the function (na.approx) to fill NAs
 ONLY if you see 4 NAs maximum following each other (small gaps) (like
 example 1). If you see more than 4 NAs following each other (big gaps like
 in example 2), you keep these NAs and you DON'T fill this big gap.

 My question is: how can I say this to R? I don't know how to do it.
 Hope I've been understandable this time ^^
 Thanks a lot again for all your answers!



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528907.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filling small gaps of N/A

2012-04-03 Thread jim holtman
Forgot to mention that the offsets were into the 'gaps' (result of the
rle) and 'offsets' which is the index into the original data there the
gap starts.

 gaps
Run Length Encoding
  lengths: int [1:5] 2 2 4 14 2
  values : logi [1:5] FALSE TRUE FALSE TRUE FALSE
 offsets
[1]  1  3  5  9 23



On Tue, Apr 3, 2012 at 1:12 PM, jim holtman jholt...@gmail.com wrote:
 x - read.table(text=09/01/2008 12:00      1.93
 + 09/01/2008 12:15      3.93
 + 09/01/2008 12:30       NA
 + 09/01/2008 12:45       NA
 + 09/01/2008 13:00      4.93
 + 09/01/2008 13:15      5.93
 + 09/01/2008 16:15                2.93
 + 09/01/2008 16:30                2.93
 + 09/01/2008 16:45                NA
 + 09/01/2008 17:00                NA
 + 09/01/2008 17:15                NA
 + 09/01/2008 17:30                NA
 + 09/01/2008 17:45                NA
 + 09/01/2008 18:00                NA
 + 09/01/2008 18:15                NA
 + 09/01/2008 18:30                NA
 + 09/01/2008 18:45                NA
 + 09/01/2008 19:00                NA
 + 09/01/2008 19:15                NA
 + 09/01/2008 19:30                NA
 + 09/01/2008 19:45                NA
 + 09/01/2008 20:00                NA
 + 09/01/2008 20:15                7.93
 + 09/01/2008 20:30                7.93, as.is = TRUE)

 # find the NA gaps and process differently
 gaps - rle(is.na(x$V3))
 offsets - c(1, cumsum(head(gaps$lengths, -1)) + 1)
 (shortgaps - which(gaps$values  (gaps$lengths = 4)))
 [1] 2
 (longgaps - which(gaps$values  (gaps$lengths  4)))
 [1] 4

 # now that you have the indices of where the short/long gaps are
 # you can process each individually; left as an exercise to the reader



 On Tue, Apr 3, 2012 at 10:13 AM, jeff6868
 geoffrey_kl...@etu.u-bourgogne.fr wrote:
 Michael,

 First of all, thank you very much for your answer.
 I've read your 2 answers, but I'm not really sure that they corresponds to
 my problem of NAs.
 I'll try to detail you a bit more.

 This problem concerns the second part of my program. In the first part, I've
 already created a timeseries object with the library (timeseries). I had to
 delete first all the wrong values in my data and replace it with NAs.
 So my data contains already missing data (NAs), as I have cleaned it before.

 The thing is that sometimes I have small gaps of missing data (only 2 or 3
 following) like in example 1 below:

 example 1:

 09/01/2008 12:00      1.93
 09/01/2008 12:15      3.93
 09/01/2008 12:30       NA            So here you have a small gap with only
 2 NAs
 09/01/2008 12:45       NA
 09/01/2008 13:00      4.93
 09/01/2008 13:15      5.93

 But sometimes, always in the same file, I have big gaps, such as 10 or more
 NAs following each other like in example 2 below:

 example 2:

 09/01/2008 16:15                2.93
 09/01/2008 16:30                2.93
 09/01/2008 16:45                NA
 09/01/2008 17:00                NA
 09/01/2008 17:15                NA
 09/01/2008 17:30                NA
 09/01/2008 17:45                NA
 09/01/2008 18:00                NA          So here you have a big gap with 
 more than 10
 NAs following each other
 09/01/2008 18:15                NA
 09/01/2008 18:30                NA
 09/01/2008 18:45                NA
 09/01/2008 19:00                NA
 09/01/2008 19:15                NA
 09/01/2008 19:30                NA
 09/01/2008 19:45                NA
 09/01/2008 20:00                NA
 09/01/2008 20:15                7.93
 09/01/2008 20:30                7.93

 So in the whole same file, I can have sometimes big gaps (2 or 3 NAs),
 sometimes big or very big gaps (10 or 100 NAs following).

 The aim of my problem is to apply the function: na.approx(x) of the library
 (zoo) to fill NAs ONLY for small gaps.

 If I just do: apply(na.approx(x)), it will fill all the NAs of my data (big
 gaps + small gaps). It's exactly what I DON'T WANT.

 My problem is to say to R:  you apply the function (na.approx) to fill NAs
 ONLY if you see 4 NAs maximum following each other (small gaps) (like
 example 1). If you see more than 4 NAs following each other (big gaps like
 in example 2), you keep these NAs and you DON'T fill this big gap.

 My question is: how can I say this to R? I don't know how to do it.
 Hope I've been understandable this time ^^
 Thanks a lot again for all your answers!



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528907.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Jim Holtman
 Data Munger Guru

 What is the problem that you are trying to solve?
 Tell me what you want to do, not how you want to do it.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what 

Re: [R] Import from excel button in R-command

2012-04-03 Thread David Winsemius


On Apr 3, 2012, at 10:57 AM, nimoke wrote:


Hello

I have been searching for almost 2 hours for a certain plug-in/ 
package, so

im making this thread as i hope you can help me find it.

I had my first lesson in Statistics in use today, and when we  
worked on

the school computers, we could do this to import data from excel:

Data  Import Data  Import from excel or something else

Now i downloaded it for my laptop as i want to work with it from  
home, but
there is no button to import from excel in my edition. Is there  
anyone who
knows how to get this feature, so i dont have to type in commands or  
change

my excel documents everytime?


You were probably set up with one of the GUI's, perhaps the Rcmdr GUI

http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/

This would be a widely cited set of instructions about how to do it  
without dependance on a GUI:



http://rwiki.sciviews.org/doku.php?id=tips:data-io:ms_windows

--

David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] time-series features skewness kurtosis periodicity trend seasonality

2012-04-03 Thread John Kohr

Thx for the link, I have seen these packages before, but looking to them, i 
couldn't find any that computes these statistics/measures.

Any clue if any particular of them computes any of these characteristics?

One of the co-authors of the paper is Prof. Hyndman who has contributed in many 
of these packages.. that's why i was guessing there would be implementations 
for these measures.

best,
john

 CC: r-help@r-project.org
 From: dwinsem...@comcast.net
 To: illuminati...@hotmail.com
 Subject: Re: [R] time-series features skewness kurtosis periodicity trend 
 seasonality
 Date: Tue, 3 Apr 2012 13:00:42 -0400
 
 
 On Apr 3, 2012, at 12:34 PM, John Kohr wrote:
 
 
  Hello everyone,
 
 
  I found the paper A scalable method for time-series clustering and  
  there are proposed several measures to characterize time-series like  
  trend, seasonality, periodicity, serial correlation, skewness,  
  kurtosis, self-similarity etc.
 
  They say they have implemented them on R, do you have any clue if  
  there is a package calculating them? or any other packages that  
  calculate some of them so that i can combine them?
 
 http://cran.r-project.org/web/views/TimeSeries.html
 
 -- 
 
 David Winsemius, MD
 West Hartford, CT
 
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to map microarray probe to gene, homology

2012-04-03 Thread Peter Langfelder
On Tue, Apr 3, 2012 at 7:21 AM, email mail email8...@gmail.com wrote:
 Hi:

 I have clustered microarray gene expression data and trying to map between
 microarray probe, gene, pathway, gene ontology, and homology for a set of
 (affy) microarray probes. Is there any package in R which facilitates this?
 I am looking at bioconductor, but till now could not find a solution. A
 link to some worked example would be appreciated.

Yes, Bioconductor has annotation packages for microarray chips plus
other packages for connecting gene identifiers to (for example) gene
ontologies.

For example, the package hgu133plus2.db contains annotations for the
U133 plus 2 chip and you can use to map affy probe IDs to gene
symbols, IDs, GO terms etc. Install the package, load it, then type

ls(package:hgu133plus2.db)

and look at the help files for each topic for more details and examples.

The array annotations are chip-specific so you need to choose the
right package for your chip.

Peter

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filling small gaps of N/A

2012-04-03 Thread Gabor Grothendieck
On Tue, Apr 3, 2012 at 4:52 AM, jeff6868
geoffrey_kl...@etu.u-bourgogne.fr wrote:
 Hi everybody,

 I'm a new R french user. Sorry if my english is not perfect. Hope you'll
 understand my problem ;)

 I have to work on temperature data (35000 lines in one file) containing some
 missing data (N/A). Sometimes I have only 2 or 3 N/A following each other,
 but I have also sometimes 100 or 200 N/A following each other. Here's an
 example of my data, when I have only small gaps of missing data (2 or 3
 N/A):

 09/01/2008 12:00   2   1.93   2.93   4.56   5.43
 09/01/2008 12:15   2   *3.93*   3.25   4.93   5.56
 09/01/2008 12:30   2    NA   3.5   5.06   5.56
 09/01/2008 12:45   2    NA   3.68 5.25   5.68
 09/01/2008 13:00   2   *4.93 *  3.87   5.56   5.93
 09/01/2008 13:15   2   5.93   4.25   5.75   6.06
 09/01/2008 13:30   2   3.93   4.56   5.93   6.18

 My question is: how can I replace these small gaps of N/A by numeric values?
 I would like a fonction which only replace the small gaps (2 or 3 N/A) in my
 data, but not the big gaps (more than 5 N/A following each other).


Try na.locf, na.approx or na.spline in the zoo package noting the
maxgap= argument on each.

-- 
Statistics  Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ff problems

2012-04-03 Thread Bond, Stephen
this works:
bigcl - read.table(file=bigCL1.csv,sep=',',header=T , 
colClasses=c(rep(factor,3),numeric,NULL,integer,
numeric,integer,NULL,numeric,NULL),nrow=1000)

this doesn't:
bigcl - read.table.ffdf(file=bigCL1.csv,sep=',',header=T ,
  colClasses=c(TERMGROUP=factor,SN=factor,

INS=factor,INCENTIVE=numeric,SRC=character,FLAG=integer,

RELAGE=numeric,BURNSUM=integer,prod=character,BAL=numeric,origbal=NULL)
  )
Error in ff(initdata = initdata, length = length, levels = levels, ordered = 
ordered,  :
  vmode 'character' not implemented

neither this:
 bigcl - read.table.ffdf(file=bigCL1.csv,sep=',',header=T  ,
+  colClasses=c(TERMGROUP=factor,SN=factor,
+
INS=factor,INCENTIVE=numeric,SRC=NULL,FLAG=integer,
+
RELAGE=numeric,BURNSUM=integer,prod=NULL,BAL=numeric,origbal=NULL)
+  )
Error in repnam(colClasses, colnames(x), default = NA) :
  the following argument names do not match'SRC','prod','origbal'


if I load as is without classes then:
 bigcl - read.table.ffdf(file=bigCL1.csv,sep=',',header=T)
 bigcl$term - factor(bigcl$term)
Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

attempting to save it:
 ffsave(bigcl,file=fcl)
Error in system(cmd, input = filelist, intern = TRUE) : 'zip' not found

attempting to get rid of the unwanted columns:
 bigcl - bigcl[,c(-5,-9,-11)]
Error: cannot allocate vector of size 129.8 Mb
In addition: Warning messages:
1: In class(df) - data.frame :
  Reached total allocation of 1535Mb: see help(memory.size)
2: In class(df) - data.frame :
  Reached total allocation of 1535Mb: see help(memory.size)
3: In class(df) - data.frame :
  Reached total allocation of 1535Mb: see help(memory.size)
4: In class(df) - data.frame :
  Reached total allocation of 1535Mb: see help(memory.size)

 ff was supposed to be gentle on RAM usage??

I am attaching the first 1000 lines from the csv file.
thanks a lot.


Stephen

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread Val
Hi All,

On the same data  points
x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )

I want to have have the following output  as data frame

x   group   group mean
46   142.3
125 289.6
36   142.3
193 3235.25
209 3235.25
78   289.6
66   289.6
242 3235.25
297 3235.25
45   142.3

I tried the following code


dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1
gxc - with(dat, tapply(xc, group, mean))
dat$gxc - gxce[as.character(dat$group)]
txc=dat$gxc

it did not work for me.













On Tue, Apr 3, 2012 at 10:15 AM, David Winsemius dwinsem...@comcast.netwrote:


 On Apr 3, 2012, at 10:11 AM, Val wrote:

 David W and all,

 Thank you very much for your help.

 Here is the final output that I want in the form of data frame. The data
 frame should contain  x, group and group_ mean in the following way

 x   group   group mean
 46   142.3
 125 289.6
 36   142.3
 193 3235.25
 209 3235.25
 78   289.6
 66   289.6
 242 3235.25
 297 3235.25
 45   142.3


 I you want group means in a vector the same length as x then instead of
 using tapply as done in earlier solutions you should use `ave`.

 --
 DW



 Thanks a lot








 On Tue, Apr 3, 2012 at 9:51 AM, David Winsemius dwinsem...@comcast.netwrote:


 On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote:

  Use cut2 as I suggested and David demonstrated.


 Agree that Hmisc::cut2 is extremely handy and I also like that fact that
 the closed ends of intervals are on the left side (which is not the same
 behavior as cut()), which has the otehr effect of setting include.lowest =
 TRUE which is not the default for cut() either (to my continued amazement).

 But let me add the method I use when doing it by hand:

 cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)), include.lowest=TRUE)

 --
 David.




 Michael

 On Tue, Apr 3, 2012 at 9:31 AM, Val valkr...@gmail.com wrote:

 Thank you all (David, Michael, Giovanni)  for your prompt response.

 First there was a typo error for the group mean it was 89.6 not 87.

 For a small data set and few groupings I can use  prob=c(0, .333, .66
 ,1) to
 group in to three groups in this case. However,  if I want to extend the
 number of groupings say 10 or 15 then do I have to figure it out the
  split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))

 Is there a short cut for that?


 Thanks











 On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
 michael.weyla...@gmail.com wrote:


 Ignoring the fact your desired answers are wrong, I'd split the
 separating part and the group means parts into three steps:

 i) quantile() can help you get the split points,
 ii)  findInterval() can assign each y to a group
 iii) then ave() or tapply() will do group-wise means

 Something like:

 y - c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a c
 here.
 ave(y, findInterval(y, quantile(y, c(0.33, 0.66
 tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)

 You could also use cut2 from the Hmisc package to combine findInterval
 and quantile into a single step.

 Depending on your desired output.

 Hope that helps,
 Michael

 On Tue, Apr 3, 2012 at 8:47 AM, Val valkr...@gmail.com wrote:

 Hi all,

 Assume that I have the following 10 data points.
  x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)

 sort x  and get the following
  y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)

 I want to  group the sorted  data point (y)  into  equal number of
 observation per group. In this case there will be three groups.  The
 first
 two groups  will have three observation  and the third will have four
 observations

 group 1  = 34, 45, 46
 group 2  = 66, 78, 125
 group 3  = 193, 209, 242,297

 Finally I want to calculate the group mean

 group 1  =  42
 group 2  =  87
 group 3  =  234

 Can anyone help me out?

 In SAS I used to do it using proc rank.

 thanks in advance

 Val

   [[alternative HTML version deleted]]



 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/**posting-guide.htmlhttp://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 __**
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/**listinfo/r-helphttps://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/**
 posting-guide.html http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 David Winsemius, MD
 West Hartford, CT



 David Winsemius, MD
 West Hartford, CT




Re: [R] grouping

2012-04-03 Thread Petr Savicky
On Tue, Apr 03, 2012 at 02:21:36PM -0400, Val wrote:
 Hi All,
 
 On the same data  points
 x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )
 
 I want to have have the following output  as data frame
 
 x   group   group mean
 46   142.3
 125 289.6
 36   142.3
 193 3235.25
 209 3235.25
 78   289.6
 66   289.6
 242 3235.25
 297 3235.25
 45   142.3
 
 I tried the following code
 
 
 dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1
 gxc - with(dat, tapply(xc, group, mean))
 dat$gxc - gxce[as.character(dat$group)]
 txc=dat$gxc
 
 it did not work for me.

David Winsemius suggested to use ave(), when you asked this
question for the first time. Can you have look at it?

Petr Savicky.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] filling small gaps of N/A

2012-04-03 Thread R. Michael Weylandt
Like I said in my followup, please pass the maxgap argument: i.e.,
na.approx(x, maxgap = 4)

x - zoo(1:20, Sys.Date() + 1:20)

x[2:4] - NA # Short run of NA's
x[10:16] - NA # Long run of NA's

na.approx(x) # All filled in
na.approx(x, maxgap = 4) # Only the short one filled in

Michael

On Tue, Apr 3, 2012 at 10:13 AM, jeff6868
geoffrey_kl...@etu.u-bourgogne.fr wrote:
 Michael,

 First of all, thank you very much for your answer.
 I've read your 2 answers, but I'm not really sure that they corresponds to
 my problem of NAs.
 I'll try to detail you a bit more.

 This problem concerns the second part of my program. In the first part, I've
 already created a timeseries object with the library (timeseries). I had to
 delete first all the wrong values in my data and replace it with NAs.
 So my data contains already missing data (NAs), as I have cleaned it before.

 The thing is that sometimes I have small gaps of missing data (only 2 or 3
 following) like in example 1 below:

 example 1:

 09/01/2008 12:00      1.93
 09/01/2008 12:15      3.93
 09/01/2008 12:30       NA            So here you have a small gap with only
 2 NAs
 09/01/2008 12:45       NA
 09/01/2008 13:00      4.93
 09/01/2008 13:15      5.93

 But sometimes, always in the same file, I have big gaps, such as 10 or more
 NAs following each other like in example 2 below:

 example 2:

 09/01/2008 16:15                2.93
 09/01/2008 16:30                2.93
 09/01/2008 16:45                NA
 09/01/2008 17:00                NA
 09/01/2008 17:15                NA
 09/01/2008 17:30                NA
 09/01/2008 17:45                NA
 09/01/2008 18:00                NA          So here you have a big gap with 
 more than 10
 NAs following each other
 09/01/2008 18:15                NA
 09/01/2008 18:30                NA
 09/01/2008 18:45                NA
 09/01/2008 19:00                NA
 09/01/2008 19:15                NA
 09/01/2008 19:30                NA
 09/01/2008 19:45                NA
 09/01/2008 20:00                NA
 09/01/2008 20:15                7.93
 09/01/2008 20:30                7.93

 So in the whole same file, I can have sometimes big gaps (2 or 3 NAs),
 sometimes big or very big gaps (10 or 100 NAs following).

 The aim of my problem is to apply the function: na.approx(x) of the library
 (zoo) to fill NAs ONLY for small gaps.

 If I just do: apply(na.approx(x)), it will fill all the NAs of my data (big
 gaps + small gaps). It's exactly what I DON'T WANT.

 My problem is to say to R:  you apply the function (na.approx) to fill NAs
 ONLY if you see 4 NAs maximum following each other (small gaps) (like
 example 1). If you see more than 4 NAs following each other (big gaps like
 in example 2), you keep these NAs and you DON'T fill this big gap.

 My question is: how can I say this to R? I don't know how to do it.
 Hope I've been understandable this time ^^
 Thanks a lot again for all your answers!



 --
 View this message in context: 
 http://r.789695.n4.nabble.com/filling-small-gaps-of-N-A-tp4528184p4528907.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread Val
I did look at it the result  is below,

x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )

#lapply( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) ,
include.lowest=TRUE) ), mean)
  ave( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) ,
include.lowest=TRUE) ), mean)

 ave( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) ,
include.lowest=TRUE) ), mean)
$`[36,74]`
[1] NA

$`(74,197]`
[1] NA

$`(197,297]`
[1] NA

There were 11 warnings (use warnings() to see them)





On Tue, Apr 3, 2012 at 2:35 PM, Petr Savicky savi...@cs.cas.cz wrote:

 On Tue, Apr 03, 2012 at 02:21:36PM -0400, Val wrote:
  Hi All,
 
  On the same data  points
  x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )
 
  I want to have have the following output  as data frame
 
  x   group   group mean
  46   142.3
  125 289.6
  36   142.3
  193 3235.25
  209 3235.25
  78   289.6
  66   289.6
  242 3235.25
  297 3235.25
  45   142.3
 
  I tried the following code
 
 
  dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66
 ,1
  gxc - with(dat, tapply(xc, group, mean))
  dat$gxc - gxce[as.character(dat$group)]
  txc=dat$gxc
 
  it did not work for me.

 David Winsemius suggested to use ave(), when you asked this
 question for the first time. Can you have look at it?

 Petr Savicky.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Mixed italic and non-italic in text

2012-04-03 Thread lukevancleve
Hi,

I need to italicize the journal in a citation but have thus far failed.  How
can I make 'Journal of Something' below italic but leave the rest?

mtext( See Author1 and Author2 (2007) , \Title\,  Journal of Something ,
pp. 1-50., side = 3, outer = T,  line=-0.75, cex = 0.7, at= 0.04, adj = 0,
font = 1, family = Times)

--
View this message in context: 
http://r.789695.n4.nabble.com/Mixed-italic-and-non-italic-in-text-tp4529710p4529710.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R equivalent for SQL query

2012-04-03 Thread Steven Raemaekers
Hi,

I have a query which I would like to translate into R, but I do not know how to 
do it in an easy way.
Assume a data frame has columns A, B and C:

A   B   C
1   1   3
1   1   4
1   1   5
1   2   6
1   2   7
1   3   8

The query is as follows: 

select A, B, count(*)
from data.frame
group by A, B
order by count(*) desc

How do I translate this into R statements in such way that the result is a data 
frame structured as follows: 

A   B   count(*)
1   1   3
1   2   2
1   3   1

Thanks,

Steven
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unable to move temporary installation

2012-04-03 Thread Drew Tyre
A final followup. I have identified a rather extreme workaround. The
problem arises when the function utils:::unpackPkgZip uses file.rename(...)
to move the unzipped binary package from the temporary directory that it
was unpacked into into the proper directory in the library tree. If one does
debug(utils:::unpackPkgZip)
and then steps through the function line by line, it works.

Thank you.

On Mon, Apr 2, 2012 at 12:06 PM, Drew Tyre aty...@unl.edu wrote:

 OK - so I followed the following steps, which I think rule out those causes

 1) I uninstalled all remaining versions of R, and then deleted all the
 directories in c:\progra~1\R
 2) I restarted the computer
 3) I installed 2.14.2, and attempted to install the Rcmdr package. Same
 error message for both the cars package and the Rcmdr package.
 4) I then exited and confirmed that I have write permission to
 C:\progra~1\R\R-2.14.2\libraries both by looking at the permissions, and by
 creating a directory in there. I appear to have full control, and I could
 create the directory. note that R is able to create the temporary directory
 to install the package, but not the correct, final directory.
 5) I then uninstalled 2.14.2, and installed 2.15.0, hoping for a fix. No
 luck. Same error message.
 6) I then tried installing the packages to a different directory, one that
 I created, c:\test, using
 install.packages(Rcmdr,c:\\test)
 This time, the car package installed correctly, but Rcmdr still had the
 same warning message

 Warning: unable to move temporary installation
 ‘c:\test\file136c67c337b3\Rcmdr’ to ‘c:\test\Rcmdr’

 There is clearly something messed up on this computer, but I'm at a loss
 for how to get around it. Thanks for the suggestions, and I guess I have to
 work on a different computer!

 2012/3/31 Uwe Ligges lig...@statistik.tu-dortmund.de

  On 31.03.2012 16:15, Drew Tyre wrote:
 
  Hi all,
 
  I'm having a strange error that prevents me from installing new
 packages,
  or updating packages after reinstalling. The error message is
  Warning: unable to move temporary installation ‘C:\Program
  Files\R\R-2.14.2\library\**file15045004ac2\sandwich’ to ‘C:\Program
  Files\R\R-2.14.2\library\**sandwich’
  for one of the packages that is failing to install/update. This error
  started happening after I attempted installing lme4Eigen from the
 R-Forge
  repositories - that installation failed too.
 
  Any suggestions for fixes welcome. I don't want to upgrade to 2.15 just
  yet
  because I'm in the middle of a project (although if that's the solution
 I
  guess I'll have to do it).
 
 
  Probably the package is in use by another instance of R. Otherwise, check
  permissions.
 
  Best,
  Uwe Ligges
 
 
   R version 2.14.2 (2012-02-29)
  Platform: i386-pc-mingw32/i386 (32-bit)
 
  locale:
  [1] LC_COLLATE=English_United States.1252
  [2] LC_CTYPE=English_United States.1252
  [3] LC_MONETARY=English_United States.1252
  [4] LC_NUMERIC=C
  [5] LC_TIME=English_United States.1252
 
  attached base packages:
  [1] stats graphics  grDevices utils datasets  methods   base
 
  loaded via a namespace (and not attached):
  [1] tools_2.14.2
 
 
 
 
 
  __**
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/**listinfo/r-help
 https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/**
  posting-guide.html http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 
 


 --
 Drew Tyre

 School of Natural Resources
 University of Nebraska-Lincoln
 416 Hardin Hall, East Campus
 3310 Holdrege Street
 Lincoln, NE 68583-0974

 phone: +1 402 472 4054
 fax: +1 402 472 2946
 email: aty...@unl.edu
 http://snr.unl.edu/tyre
 http://aminpractice.blogspot.com
 http://www.flickr.com/photos/atiretoo

 [[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Drew Tyre

School of Natural Resources
University of Nebraska-Lincoln
416 Hardin Hall, East Campus
3310 Holdrege Street
Lincoln, NE 68583-0974

phone: +1 402 472 4054
fax: +1 402 472 2946
email: aty...@unl.edu
http://snr.unl.edu/tyre
http://aminpractice.blogspot.com
http://www.flickr.com/photos/atiretoo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Italicize journal article in mid text

2012-04-03 Thread lukevancleve
Hi, 

I need to italicize the journal in a citation but have thus far failed.  How
can I make 'Journal of Something' below italic but leave the rest plain? 

mtext( See Author1 and Author2 (2007) , \Title\,  Journal of Something ,
pp. 1-50., side = 3, outer = T,  line=-0.75, cex = 0.7, at= 0.04, adj = 0,
font = 1, family = Times)  

Thanks

--
View this message in context: 
http://r.789695.n4.nabble.com/Italicize-journal-article-in-mid-text-tp4529715p4529715.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread Val
On Tue, Apr 3, 2012 at 2:53 PM, Berend Hasselman b...@xs4all.nl wrote:


 On 03-04-2012, at 20:21, Val wrote:

  Hi All,
 
  On the same data  points
  x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )
 
  I want to have have the following output  as data frame
 
  x   group   group mean
  46   142.3
  125 289.6
  36   142.3
  193 3235.25
  209 3235.25
  78   289.6
  66   289.6
  242 3235.25
  297 3235.25
  45   142.3
 
  I tried the following code
 
 
  dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66
 ,1
  gxc - with(dat, tapply(xc, group, mean))
  dat$gxc - gxce[as.character(dat$group)]
  txc=dat$gxc
 
  it did not work for me.
 

 I'm not surprised.

 In the line dat - there are 5 opening parentheses and 4 closing )'s.
 In the line dat$gxc - you reference an object gxce. Where was it created?

 So I tried this

  dat - data.frame(x, group=findInterval(x, quantile(x, prob=c(0, .333,
 .66 ,1)), all.inside=TRUE))
  dat$gmean - ave(dat$x, as.factor(dat$group))
  dat
 x group gmean
 1   46 1  42.3
 2  125 2  89.7
 3   36 1  42.3
 4  193 3 235.25000
 5  209 3 235.25000
 6   78 2  89.7
 7   66 2  89.7
 8  242 3 235.25000
 9  297 3 235.25000
 10  45 1  42.3


Thank you very much. It is working now.  there  was a type error on
gxce. But in the  r-code it was correct,  gxc..




 Berend



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about randomForest

2012-04-03 Thread Saruman
I dont see how this answered the original question of the poster.

He was quite clear: the value of the predictions coming out of RF do not
match what comes out of the predict function using the same RF object and
the same data. Therefore, what is predict() doing that is different from RF?
Yes, RF is making its predictions using OOB, but nowhere does it say way
predict() is doing; indeed, it says if newdata is not given, then the
results are just the OOB predictions. But newdata=oldata, then
predict(newdata) != OOB predictions. So what is it then? 

Opens another issue, which is if newdata is close but not exactly oldata,
then you get overfitted results?

--
View this message in context: 
http://r.789695.n4.nabble.com/Question-about-randomForest-tp4111311p4529770.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] np package problem

2012-04-03 Thread dnewbold
Hi, Im trying to run a non parametric regression and I wish to use function
npreg(), I installed the np package, but I am being told that npreg doesnt
exist. Any advice on how I could fix this?

--
View this message in context: 
http://r.789695.n4.nabble.com/np-package-problem-tp4529813p4529813.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Mixed italic and non-italic in text

2012-04-03 Thread David Winsemius


On Apr 3, 2012, at 2:05 PM, lukevancleve wrote:

mtext( See Author1 and Author2 (2007) , \Title\,  Journal of  
Something ,
pp. 1-50., side = 3, outer = T,  line=-0.75, cex = 0.7, at= 0.04,  
adj = 0,

font = 1, family = Times)



?plotmath  # especially the italic function

mtext(expression(See Author1 and Author2 (2007)  
Title~italic(Journal of Something)~ ,
pp. 1-50.), side = 3, outer = T,  line=-0.75, cex = 0.7, at= 0.04,  
adj = 0,

font = 1, family = Times)

You didn't indicate your OS. The choice of screen or pdf fonts may be  
installation specific but this seemed to work as expected with output  
in a serif font and the output in the upper left corner on a Mac with  
the quartz() device.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread Berend Hasselman

On 03-04-2012, at 21:02, Val wrote:

 
 
 On Tue, Apr 3, 2012 at 2:53 PM, Berend Hasselman b...@xs4all.nl wrote:
 
 On 03-04-2012, at 20:21, Val wrote:
 
  Hi All,
 
  On the same data  points
  x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )
 
  I want to have have the following output  as data frame
 
  x   group   group mean
  46   142.3
  125 289.6
  36   142.3
  193 3235.25
  209 3235.25
  78   289.6
  66   289.6
  242 3235.25
  297 3235.25
  45   142.3
 
  I tried the following code
 
 
  dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1
  gxc - with(dat, tapply(xc, group, mean))
  dat$gxc - gxce[as.character(dat$group)]
  txc=dat$gxc
 
  it did not work for me.
 
 
 I'm not surprised.
 
 In the line dat - there are 5 opening parentheses and 4 closing )'s.
 In the line dat$gxc - you reference an object gxce. Where was it created?
 
 So I tried this
 
  dat - data.frame(x, group=findInterval(x, quantile(x, prob=c(0, .333, .66 
  ,1)), all.inside=TRUE))
  dat$gmean - ave(dat$x, as.factor(dat$group))

And the as.factor is not necessary. This will do

dat$gmean - ave(dat$x, dat$group)

Berend

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] np package problem

2012-04-03 Thread R. Michael Weylandt
You need to load it once per session with library(np)

Michael

On Tue, Apr 3, 2012 at 2:46 PM, dnewbold dwightnewbold...@hotmail.com wrote:
 Hi, Im trying to run a non parametric regression and I wish to use function
 npreg(), I installed the np package, but I am being told that npreg doesnt
 exist. Any advice on how I could fix this?

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/np-package-problem-tp4529813p4529813.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] object of type 'S4' is not subsettable

2012-04-03 Thread Duncan Murdoch

On 03/04/2012 12:42 PM, phillen wrote:

Figured it out.

given that object 'cit' is of class S4, extracting information works as
follows:

1)finding out names of slots in object 'cit'

  slotNames(cit)
  [1] x Z0Z1ZKtype  model
  [7] ecdet lag   P seasondumvarcval
[13] teststat  lambdaVorg  V W PI
[19] DELTA GAMMA R0RKbpspec
[25] call  test.name

2) extracting info from wanted slot
  cit@teststat
[1]  5.348440  9.068113 10.643293


That's one way.  There may be another:  if you print the class of cit using

class(cit)

there may be a help page, which you would find using

class?foo

assuming foo is the name of the class.  The author may have defined a 
method to get what you want, or may give other guidance.  For example, 
in the Matrix package:


 x - Hilbert(6)
 class(x)
[1] dpoMatrix
attr(,package)
[1] Matrix
 class?dpoMatrix

gives quite useful information about the dpoMatrix class.

Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread Berend Hasselman

On 03-04-2012, at 20:21, Val wrote:

 Hi All,
 
 On the same data  points
 x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )
 
 I want to have have the following output  as data frame
 
 x   group   group mean
 46   142.3
 125 289.6
 36   142.3
 193 3235.25
 209 3235.25
 78   289.6
 66   289.6
 242 3235.25
 297 3235.25
 45   142.3
 
 I tried the following code
 
 
 dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1
 gxc - with(dat, tapply(xc, group, mean))
 dat$gxc - gxce[as.character(dat$group)]
 txc=dat$gxc
 
 it did not work for me.
 

I'm not surprised.

In the line dat - there are 5 opening parentheses and 4 closing )'s.
In the line dat$gxc - you reference an object gxce. Where was it created?

So I tried this

 dat - data.frame(x, group=findInterval(x, quantile(x, prob=c(0, .333, .66 
 ,1)), all.inside=TRUE))
 dat$gmean - ave(dat$x, as.factor(dat$group))
 dat
 x group gmean
1   46 1  42.3
2  125 2  89.7
3   36 1  42.3
4  193 3 235.25000
5  209 3 235.25000
6   78 2  89.7
7   66 2  89.7
8  242 3 235.25000
9  297 3 235.25000
10  45 1  42.3

Berend

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] np package problem

2012-04-03 Thread David Winsemius


On Apr 3, 2012, at 2:46 PM, dnewbold wrote:

Hi, Im trying to run a non parametric regression and I wish to use  
function
npreg(), I installed the np package, but I am being told that npreg  
doesnt

exist. Any advice on how I could fix this?


The usual error following that set of actions is user failure to load  
the package.


?require
?library



--
View this message in context: 
http://r.789695.n4.nabble.com/np-package-problem-tp4529813p4529813.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


David Winsemius, MD
West Hartford, CT

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping

2012-04-03 Thread R. Michael Weylandt
Please take a look at my first reply to you:

ave(y, findInterval(y, quantile(y, c(0.33, 0.66

Then read ?ave for an explanation of the syntax. ave takes two
vectors, the first being the data to be averaged, the second being an
index to split by. You don't want to use split() here.

Michael

On Tue, Apr 3, 2012 at 2:50 PM, Val valkr...@gmail.com wrote:
 I did look at it the result  is below,

 x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )

 #lapply( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) ,
 include.lowest=TRUE) ), mean)
  ave( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) ,
 include.lowest=TRUE) ), mean)

 ave( split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1)) ,
 include.lowest=TRUE) ), mean)
 $`[36,74]`
 [1] NA

 $`(74,197]`
 [1] NA

 $`(197,297]`
 [1] NA

 There were 11 warnings (use warnings() to see them)





 On Tue, Apr 3, 2012 at 2:35 PM, Petr Savicky savi...@cs.cas.cz wrote:

 On Tue, Apr 03, 2012 at 02:21:36PM -0400, Val wrote:
  Hi All,
 
  On the same data  points
  x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )
 
  I want to have have the following output  as data frame
 
  x       group   group mean
  46       1        42.3
  125     2        89.6
  36       1        42.3
  193     3        235.25
  209     3        235.25
  78       2        89.6
  66       2        89.6
  242     3        235.25
  297     3        235.25
  45       1        42.3
 
  I tried the following code
 
 
  dat - data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66
 ,1
  gxc - with(dat, tapply(xc, group, mean))
  dat$gxc - gxce[as.character(dat$group)]
  txc=dat$gxc
 
  it did not work for me.

 David Winsemius suggested to use ave(), when you asked this
 question for the first time. Can you have look at it?

 Petr Savicky.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Create Model Object (setClass?setMethod?)

2012-04-03 Thread R. Michael Weylandt
Setting a class is quite easy if you are in the S3 (read: easier) system:

x - 1:15
class(x) - YourClass

summary.YourClass - function(x, ...) cat(The mean of your object is
, mean(x), \n)

summary(x)

Michael

On Tue, Apr 3, 2012 at 12:54 PM, casperyc caspe...@hotmail.co.uk wrote:
 Hi all,

 I have a self written likelihood as a model and functions to optimize and
 get fitted values, confidence intervals ect.

 I wonder if there is a way to define a 'class', or a 'model' (or a certain
 object)? so that I can use 'summary' to produce a summary like it does for a
 lm object.

 Also, it should be able to use 'predict' and 'plot' and other various
 generic functions.

 I am reading bits and pieces on the internet on 'setClass', 'setMethod'.  Am
 I looking for the correct thing? Is there any up to date references that I
 can get help? I need some examples to get started with.

 Thanks!

 Casper

 -
 ###
 PhD candidate in Statistics
 School of Mathematics, Statistics and Actuarial Science, University of Kent
 ###

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Create-Model-Object-setClass-setMethod-tp4529473p4529473.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] ff usage for glm

2012-04-03 Thread Thomas Lumley
On Tue, Apr 3, 2012 at 1:42 AM, Benilton Carvalho
beniltoncarva...@gmail.com wrote:
 Did you try the example described on the ff man page?

Also, the last error message you report happens when chunks of the
data set give design matrices that don't line up correctly.  You said
you added one new variable, but there are actually two new variables
in the formula you show, compared to the previous run you showed.

If you only have 32bit R then doing this from a data frame is not
going to be efficient, and you do want to use ff (or SQLite or
something).  You may also want to decrease the chunk size from the
default -- 5000 observations at a time might be too much.

Incidentally, putting ATTN: Thomas Lumley on a nabble post would be
counterproductive if I read nabble, but since I don't it's completely
pointless.

-thomas


 On Monday, April 2, 2012, Bond, Stephen wrote:

 Thomas,

 I tried biglm and it does not work see


 http://r.789695.n4.nabble.com/unable-to-get-bigglm-working-ATTN-Thomas-Lumley-td2276524.html#a2278381

 . There are other posts from people who cannot get biglm working and
 others who get strange results.
 Please, advise if you can help.
 I have row based native code which works, but it is inconvenient as it
 does not produce an R object, which can be fed to anova etc. offered it to
 the developer forum, but message is still waiting for mod approval.
 regards

 Stephen B

 -Original Message-
 From: Thomas Lumley [mailto:tlum...@uw.edu javascript:;]
 Sent: Friday, March 30, 2012 7:32 PM
 To: Bond, Stephen
 Cc: r-help@r-project.org javascript:;
 Subject: Re: [R] ff usage for glm

 On Sat, Mar 31, 2012 at 9:05 AM, Bond, Stephen 
 stephen.b...@cibc.comjavascript:;
 wrote:
  Greetings useRs,
 
  Can anyone provide an example how to use ff to feed a very large data
 frame to glm?
  The data.frame cannot be loaded in R using conventional read.csv as it
 is too big.
 
  glm(...,data=ff.file) ??
 

 I shouldn't think glm() will work on data that are too big to read into R.
  However, bigglm() from the biglm package should work.  You just need to
 write a function that supplies chunks of data from ff.file as requested
 (see the example on ?bigglm).  I haven't used ff, but it looks from the
 documentation as though chunk() will do all the difficult parts.

  -thomas

 --
 Thomas Lumley
 Professor of Biostatistics
 University of Auckland

 __
 R-help@r-project.org javascript:; mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


        [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Thomas Lumley
Professor of Biostatistics
University of Auckland

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R equivalent for SQL query

2012-04-03 Thread andrija djurovic
Hi,

here are some solutions:

DF - read.table(textConnection(
A   B   C
1   1   3
1   1   4
1   1   5
1   2   6
1   2   7
1   3   8 ), header=TRUE)

#using sqldf package
library(sqldf)
sqldf(select A, B, count(*)
  from DF
  group by A, B
  order by count(*) desc)

#using function table
as.data.frame(table(DF$A, DF$B))

As you can see, you can use sqldf package for performing sql queries
on R data frames.

Andrija



On Tue, Apr 3, 2012 at 8:26 PM, Steven Raemaekers s.raemaek...@sig.eu wrote:
 Hi,

 I have a query which I would like to translate into R, but I do not know how 
 to do it in an easy way.
 Assume a data frame has columns A, B and C:

 A       B       C
 1       1       3
 1       1       4
 1       1       5
 1       2       6
 1       2       7
 1       3       8

 The query is as follows:

 select A, B, count(*)
 from data.frame
 group by A, B
 order by count(*) desc

 How do I translate this into R statements in such way that the result is a 
 data frame structured as follows:

 A       B       count(*)
 1       1       3
 1       2       2
 1       3       1

 Thanks,

 Steven
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R equivalent for SQL query

2012-04-03 Thread jim holtman
Here is a solution using the data.table package:

 x - read.table(text = A   B   C
+ 1   1   3
+ 1   1   4
+ 1   1   5
+ 1   2   6
+ 1   2   7
+ 1   3   8, header = TRUE)
 require(data.table)
 x - data.table(x)  # convert to a data.table
 # query
 result - x[
+   , list(count = length(C))
+   , by = list(A,B)
+   ]
 # unsorted result
 result
 A B count
[1,] 1 1 3
[2,] 1 2 2
[3,] 1 3 1
 # sorted result
 result[order(result$count, decreasing = TRUE), ]
 A B count
[1,] 1 1 3
[2,] 1 2 2
[3,] 1 3 1




On Tue, Apr 3, 2012 at 3:50 PM, andrija djurovic djandr...@gmail.com wrote:
 Hi,

 here are some solutions:

 DF - read.table(textConnection(
 A       B       C
 1       1       3
 1       1       4
 1       1       5
 1       2       6
 1       2       7
 1       3       8 ), header=TRUE)

 #using sqldf package
 library(sqldf)
 sqldf(select A, B, count(*)
      from DF
      group by A, B
      order by count(*) desc)

 #using function table
 as.data.frame(table(DF$A, DF$B))

 As you can see, you can use sqldf package for performing sql queries
 on R data frames.

 Andrija



 On Tue, Apr 3, 2012 at 8:26 PM, Steven Raemaekers s.raemaek...@sig.eu wrote:
 Hi,

 I have a query which I would like to translate into R, but I do not know how 
 to do it in an easy way.
 Assume a data frame has columns A, B and C:

 A       B       C
 1       1       3
 1       1       4
 1       1       5
 1       2       6
 1       2       7
 1       3       8

 The query is as follows:

 select A, B, count(*)
 from data.frame
 group by A, B
 order by count(*) desc

 How do I translate this into R statements in such way that the result is a 
 data frame structured as follows:

 A       B       count(*)
 1       1       3
 1       2       2
 1       3       1

 Thanks,

 Steven
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] object of type 'S4' is not subsettable

2012-04-03 Thread phillen
Thanks for that hint.

Philipp Lentner

--
View this message in context: 
http://r.789695.n4.nabble.com/object-of-type-S4-is-not-subsettable-tp4529063p4529958.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >