Re: [R] Splitting Data Frame into Two Based on Source Array

2008-09-09 Thread Adam D. I. Kramer

data_main[ match(src,data_main$V1), ]

and the complement of src (call it srcc)

data_main[ match(srcc,data_main$V1), ]

...this only works so long as there is only one occurrence of each item of
src in V1.
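
For reference, a sketch that avoids building the complement srcc by hand
(assuming data_main and src as in the question); %in% also copes with
duplicated values in V1, unlike match():

in_src <- data_main$V1 %in% src
data_child1 <- data_main[in_src, ]              # rows whose V1 is in src
data_child2_complement <- data_main[!in_src, ]  # all remaining rows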

--Adam

On Tue, 9 Sep 2008, Gundala Viswanath wrote:


Dear all,

Suppose I have this data frame:



data_main

   V1     V2
  foo   13.1
  bar   12.0
  qux   10.4
  cho  20.33
  pox   8.21

And I want to split the data into two parts:
the first part contains the rows whose V1 values appear in the source array:


src

[1] bar pox

and the other one the complement.

In the end we hope to get these two data frames:


data_child1

  V1     V2
 bar   12.0
 pox   8.21

and


data_child2_complement

foo 13.1
qux 10.4
cho 20.33

Is there a compact way to do it in R?




- Gundala Viswanath
Jakarta - Indonesia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cluster/snow question

2008-09-09 Thread Markus Schmidberger

Hi Tolga,

in SNOW you have to start a cluster with the command

 library(snow)
 cluster <- makeCluster(#nodes)

The object cluster is a list with an object for each node, and each 
object again is a list with all the information (rank, comm, tags).

The size of the cluster is the length of the list.

 #nodes == length(cluster)

E.g., you can get the rank for node one with
 cluster[[1]]$rank
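
A small self-contained sketch of the above (the node count 4 and the type
"SOCK" are just example settings):

library(snow)
cl <- makeCluster(4, type = "SOCK")   # start 4 worker nodes
length(cl)                            # size of the cluster, here 4
cl[[1]]$rank                          # per-node info as described above (MPI clusters)
# one way to give every worker its own rank in its environment:
clusterApply(cl, seq_along(cl), function(i) assign("my.rank", i, envir = .GlobalEnv))
stopCluster(cl)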

Best
Markus

[EMAIL PROTECTED] schrieb:

Dear R Users,

I am attempting to use the snow package for clustering. Is there a way to 
identify, in the environment of each node, a rank for that node and also 
the total size of the cluster? 

By way of analogy, I am looking for the functions in snow equivalent to 
mpi.comm.rank() and mpi.comm.size() from RMPI, in case that makes things 
clearer.


Thanks in advance,
Tolga

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
  



--
Dipl.-Tech. Math. Markus Schmidberger

Ludwig-Maximilians-Universität München
IBE - Institut für medizinische Informationsverarbeitung,
Biometrie und Epidemiologie
Marchioninistr. 15, D-81377 Muenchen
URL: http://ibe.web.med.uni-muenchen.de 
Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de

Tel: +49 (089) 7095 - 4599

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] yahoo finance into R

2008-09-09 Thread Peter Dalgaard

thomastos wrote:
Hi R, 


I am familiar with the basics of R.
To learn more, I would like to know how to get data from Yahoo! Finance directly into
R. Basically I want a data frame or matrix to do some data analysis on.
How do I do this?
  

RSiteSearch("yahoo")

get.hist.quote() from tseries
yahooSeries() from fImport (untried)
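
For example, a minimal sketch with get.hist.quote() (the ticker and date
range are arbitrary placeholders):

library(tseries)
x <- get.hist.quote(instrument = "IBM", start = "2008-01-01",
                    end = "2008-09-01", quote = c("Open", "Close"))
x.df <- as.data.frame(x)   # plain data frame for further analysis
head(x.df)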

--
  O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
 c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
(*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] match problem by rownames

2008-09-09 Thread Xianming Wei
Hi all,

While dat['a1',] and dat['a10',] produce the same results in the
following example, I'd like dat['a1',] to return NAs.

dat <- data.frame(x1 = paste(letters[1:5],10, sep=''), x2=rnorm(5))
rownames(dat) <- dat$x1
dat['a1',]
dat['a10',]

 sessionInfo()
R version 2.7.2 (2008-08-25)
i386-pc-mingw32

locale:
LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MON
ETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


other attached packages:
[1] lattice_0.17-13

loaded via a namespace (and not attached):
[1] grid_2.7.2


Regards,
Xianming



DISCLAIMER:\ For details of our e-mail disclaimer, pleas...{{dropped:15}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Compiling date

2008-09-09 Thread Megh Dal
Hi,

I have following kind of dataset (all are dates) in my Excel sheet.

09/08/08
09/05/08
09/04/08
09/02/08
09/01/08
29/08/2008
28/08/2008
27/08/2008
26/08/2008
25/08/2008
22/08/2008
21/08/2008
20/08/2008
18/08/2008
14/08/2008
13/08/2008
08/12/08
08/11/08
08/08/08
08/07/08

However, I want to use R to convert those data so that all dates are in the same format.
Can anyone please tell me an automated way of doing that?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Memory allocation problem (during kmeans)

2008-09-09 Thread rami batal
Dear all,

I am trying to apply kmeans clustering on a data file (about 300 Mb in size).

I read this file using

x = read.table('file path', sep = " ")

then I do kmeans(x, 25)

but the process stops after two minutes with an error :

Error: cannot allocate vector of size 907.3 Mb

when i read the archive i notice that the best solution is to use a 64bit
OS.

"Error messages beginning 'cannot allocate vector of size' indicate a failure
to obtain memory, either because the size exceeded the address-space limit
for a process or, more likely, because the system was unable to provide the
memory. Note that on a 32-bit OS there may well be enough free memory
available, but not a large enough contiguous block of address space into
which to map it."

The problem is that I have two machines with two OSes (32-bit and 64-bit), and when
I used the 64-bit OS the same error remains.

Thank you if you have any suggestions for me, and excuse me because I am a
newbie.

Here the default information for the 64bit os:

 sessionInfo()
R version 2.7.1 (2008-06-23)
x86_64-redhat-linux-gnu

 gc()
 used (Mb) gc trigger (Mb) max used (Mb)
Ncells 137955  7.4 35 18.7   35 18.7
Vcells 141455  1.1 786432  6.0   601347  4.6

I also tried to start R using the options that control the available memory,
but the result is still the same; or maybe I don't assign the correct values.


Thank you in advance.

-- 
Rami BATAL

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to preserve date format while aggregating

2008-09-09 Thread Prof Brian Ripley

This is completely wrong: min _is_ defined for date-times:


min(.leap.seconds)

[1] 1972-07-01 01:00:00 BST

Please do study the posting guide and do your homework before posting: you 
seem unaware of what the POSIXct class is, so ?DateTimeClasses is one 
place you need to start.  And



methods(Summary)

[1] Summary.data.frame      Summary.Date            Summary.difftime
[4] Summary.factor          Summary.numeric_version Summary.POSIXct
[7] Summary.POSIXlt

so ?Summary is another.
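
For the original aggregate() question, one hedged workaround (a sketch,
assuming the attached dataframe.xy and that Date is of class Date; for
POSIXct use as.POSIXct with the same origin) is to convert the numeric
result back afterwards:

agg <- aggregate(Date, list(SubjectID = SubjectID), min)
agg$x <- as.Date(agg$x, origin = "1970-01-01")  # restore the date format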

On Mon, 8 Sep 2008, Adam D. I. Kramer wrote:


Hi Erich,

Since min() is defined for numbers and not dates, the problem is in the
min() function. min() is converting from date format to number format.

Your best bet is to make this conversion explicit...such that it is
reversible. So, convert the date into UTC, then UTC to seconds since the epoch,
then take the minimum, then convert back to UTC time.

This sounds like a pain...but that's basically what a version of min()
designed to work with dates would do. The reason this is a pain is basically
due to timezones:

Consider a comparison between x = 3:54 PM September 8 in California (right
now where I am) and y = 12:54 AM September 9 in Zurich (right now where you
are). Is it earlier here than there? Yes, because it's Sept 8 to your Sept
9. Is it earlier there than here? Yes, because your day started 56 minutes
ago, mine over 15 hours ago. Is it the same time here than there? Yes,
because our UTC times are equal.

So it's not clear what min should return, so min is not defined for dates.
However, min is defined for numbers, and dates can be converted to
numbers...but what those numbers actually mean is not necessarily clear.

--Adam

On Mon, 8 Sep 2008, Erich Studerus wrote:


Hi

I have a dataframe in which some subjects appear in more than one row. I
want to extract the subject-rows which have the minimum date per subject. I
tried the following aggregate function.

attach(dataframe.xy)

aggregate(Date,list(SubjectID),min)

Unfortunately, the format of the Date-column changes to numeric, when I'm
applying this function. How can I preserve the date format?

Thanks

Erich

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] isolate elements in vector that match one of many possible values

2008-09-09 Thread Adam D. I. Kramer

Check out ?match, ?%in%


x <- c(1,2,3,4)
y <- c(1,2,4)
match(y,x)

[1] 1 2 4
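
For the original question (which elements of x equal any of the values of
interest), the %in% form is (a sketch):

x <- c(1, 2, 3, 4)
vals <- c(1, 2, 4)     # the values of interest, like the MySQL in() list
which(x %in% vals)
[1] 1 2 4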




--Adam

On Mon, 8 Sep 2008, Andrew Barr wrote:


Hi all,

I want to get the index numbers of all elements of a vector which match any
of a long series of possible values.  Say x <- c(1,2,3,4) and I want to know
which values are equal to 1, 2 or 4.  I could do

which(x == 1 | x==2 | x==4)
[1] 1 2 4

This gets really ugly though, when the list of values of interest is really
long.  Is there a nicer way to do this?  Something akin to the MySQL
construction in(), as in

#MySQL script example
Select * from table where parameter in(x,y,z);

Thanks!

--
W. Andrew Barr
Biological Anthropology
University of Texas at Austin

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] make methods work in lapply - remove lapply's environment

2008-09-09 Thread Prof Brian Ripley
This is a side-effect of lapply being in the base namespace and not 
evaluating its arguments, as explained on its help page which also points 
out that using a wrapper is sometimes needed.  It also points out that 
code has been written that relies on the current behaviour.


On Mon, 8 Sep 2008, Tim Hesterberg wrote:


I've defined my own version of summary.default,
that gives a better summary for highly skewed vectors.

If I call
 summary(x)
the method is used.

If I call
 summary(data.frame(x))
the method is not used.

I've traced this to lapply; this uses the new method:
 lapply(list(x), function(x) summary(x))
and this does not:
 lapply(list(x), summary)

If I make a copy of lapply, WITHOUT the environment,
then the method is used.

lapply <- function (X, FUN, ...) {
   FUN <- match.fun(FUN)
   if (!is.vector(X) || is.object(X))
   X <- as.list(X)
   .Internal(lapply(X, FUN))
}

I'm curious to hear reactions to this.
There is a March 2006 thread
   object size vs. file size
in which Duncan Murdoch wrote:

Functions in R consist of 3 parts: the formals, the body, and the
environment. You can't remove any part, but you can change it.

That is exactly what I want to do, remove the environment, so that
when I define a better version of some function that the better
version is used.

Here's a function to automate the process:
copyFunction <- function(Name){
  # Copy a function, without its environment.
  # Name should be quoted
  # Return the copy
  file <- tempfile()
  on.exit(unlink(file))
  dput(get(Name), file = file)
  f <- source(file)$value
  f
}
lapply <- copyFunction("lapply")

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] naive variance in GEE

2008-09-09 Thread Prof Brian Ripley

On Mon, 8 Sep 2008, Qiong Yang wrote:

The standard error from logistic regression is slightly different from the 
naive SE from GEE under independence working correlation structure.


Shouldn't they be identical? Anyone has insight about this?


They are computed quantities from iterations with different stopping 
criteria.  The coefficients are not 'identical' either.


Your example is incorrect (the first line) and not reproducible (no seed 
is set, no library(gee)), so we don't know what you saw.  But with


set.seed(1)
a <- rbinom(1000, 1, 0.2)
b <- rbinom(1000, 2, 0.1)
c <- rbinom(1000, 10, 0.5)
library(gee)
summary(gee(a ~ b, id=c, family=binomial, corstr="independence"))$coef
summary(glm(a ~ b, family=binomial))$coef

the differences I see are negligible.  I suggest you talk to your 
supervisor about some courses on numerical methods.




Thanks,
Qiong

a <- rbinom(1000,1)
b <- rbinom(1000,2,0.1)
c <- rbinom(1000,10,0.5)
summary(gee(a~b, id=c, family=binomial, corstr="independence"))$coef
summary(glm(a~b, family=binomial))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] S.O.S try does not work in boot?

2008-09-09 Thread ctu

First, thanks for Jinsong's suggestions.
I would like to do a bootstrap in a nonlinear model, but it fails to  
converge most of the time (it did converge if I just use nls without  
boot). Thus, I use the try function to resolve my problem. The  
following code is from Jinsong's suggestion.


h1a.nls <- nls(density~nmf(time, alpha, delta, psi, tau, gamma), data=h1a,
start=c(alpha=0.3, delta=0.08869, psi=1.26523, tau=3.93919,
   gamma=-1.41927))


h1a.data <- data.frame(h1a, res=resid(h1a.nls), fitted=fitted(h1a.nls))
h1a.fun <- function(data,i){
 d <- data
 d$density <- d$fitted + d$res[i]
 try(update(h1a.nls, data=d), silent=T)
 if(!inherits(h1a.nls, "try-error")) h1a.coef <- coef(h1a.nls)
 else h1a.coef <- NA
 h1a.coef
 }
h1a.boot <- boot(h1a.data, statistic = h1a.fun, R=1000)

h1a.boot


ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = h1a.data, statistic = h1a.fun, R = 1000)
Bootstrap Statistics :
   original  biasstd. error
t1*  0.27892590   0   0
t2*  0.08869433   0   0
t3*  1.26523275   0   0
t4*  3.93919567   0   0
t5* -1.41926966   0   0
All of the values in each column of h1a.boot$t are the same.
Does anyone know how I can solve this problem?
Thanks in advance.

Chunhao

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] correct lme syntax for this problem?

2008-09-09 Thread ONKELINX, Thierry
Dear Matthew,

First of all I'm forwarding this to R-SIG-Mixed, which is a more
appropriate list for your question.
Using a mixed effect with only 5 levels is a borderline situation.
Douglas Bates recommends at least 6 levels in order to get a more or
less reliable estimate. So I would consider the populations as fixed
effects. Do you have repeated measurements of individuals within your
populations? If you do you could use those as random effects.

Your anova tests whether the variance of the random slope on SPI is
zero. I think you might want this:

mod1 <- lm(height ~ SPI * population + covariate1 + covariate2)
mod2 <- lm(height ~ SPI + population + covariate1 + covariate2)
anova(mod1, mod2)

HTH,

Thierry



ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
[EMAIL PROTECTED] 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On behalf of Matthew Keller
Sent: Tuesday, 9 September 2008 1:10
To: R Help
Subject: [R] correct lme syntax for this problem?

Hello all,

I am about to send off a manuscript and, although I am fairly
confident I have used the lme function correctly, I want to be 100%
sure. Could some kind soul out there put my mind at ease?

I am simply interested in whether a predictor (SPI) is related to
height. However, there are five different populations, and each may
differ in mean level of height as well as the relationship between SPI
and height. Thus, I also want to a) account for mean level differences
in height and b) check whether the relationship between height and SPI
is different between the groups. I hope this is sufficient
information.

height, SPI, covariate1, and covariate2 are numeric. population is a
factor with 5 levels. Here are the steps I took:

summary(mod1 <- lme(height ~ SPI + covariate1 + covariate2, random = ~
SPI | population))

summary(mod2 <- lme(height ~ SPI + covariate1 + covariate2, random = ~
1 | population))

anova(mod1,mod2) # this checks whether there is evidence for IQ & SPI
being related differently between the 5 populations.

Is this correct? THANKS!

Matt


-- 
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

The views expressed in this message and any annex are purely those of the writer 
and may not be regarded as stating an official position of INBO, as long as the 
message is not confirmed by a duly signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] S.O.S try does not work in boot?

2008-09-09 Thread Prof Brian Ripley
Returning NA (of the correct length, not length 1) will not help you, as 
all the derived statistics from the bootstrap runs will be NA.


But here you never looked at the result of try.
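
A sketch of the statistic function with the result of try() actually
captured and tested (names as in the original post; the 5 matches the number
of coefficients):

h1a.fun <- function(data, i) {
  d <- data
  d$density <- d$fitted + d$res[i]
  fit <- try(update(h1a.nls, data = d), silent = TRUE)  # keep the try() result
  if (!inherits(fit, "try-error")) coef(fit) else rep(NA, 5)
}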

On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:


First thanks for Jinsong's suggestions
I would like to do a bootstrap in a nonlinear model. But it fails to converge 
in most of time. (it did converge if I just use nls without boot). Thus, I 
use try function to resolve my problem. This following code is from 
Jinsong's suggestion.


h1a.nls <- nls(density~nmf(time, alpha, delta, psi, tau, gamma), data=h1a,
   start=c(alpha=0.3, delta=0.08869, psi=1.26523, tau=3.93919, 
gamma=-1.41927))


h1a.data <- data.frame(h1a, res=resid(h1a.nls), fitted=fitted(h1a.nls))
h1a.fun <- function(data,i){
d <- data
d$density <- d$fitted + d$res[i]
try(update(h1a.nls, data=d), silent=T)
if(!inherits(h1a.nls, "try-error")) h1a.coef <- coef(h1a.nls)


h1a.nls is the original fit, not the result of try().


else h1a.coef <- NA
h1a.coef
}
h1a.boot <- boot(h1a.data, statistic = h1a.fun, R=1000)

h1a.boot


ORDINARY NONPARAMETRIC BOOTSTRAP
Call:
boot(data = h1a.data, statistic = h1a.fun, R = 1000)
Bootstrap Statistics :
 original  biasstd. error
t1*  0.27892590   0   0
t2*  0.08869433   0   0
t3*  1.26523275   0   0
t4*  3.93919567   0   0
t5* -1.41926966   0   0
all of the values of each column in h1a.boot$t are the same.
Is anyone know to how I can solve this problem?
Appreciate in advance

Chunhao

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How do I compute interactions with anova.mlm ?

2008-09-09 Thread Schadwinkel, Stefan
Hi,

I wish to compute multivariate test statistics for a within-subjects repeated 
measures design with anova.mlm. 

This works great if I only have two factors, but I don't know how to compute 
interactions with more than two factors. 
I suspect, I have to create a new grouping factor and then test with this 
factor to get these interactions (as it is hinted in R News 2007/2), 
but I don't really know how to use this approach. 

Here is my current code:

Two Factors: fac1, fac2

mlmfit <- lm(mydata~1)
mlmfit0 <- update(mlmfit, ~0)

% test fac1, works, produces same output as SAS
anova(mlmfit, mlmfit0, M = ~ fac1 + fac2, X = ~ fac2, idata = idata, test = 
"Wilks")

% test fac1*fac2 interaction, also works, also the same output as SAS
anova(mlmfit, mlmfit0, X = ~ fac1 + fac2, idata = idata, test = "Wilks")



Three Factors: fac1, fac2, fac3 

mlmfit <- lm(mydata~1)
mlmfit0 <- update(mlmfit, ~0)

% test fac1, works, same as SAS
anova(mlmfit, mlmfit0, M = ~ fac1 + fac2 + fac3, X = ~ fac2 + fac3, idata = 
idata, test = "Wilks")



Now, I try to compute the interactions the same way, but this doesn't work:

% fac1*fac2
anova(mlmfit, mlmfit0, M = ~ fac1 + fac2 + fac3, X = ~ fac3, idata = idata, 
test = "Wilks")

% fac1*fac2*fac3
anova(mlmfit, mlmfit0, X = ~ fac1 + fac2 + fac3, idata = idata, test = "Wilks")


Both of the above differ quite a lot from the SAS output, and I suspect my 
understanding of X and M is somewhat flawed. 

I would be very happy, if someone could tell me how to compute the two 
interactions above and an interaction of N factors in general.

I would also be interested in computing linear contrasts using the T matrix and 
anova.mlm.

Thank you very much,

Stefan 

 

--
Stefan Schadwinkel, Dipl.-Inf.
Neurologische Klinik
Sektion Biomagnetismus
Universität Heidelberg
Im Neuenheimer Feld 400
69120 Heidelberg

Telefon:  06221 - 56 5196
Email:[EMAIL PROTECTED] 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] exporting tapply objects to csv-files

2008-09-09 Thread Kunzler, Andreas
Dear Everyone,

I am trying to create a csv file with different results from the table function.

Imagine a data-frame with two vectors a and b where b is of the class factor. 

I use the tapply function to count a for the different values of b.

tapply(a,b,table)

and I use the table function to look at the frequencies as a total

table(a)

I would like to put both results together in one txt or csv file that I can 
import to e.g. Excel.

The export file should have a layout like

1,2,3,4,5,6,7 (possible values of a)
3,6,7,8,8,8,1 (Counts of a total)
1,2,3,4,5,3,0 (Counts of a where b==A)
2,4,4,4,3,5,1 (Counts of a where b==B)

I tried to change the class of the table result to a matrix but I could not 
find a way to use the results of tapply. I use tapply because b has 15 
different values.

Thanx

Andreas Kunzler

Bundeszahnärztekammer (BZÄK)
Chausseestraße 13
10115 Berlin

Tel.: 030 40005-113
Fax:  030 40005-119

E-Mail: [EMAIL PROTECTED] 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Read from url requiring authentication?

2008-09-09 Thread Damien
René Sachse wrote:

 Damien wrote:

  I'm looking into opening an url on a server which requires
 authentication.

 Under a Windows Operating System you could try to start R with the
 --internet2 option. This worked in my case.

Thanks René, it did the trick for me too!

Best Regards,
Damien

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Read from url requiring authentication?

2008-09-09 Thread Damien
On 8 Sep, 20:15, Prof Brian Ripley [EMAIL PROTECTED] wrote:
 On Mon, 8 Sep 2008, Damien wrote:
  Hi all,

  I'm looking into opening an url on a server which requires
 authentication.

  After failing to find some kind of connection structure to fill in I
  turned to explicitly stating the credentials in the url itself (e.g.
  http://username:[EMAIL PROTECTED]).
  Sadly this didn't do the trick either and both source() and url()
  failed trying to resolve the username ()

  Is there anything I missed in the documentation/internet/groups?
  If not could I maybe add to the existing R functions as it doesn't
  seem too far of a stretch to allow the username and password in the
  url string fed to the web server?

 Look at the RCurl package: it is more like download.file than url, though,
 and you could perhaps use the wget method of download.file.

Thank you for the quick reply,

it seems that the argument --internet2 did solve my immediate
problem
but I'll have a look at RCurl too.
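
For reference, a hedged sketch of the RCurl route (URL and credentials are
placeholders):

library(RCurl)
txt <- getURL("http://example.com/data.csv", userpwd = "user:password")
dat <- read.csv(textConnection(txt))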

Best Regards,
Damien

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compiling date

2008-09-09 Thread David Scott

On Mon, 8 Sep 2008, Megh Dal wrote:


Hi,

I have following kind of dataset (all are dates) in my Excel sheet.

09/08/08
09/05/08
09/04/08
09/02/08
09/01/08
29/08/2008
28/08/2008
27/08/2008
26/08/2008
25/08/2008
22/08/2008
21/08/2008
20/08/2008
18/08/2008
14/08/2008
13/08/2008
08/12/08
08/11/08
08/08/08
08/07/08

However I want to use R to compile those data to make all dates in same 
format. Can anyone please tell me any automated way for doing that?




Well you have to read them in as character first. Then use sub to make the 
two digit years into four digits. The following could probably be improved 
by a regular expression whiz, but works:



strngs <- c("06/05/08", "23/11/2008")
sub("([0-9][0-9]/[0-9][0-9]/)([0-9][0-9]$)", "\\120\\2", strngs)

[1] "06/05/2008" "23/11/2008"


David Scott



_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Memory allocation problem (during kmeans)

2008-09-09 Thread Peter Dalgaard
rami batal wrote:
 Dear all,

 I am trying to apply kmeans clusterring on a data file (size is about 300
 Mb)

 I read this file using

 x=read.table('file path' , sep= )

 then i do kmeans(x,25)

 but the process stops after two minutes with an error :

 Error: cannot allocate vector of size 907.3 Mb

 when i read the archive i notice that the best solution is to use a 64bit
 OS.

 Error messages beginning cannot allocate vector of size indicate a failure
 to obtain memory, either because the size exceeded the address-space limit
 for a process or, more likely, because the system was unable to provide the
 memory. Note that on a 32-bit OS there may well be enough free memory
 available, but not a large enough contiguous block of address space into
 which to map it. 

 the problem that I have two machines with two OS (32bit and 64bit) and when
 i used the 64bit OS the same error remains.

 Thank you if you have any suggestions to me and excuse me because i am a
 newbie.

 Here the default information for the 64bit os:

   
 sessionInfo()
 
 R version 2.7.1 (2008-06-23)
 x86_64-redhat-linux-gnu

   
 gc()
 
  used (Mb) gc trigger (Mb) max used (Mb)
 Ncells 137955  7.4 35 18.7   35 18.7
 Vcells 141455  1.1 786432  6.0   601347  4.6

 I tried also to start R using the options to control the available memory
 and the result still the same. or maybe i don't assign the correct values.

   
It might be a good idea first to work out what the actual memory
requirements are. 64 bits does not help if you are running out of RAM
(+swap).
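
A rough sketch of such an estimate (the dimensions are purely illustrative;
a numeric matrix needs about 8 bytes per cell, and kmeans() works on several
copies):

n <- 1e6; p <- 120
n * p * 8 / 2^20     # Mb for a single numeric copy, here about 915 Mb
object.size(x)       # what the loaded data actually occupies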

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I compute interactions with anova.mlm ?

2008-09-09 Thread Peter Dalgaard
Schadwinkel, Stefan wrote:
 Hi,

 I wish to compute multivariate test statistics for a within-subjects repeated 
 measures design with anova.mlm. 

 This works great if I only have two factors, but I don't know how to compute 
 interactions with more than two factors. 
 I suspect, I have to create a new grouping factor and then test with this 
 factor to get these interactions (as it is hinted in R News 2007/2), 
 but I don't really know how to use this approach. 

 Here is my current code:

 Two Factors: fac1, fac2

 mlmfit - lm(mydata~1)
 mlmfit0 - update(mlmfit, ~0)

 % test fac1, works, produces same output as SAS
 anova(mlmfit, mlmfit0, M = ~ fac1 + fac2, X = ~ fac2, idata = idata, test = 
 Wilks)

 % test fac1*fac2 interaction, also works, also the same output as SAS
 anova(mlmfit, mlmfit0, X = ~ fac1 + fac2, idata = idata, test = Wilks)



 Three Factors: fac1, fac2, fac3 

 mlmfit - lm(mydata~1)
 mlmfit0 - update(mlmfit, ~0)

 % test fac1, works, same as SAS
 anova(mlmfit, mlmfit0, M = ~ fac1 + fac2 + fac3, X = ~ fac2 + fac3, idata = 
 idata, test = Wilks)



 Now, I try to compute the interactions the same way, but this doesn't work:

 % fac1*fac2
 anova(mlmfit, mlmfit0, M = ~ fac1 + fac2 + fac3, X = ~ fac3, idata = idata, 
 test = Wilks) 

 % fac1*fac2*fac3
 anova(mlmfit, mlmfit0, X = ~ fac1 + fac2 + fac3, idata = idata, test = 
 Wilks)


 Both of these above differ quite much from the SAS output and I suspect, my 
 understanding of X and M is somewhat flawed. 

 I would be very happy, if someone could tell me how to compute the two 
 interactions above and an interaction of N factors in general.

   
You need to ensure that the difference between the X and M models is the
relevant interaction, so something like

M=~fac1*fac2*fac3
X=~fac1*fac2*fac3 - fac1:fac2:fac3

should test for fac1:fac2:fac3

If the within-subject design is fac1*fac2*fac3 with one observation per
cell (NB!), then you can omit M. X can also be written as
~fac1*fac2+fac2*fac3+fac1*fac3 or ~(fac1+fac2+fac3)^2.

For the next step, use, e.g.,

M=~fac1*fac2+fac2*fac3+fac1*fac3
X=~fac2*fac3+fac1*fac3

to test significance of fac1:fac2 (notice that the main effects are
still in X because of the meaning of the * operator in R).
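
Written out as complete calls, that would look something like this (a
sketch reusing mlmfit, mlmfit0 and idata from the question, and assuming one
observation per within-subject cell so M can be omitted in the first call):

anova(mlmfit, mlmfit0, X = ~ (fac1 + fac2 + fac3)^2,
      idata = idata, test = "Wilks")                    # fac1:fac2:fac3

anova(mlmfit, mlmfit0, M = ~ fac1*fac2 + fac2*fac3 + fac1*fac3,
      X = ~ fac2*fac3 + fac1*fac3,
      idata = idata, test = "Wilks")                    # fac1:fac2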


 I would also be interested in computing linear contrasts using the T matrix 
 and anova.mlm.

 Thank you very much,

 Stefan 

  

 --
 Stefan Schadwinkel, Dipl.-Inf.
 Neurologische Klinik
 Sektion Biomagnetismus
 Universität Heidelberg
 Im Neuenheimer Feld 400
 69120 Heidelberg

 Telefon:  06221 - 56 5196
 Email:[EMAIL PROTECTED] 

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] match problem by rownames

2008-09-09 Thread Charilaos Skiadas

As suggested in ?"[.data.frame", try:

dat[match('a1', rownames(dat)),]
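
which gives a row of NAs when there is no exact match, e.g. (with the data
from the question):

dat[match('a1', rownames(dat)), ]    # no such row name -> all-NA row
dat[match('a10', rownames(dat)), ]   # exact match -> the 'a10' row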


Haris Skiadas
Department of Mathematics and Computer Science
Hanover College

On Sep 9, 2008, at 2:41 AM, Xianming Wei wrote:


Hi all,

While dat['a1',] and dat['a10',] produce the same results in the
following example, I'd like dat['a1',] to return NAs.

dat <- data.frame(x1 = paste(letters[1:5],10, sep=''), x2=rnorm(5))
rownames(dat) <- dat$x1
dat['a1',]
dat['a10',]


sessionInfo()

R version 2.7.2 (2008-08-25)
i386-pc-mingw32

locale:
LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia. 
1252;LC_MON
ETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia. 
1252


attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base


other attached packages:
[1] lattice_0.17-13

loaded via a namespace (and not attached):
[1] grid_2.7.2




Regards,
Xianming



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to split a data framed with sequences

2008-09-09 Thread jim holtman
Is this what you want:

 my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35))
 split(my.df, c(0, cumsum(diff(my.df$a) < 0)))
$`0`
  a b
1 1 0.2655087
2 2 0.3721239
3 3 0.5728534
4 4 0.9082078
5 5 0.2016819

$`1`
a  b
6   1 0.89838968
7   2 0.94467527
8   3 0.66079779
9   4 0.62911404
10  5 0.06178627
11  6 0.20597457
12  7 0.17655675
13  8 0.68702285
14  9 0.38410372
15 10 0.76984142

$`2`
a  b
16  1 0.49769924
17  2 0.71761851
18  3 0.99190609
19  4 0.38003518
20  5 0.77744522
21  6 0.93470523
22  7 0.21214252
23  8 0.65167377
24  9 0.1210
25 10 0.26722067
26 11 0.38611409
27 12 0.01339033
28 13 0.38238796
29 14 0.86969085
30 15 0.34034900
31 16 0.48208012
32 17 0.59956583
33 18 0.49354131
34 19 0.18621760
35 20 0.82737332




On Tue, Sep 9, 2008 at 5:38 AM, David Carslaw
[EMAIL PROTECTED] wrote:

 Hi all,

 Given a data frame:

  my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35))

 I want to split it by a such that I end up with a list containing 3
 components i.e. the first containing a = 1 to 5, the second a = 1 to 10 etc.
 In other words, sets of sequences of a.

 I can't seem to find the right form using the split function - can you help?

 Much appreciated.

 David



 -
 Institute for Transport Studies
 University of Leeds
 --
 View this message in context: 
 http://www.nabble.com/how-to-split-a-data-framed-with-sequences-tp19388964p19388964.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to split a data framed with sequences

2008-09-09 Thread David Carslaw

Hi all,

Given a data frame:

my.df <- data.frame(a = c(1:5, 1:10, 1:20), b = runif(35))

I want to split it by a such that I end up with a list containing 3
components i.e. the first containing a = 1 to 5, the second a = 1 to 10 etc.
In other words, sets of sequences of a.

I can't seem to find the right form using the split function - can you help? 

Much appreciated.

David



-
Institute for Transport Studies
University of Leeds
-- 
View this message in context: 
http://www.nabble.com/how-to-split-a-data-framed-with-sequences-tp19388964p19388964.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] match problem by rownames

2008-09-09 Thread Dimitris Rizopoulos
try this:

dat <- data.frame(x1 = paste(letters[1:5],10, sep=''), x2=rnorm(5))
row.names(dat) <- dat$x1

dat['a1' %in% row.names(dat), ]
dat['a10'  %in% row.names(dat), ]


I hope it helps.

Best,
Dimitris


 Hi all,

 While dat['a1',] and dat['a10',] produce the same results in the
 following example, I'd like dat['a1',] to return NAs.

 dat - data.frame(x1 = paste(letters[1:5],10, sep=''), x2=rnorm(5))
 rownames(dat) - dat$x1
 dat['a1',]
 dat['a10',]

 sessionInfo()
 R version 2.7.2 (2008-08-25)
 i386-pc-mingw32

 locale:
 LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MON
 ETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base


 other attached packages:
 [1] lattice_0.17-13

 loaded via a namespace (and not attached):
 [1] grid_2.7.2


 Regards,
 Xianming



 DISCLAIMER:\ For details of our e-mail disclaimer, pleas...{{dropped:15}}

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043399
Fax: +31/(0)10/7044657

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] plotting group means

2008-09-09 Thread Erich Studerus
Hi all,

 

I want to plot the grouped means of some variables. The dependent variables
and the grouping factor are stored in different columns. I want to draw a
simple line-plot of means, in which the x-axis represents the variables and
y-axis represents the means. The means of the groups should be connected by
lines. So far, the only function I could find that comes close to what I'm
looking for is the error.bars.by function in the psych package. To see
what I'm looking for, just type:

 

library(psych)
x <- matrix(rnorm(500), ncol=20)
y <- sample(4, 25, replace=TRUE)
x <- x + y
error.bars.by(x, y, ci=0)

 

Now, I want to put a legend for the grouping factor of this graph. I also
would like to manipulate the linetypes and colors of the lines. I've read
the documentation, but it was not clear to me, how to do this. Are there
other plotting functions in R, which can do the same?

 

Erich

 



Erich Studerus
Lic. Phil. Klinische Psychologie
Psychiatric University Hospital Zurich
Division of Clinical Research
Lenggstr. 31
CH-8008 Zurich
Switzerland
Mail: [EMAIL PROTECTED]
Office: +41 44 384 26 66
Mobile: +41 76 563 31 54


 


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Question about multiple regression

2008-09-09 Thread Gustaf Rydevik
On Mon, Sep 8, 2008 at 7:47 PM, Dimitri Liakhovitski [EMAIL PROTECTED] wrote:
 Thank you everyone for your responses. I'll answer several questions.

 1.   Disclaimer: I have **NO IDEA** of the details of what you want
 to do or why
 -- but I am willing to bet that there are better ways of doing it than  1.8
 mm multiple refressions that take 270 secs each!! (which I find difficult to
 believe in itself -- are you sure you are doing things right? Something
 sounds very fishy here: R's regression code is typically very fast).
 I probably should not bore everyone, but just to explain where the
 large number is coming from. I have an experimental design with 7
 factors. Each factor has between 3 and 5 levels. Once you cross them
 all, you end up with 18,000 cells. For each cell, I want to generate a
 sample of N=100. For each sample I have to analyze the data using 3
 different statistical methods of analysis (the goal of the
 Monte-Carlo) is to compare those methods. One of the methods requires
 running of up to ~32,000 simple multiple regressions - yes just for
 one sample and it's not a mistake. I test-ran one such analysis for a
 sample with N=800 and 15 predictors and it took 270 seconds. R was
 actually very fast - it ran each of the individual regressions in
 about 0.008 seconds. Still I need something faster.

 2. Sorry - what was the formula sum(lm.fit(x,y)$residuals^2) for? For
 example, using it on my data, I got a value of 36,644...

 3. I know that for similarly challenging situations people did used
 Fortran compilers. So, anyone heard of a free Fortran library or an
 efficient piece of code?

 Thank you!
 Dimitri



Have you considered the fact that 32000 regressions simply takes a lot of time?
I don't really have anything to go by, but it sounds unlikely that you
will be able to cut computing time by more than, say, ten times to 27
second. That would still leave you with 4 months of running a
computer.
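
On the quoted point about lm.fit(): calling it on a prebuilt design matrix
does avoid the formula/model.frame overhead of lm(); a rough sketch with
stand-in random data (800 rows, 15 predictors as in the test run):

set.seed(1)
X <- cbind(1, matrix(rnorm(800 * 15), 800, 15))  # intercept + 15 predictors
y <- rnorm(800)
fit <- lm.fit(X, y)
rss <- sum(fit$residuals^2)   # residual sum of squares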

Perhaps an alternative approach would be to get access to stronger
(super)computers, either at a university, or buying access. A quick
googling turns up http://www.clusterondemand.com/ for example.

Anyhow, good luck with your project! I'm sure the R list would be very
interested to hear of how you solved your problem.

Regards,

Gustaf


-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compiling date

2008-09-09 Thread Henrique Dallazuanna
Try this:

strptime(x, ifelse(nchar(x) == 8, '%d/%m/%y', '%d/%m/%Y'))
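
Applied to the column read in as character, a sketch (it assumes day/month
order throughout and that two-digit years belong to %y):

x <- c("09/08/08", "29/08/2008")
d <- strptime(x, ifelse(nchar(x) == 8, '%d/%m/%y', '%d/%m/%Y'))
format(d, "%d/%m/%Y")
[1] "09/08/2008" "29/08/2008"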

On Tue, Sep 9, 2008 at 3:48 AM, Megh Dal [EMAIL PROTECTED] wrote:

 Hi,

 I have following kind of dataset (all are dates) in my Excel sheet.

 09/08/08
 09/05/08
 09/04/08
 09/02/08
 09/01/08
 29/08/2008
 28/08/2008
 27/08/2008
 26/08/2008
 25/08/2008
 22/08/2008
 21/08/2008
 20/08/2008
 18/08/2008
 14/08/2008
 13/08/2008
 08/12/08
 08/11/08
 08/08/08
 08/07/08

 However I want to use R to compile those data to make all dates in same
 format. Can anyone please tell me any automated way for doing that?

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compiling date

2008-09-09 Thread Dr Eberhard Lisse
Why not Format -> Cells in Excel?

el

on 9/9/08 1:03 PM Henrique Dallazuanna said the following:
 Try this:
 
 strptime(x, ifelse(nchar(x) == 8, '%d/%m/%y', '%d/%m/%Y'))
 
 On Tue, Sep 9, 2008 at 3:48 AM, Megh Dal [EMAIL PROTECTED] wrote:
 
 Hi,

 I have following kind of dataset (all are dates) in my Excel sheet.

 09/08/08
 09/05/08
 09/04/08
 09/02/08
 09/01/08
 29/08/2008
 28/08/2008
 27/08/2008
 26/08/2008
 25/08/2008
 22/08/2008
 21/08/2008
 20/08/2008
 18/08/2008
 14/08/2008
 13/08/2008
 08/12/08
 08/11/08
 08/08/08
 08/07/08

 However I want to use R to compile those data to make all dates in same
 format. Can anyone please tell me any automated way for doing that?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] exporting tapply objects to csv-files

2008-09-09 Thread Henrique Dallazuanna
Try creating a new object:

tb <- rbind(table(a), do.call(rbind.data.frame, tapply(a, b, table)))
names(tb) <- unique(a)

then write to csv by write.table.
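
For example (a sketch; the file name is arbitrary):

write.table(tb, file = "counts.csv", sep = ",", col.names = NA, quote = FALSE)
# col.names = NA writes an empty header cell for the row names, so the
# columns line up when the file is opened in Excel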

On Tue, Sep 9, 2008 at 5:48 AM, Kunzler, Andreas [EMAIL PROTECTED] wrote:

 Dear Everyone,

 I try to create a cvs-file with different results form the table function.

 Imagine a data-frame with two vectors a and b where b is of the class
 factor.

 I use the tapply function to count a for the different values of b.

 tapply(a,b,table)

 and I use the table function to have a look of the frequencies as a total

 table(a)

 I would like to put both results together in one txt or csv file that I can
 import to e.g. Excel.

 The export file should have a layout like

 1,2,3,4,5,6,7 (possible values of a)
 3,6,7,8,8,8,1 (Counts of a total)
 1,2,3,4,5,3,0 (Counts of a where b==A)
 2,4,4,4,3,5,1 (Counts of a where b==B)

 I tried to change the class of the table result to a matrix but I could not
 find a way to use the results of tapply. I use tapply because b has 15
 different values.

 Thanx

 Andreas Kunzler
 
 Bundeszahnärztekammer (BZÄK)
 Chausseestraße 13
 10115 Berlin

 Tel.: 030 40005-113
 Fax:  030 40005-119

 E-Mail: [EMAIL PROTECTED]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting group means

2008-09-09 Thread Chuck Cleland
On 9/9/2008 6:49 AM, Erich Studerus wrote:
 Hi all,
 
  
 
 I want to plot the grouped means of some variables. The dependent variables
 and the grouping factor are stored in different columns. I want to draw a
 simple line-plot of means, in which the x-axis represents the variables and
 y-axis represents the means. The means of the groups should be connected by
 lines. So far, the only function that I could find comes closest to what I'm
 looking for, is the error.bars.by-function in the psych-package. To know,
 what I'm looking for, just type:
 
  
 
 library(psych)
 x - matrix(rnorm(500),ncol=20)
 y - sample(4,25 ,replace=TRUE)
 x - x+y
 error.bars.by(x,y,ci=0)
 
  
 
 Now, I want to put a legend for the grouping factor of this graph. I also
 would like to manipulate the linetypes and colors of the lines. I've read
 the documentation, but it was not clear to me, how to do this. Are there
 other plotting functions in R, which can do the same?

  Here is an approach which uses xyplot() in the lattice package and
shows how to control line types and colors:

mydf <- data.frame(x=rep(paste("Group", 1:4, sep=""), 6),
   v=rep(paste("Variable", 1:6, sep=""), each=4),
   y=runif(24))

library(lattice)

xyplot(y ~ v, groups = x, data = mydf, type="b",
  xlab="Dependent Variables", ylab="Mean",
  auto.key=list(lines=TRUE, points=TRUE, space="right"),
  par.settings = list(superpose.symbol =
   list(pch=c(16,8,1,5),
col=c("black","red","green","blue"),
lty=c(1,2,3,4)),
  superpose.line =
   list(col=c("black","red","green","blue"),
lty=c(1,2,3,4))))

 Erich
 
  
 
 
 
 Erich Studerus
 Lic. Phil. Klinische Psychologie
 Psychiatric University Hospital Zurich
 Division of Clinical Research
 Lenggstr. 31
 CH-8008 Zurich
 Switzerland
 Mail: [EMAIL PROTECTED]
 Office: +41 44 384 26 66
 Mobile: +41 76 563 31 54
 
 
  
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code. 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting group means

2008-09-09 Thread ONKELINX, Thierry
Dear Erich,

Have a look at ggplot2

library(ggplot2)
dataset <- expand.grid(x = 1:20, y = factor(LETTERS[1:4]), value = 1:10)
dataset$value <- rnorm(nrow(dataset), sd = 0.5) + as.numeric(dataset$y)
plotdata <- aggregate(dataset$value, list(x = dataset$x, y = dataset$y),
mean)
plotdata <- merge(plotdata, aggregate(dataset$value, list(x = dataset$x,
y = dataset$y), sd))
plotdata$min <- plotdata$x.x - plotdata$x.y
plotdata$max <- plotdata$x.x + plotdata$x.y
ggplot(plotdata, aes(x = x, y = x.x, colour = y, min = min, max = max))
+ geom_pointrange() + geom_line() + geom_point()

HTH,

Thierry 




ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
[EMAIL PROTECTED] 
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On behalf of Erich Studerus
Sent: Tuesday, 9 September 2008 12:49
To: r-help@r-project.org
Subject: [R] plotting group means

Hi all,



I want to plot the grouped means of some variables. The dependent
variables
and the grouping factor are stored in different columns. I want to draw
a
simple line-plot of means, in which the x-axis represents the variables
and
y-axis represents the means. The means of the groups should be connected
by
lines. So far, the only function that I could find comes closest to what
I'm
looking for, is the error.bars.by-function in the psych-package. To
know,
what I'm looking for, just type:



library(psych)
x - matrix(rnorm(500),ncol=20)
y - sample(4,25 ,replace=TRUE)
x - x+y
error.bars.by(x,y,ci=0)



Now, I want to put a legend for the grouping factor of this graph. I
also
would like to manipulate the linetypes and colors of the lines. I've
read
the documentation, but it was not clear to me, how to do this. Are there
other plotting functions in R, which can do the same?



Erich





Erich Studerus
Lic. Phil. Klinische Psychologie
Psychiatric University Hospital Zurich
Division of Clinical Research
Lenggstr. 31
CH-8008 Zurich
Switzerland
Mail: [EMAIL PROTECTED]
Office: +41 44 384 26 66
Mobile: +41 76 563 31 54





[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Dit bericht en eventuele bijlagen geven enkel de visie van de schrijver weer en 
binden het INBO onder geen enkel beding, zolang dit bericht niet bevestigd is 
door een geldig ondertekend document. The views expressed in  this message and 
any annex are purely those of the writer and may not be regarded as stating an 
official position of INBO, as long as the message is not confirmed by a duly 
signed document.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] write dataframes

2008-09-09 Thread Williams, Robin
Hi,
Just a thought.
You wrote:
ob1 <- object1$ORF
ob2 <- object2$ORF
and then use cbind like,
HG <- cbind(on1,ob2)
but there is an error. Is there any other function I can use? 

  If you copied and pasted this from R, then your problem is 
Hg - cbind(on1,ob2)
  You mean 
Hg - cbind(ob1,ob2) 
  So perhaps just a typo.
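  Since the underlying problem is that cbind() cannot sensibly combine
vectors of different lengths, one workaround is to pad the shorter ORF
vector with NA before binding and writing. A rough sketch, assuming
object1 and object2 as in the original post (the file name is arbitrary):

ob1 <- as.character(object1$ORF)
ob2 <- as.character(object2$ORF)
n <- max(length(ob1), length(ob2))
length(ob1) <- n                  # assigning a longer length pads with NA
length(ob2) <- n
write.csv(data.frame(ob1, ob2), file = "orf_columns.csv",
          row.names = FALSE, na = "")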
HTH,
Robin Williams 
Met Office summer intern - Health Forecasting 
[EMAIL PROTECTED] 
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Roberto 
Olivares-Hernández
Sent: Tuesday, September 09, 2008 12:47 PM
To: r-help@r-project.org
Subject: [R] write dataframes

Hi,

After manipulate my data I have ended up with 5 different data frames with 
different number of observations but the same number of variables (columns)

An example, if I write str(object1), I see this,

data.frame':   47 obs. of  3 variables:
 $ ORF: Factor w/ 245 levels YAL038W,YAL054C,..: 10 19 38 39 44 
45 50 51 59 60 ...
 $ mRNA   : num  0.891 1.148 1.202 1.479 1.445 ...
 $ Protein: num  1.230 1.288 1.175 0.724 0.851 ..

str(object2)
'data.frame':   21 obs. of  3 variables:
 $ ORF: Factor w/ 245 levels YAL038W,YAL054C,..: 11 25 40 55 66 
78 104 119 141 153 ...
 $ mRNA   : num  0.794 0.741 0.676 1.047 0.912 ...
 $ Protein: num  0.427 0.363 0.468 0.501 0.661 ...

using the  column $ORF from each object , how can I  compose/write the results 
in a file that contains columns with different length ?

I have tried to generate objects like
ob1 <- object1$ORF
ob2 <- object2$ORF
and then use cbind like,
HG <- cbind(on1,ob2)
but there is an error. Is there any other function I can use?

Thanks for the help

Roberto

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] write dataframes

2008-09-09 Thread Roberto Olivares-Hernández

Hi,

After manipulating my data I have ended up with 5 different data frames
with different numbers of observations but the same

number of variables (columns)

An example, if I write str(object1), I see this,

data.frame':   47 obs. of  3 variables:
$ ORF: Factor w/ 245 levels YAL038W,YAL054C,..: 10 19 38 39 44 
45 50 51 59 60 ...

$ mRNA   : num  0.891 1.148 1.202 1.479 1.445 ...
$ Protein: num  1.230 1.288 1.175 0.724 0.851 ..

str(object2)
'data.frame':   21 obs. of  3 variables:
$ ORF: Factor w/ 245 levels YAL038W,YAL054C,..: 11 25 40 55 66 
78 104 119 141 153 ...

$ mRNA   : num  0.794 0.741 0.676 1.047 0.912 ...
$ Protein: num  0.427 0.363 0.468 0.501 0.661 ...

using the  column $ORF from each object , how can I  compose/write the 
results in a file that contains columns with different length ?


I have tried to generate objects like
ob1 <- object1$ORF
ob2 <- object2$ORF
and then use cbind like,
HG <- cbind(on1,ob2)
but there is an error. Is there any other function I can use?

Thanks for the help

Roberto

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compiling date

2008-09-09 Thread stephen sefick
this is day month year?
look at chron or maybe the easiest is to use excel to change the format

On Tue, Sep 9, 2008 at 7:12 AM, Dr Eberhard Lisse [EMAIL PROTECTED] wrote:
 Why not Format - Cell in Excell?

 el

 on 9/9/08 1:03 PM Henrique Dallazuanna said the following:
 Try this:

 strptime(x, ifelse(nchar(x) == 8, '%d/%m/%y', '%d/%m/%Y'))

 On Tue, Sep 9, 2008 at 3:48 AM, Megh Dal [EMAIL PROTECTED] wrote:

 Hi,

 I have following kind of dataset (all are dates) in my Excel sheet.

 09/08/08
 09/05/08
 09/04/08
 09/02/08
 09/01/08
 29/08/2008
 28/08/2008
 27/08/2008
 26/08/2008
 25/08/2008
 22/08/2008
 21/08/2008
 20/08/2008
 18/08/2008
 14/08/2008
 13/08/2008
 08/12/08
 08/11/08
 08/08/08
 08/07/08

 However I want to use R to compile those data to make all dates in same
 format. Can anyone please tell me any automated way for doing that?

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting group means

2008-09-09 Thread Jim Lemon

Hi Erich,
Have a look at brkdn.plot in the plotrix package.

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting group means

2008-09-09 Thread hadley wickham
On Tue, Sep 9, 2008 at 6:56 AM, ONKELINX, Thierry
[EMAIL PROTECTED] wrote:
 Dear Erich,

 Have a look at ggplot2

 library(ggplot2)
 dataset <- expand.grid(x = 1:20, y = factor(LETTERS[1:4]), value = 1:10)
 dataset$value <- rnorm(nrow(dataset), sd = 0.5) + as.numeric(dataset$y)

Or with stat_summary:

qplot(x, value, data=dataset, colour=y, group = y) +
stat_summary(geom = "line", fun = mean, size = 2)


-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R_USER - in which file should I include it?

2008-09-09 Thread Eduardo M. A. M.Mendes
Hello

Many thanks.  It works just fine.

How about the packages issue?  That is, same thing for the installation
path.

Cheers

Ed


-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Monday, September 08, 2008 10:01 PM
To: Eduardo M. A. M.Mendes
Cc: r-help@r-project.org
Subject: Re: [R] R_USER - in which file should I include it?

Try adding this at the end of your etc/Rprofile.site file.  That file
should already be there so you don't have to create it,
just edit it.

cat("Hello from Rprofile.site\n")
setwd("C:/Users/eduardo/Documents")

You may need to edit it as Administrator.   You should
see the Hello message in which case you will know that the
Rprofile.site file is being run.

That should work unless Tinn-R runs R in such a way as to
ignore Rprofile.site.

On Mon, Sep 8, 2008 at 8:11 PM, Eduardo M. A. M.Mendes
[EMAIL PROTECTED] wrote:
 Hello

 I am not sure whether R starts from the same dir.  For instance:

 a) if I double-click on R-2.7.2 icon and then issue the command getwd(),
the
 result is:

 getwd()
 [1] C:/Users/eduardo/Documents

 b) If R starts from within Tinn-R, the result is:

 getwd()
 [1] C:/Program Files/R/R-2/bin

 I want that no matter which calling R method I am using if I issue the
 command getwd() (first command) the result is:

 C:/Users/eduardo/Documents/R


 Moreover all new packages go to C:/Users/eduardo/Documents/R/win-library


 Thanks

 Ed

 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Monday, September 08, 2008 8:57 PM
 To: Eduardo M. A. M.Mendes
 Cc: r-help@r-project.org
 Subject: Re: [R] R_USER - in which file should I include it?

 Could you explain more clearly what you mean by the same?
 Do you mean that each time you click on R 2.7.2 icon on your
 desktop that running this from the R console:

 getwd()

 is the same directory on each startup?  Isn't that already the case?
 I don't think you need to set any environment variables at all.  If
 you don't set
 any environment variables then what specifically is happening that
 you don't want to happen?

 On Mon, Sep 8, 2008 at 7:10 PM, Eduardo M. A. M.Mendes
 [EMAIL PROTECTED] wrote:
 Hello

 I am a newbie.  I had my R upgraded from 2.7.1 to 2.7.2 and in doing so I
 decided to install all 2.7 versions under c:\program files\R\2.7 from now
 on (2.7.1 is located under .\2.7.1)

 Although I don't like the idea (I am running Vista), I have edited
 etc\Renviron.site to contain:


 R_USER=c:/Users/eduardo/Documents/R
 R_LIBS_USER=c:/Users/eduardo/Documents/R/win-library/2.7

 As far as R starting always from the same location, that is,
 c:/Users/eduardo/Documents/R, etc\Renviron.site didn't help.  So I wonder
 whether someone from the list could help me to:

 a) force R to start always from the same location
 b) force R to install all new packages in the same location


 Many thanks

 Ed

 PS. Before sending this email, I read windows FAQ and browsed the
archives
 (too many posts in the subject!).

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Hardwarefor R cpu 64 vs 32, dual vs quad

2008-09-09 Thread Nic Larson
Need to buy a fast computer for running R on. Today we use a 2.8 GHz Intel D CPU
and the calculations take around 15 days. Is it possible to get the same
calculations down to minutes/hours by only changing the hardware?
Should I go for an really fast dual 32 bit cpu and run R over linux or xp or
go for an quad core / 64 bit cpu?
Is it effective to run R on 64 bit (and problem free
(running/installing))???
Have around 2000-3000 euro to spend
Thanx for any tip

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting group means

2008-09-09 Thread Erich Studerus
Thanks for all the suggestions, but it seems, that all these functions need
a rearrangement of my data, since in my case, the dependent variables are in
different columns. The error.bars.by-function seems to be the only plotting
function, that does not need a rearrangement. Are there other functions,
which can do that or is there an easy way to rearrange the columns into one?

Thanks

Erich


-Original Message-
From: hadley wickham [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, 9 September 2008 15:02
To: ONKELINX, Thierry
Cc: Erich Studerus; r-help@r-project.org
Subject: Re: [R] plotting group means

On Tue, Sep 9, 2008 at 6:56 AM, ONKELINX, Thierry
[EMAIL PROTECTED] wrote:
 Dear Erich,

 Have a look at ggplot2

 library(ggplot2)
 dataset <- expand.grid(x = 1:20, y = factor(LETTERS[1:4]), value = 1:10)
 dataset$value <- rnorm(nrow(dataset), sd = 0.5) + as.numeric(dataset$y)

Or with stat_summary:

qplot(x, value, data=dataset, colour=y, group = y) +
stat_summary(geom = "line", fun = mean, size = 2)


-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R_USER - in which file should I include it?

2008-09-09 Thread Gabor Grothendieck
You might look at ?.libPaths
(note the dot) and play around with adding a .libPaths command
to your Rprofile.site and again you may need Administrator rights
when editing it.  If that does not help then you can try clarifying
the problem.   In particular what the same refers to and what
is happening now and what you want to happen.

On Tue, Sep 9, 2008 at 9:14 AM, Eduardo M. A. M.Mendes
[EMAIL PROTECTED] wrote:
 Hello

 Many thanks.  It works just fine.

 How about the packages issue?  That is, same thing for the installation
 path.

 Cheers

 Ed


 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Monday, September 08, 2008 10:01 PM
 To: Eduardo M. A. M.Mendes
 Cc: r-help@r-project.org
 Subject: Re: [R] R_USER - in which file should I include it?

 Try adding this at the end of your etc/Rprofile.site file.  That file
 should already be there so you don't have to create it,
 just edit it.

 cat("Hello from Rprofile.site\n")
 setwd("C:/Users/eduardo/Documents")

 You may need to edit it as Administrator.   You should
 see the Hello message in which case you will know that the
 Rprofile.site file is being run.

 That should work unless Tinn-R runs R in such a way as to
 ignore Rprofile.site.

 On Mon, Sep 8, 2008 at 8:11 PM, Eduardo M. A. M.Mendes
 [EMAIL PROTECTED] wrote:
 Hello

 I am not sure whether R starts from the same dir.  For instance:

 a) if I double-click on R-2.7.2 icon and then issue the command getwd(),
 the
 result is:

 getwd()
 [1] C:/Users/eduardo/Documents

 b) If R starts from within Tinn-R, the result is:

 getwd()
 [1] C:/Program Files/R/R-2/bin

 I want that no matter which calling R method I am using if I issue the
 command getwd() (first command) the result is:

 C:/Users/eduardo/Documents/R


 Moreover all new packages go to C:/Users/eduardo/Documents/R/win-library


 Thanks

 Ed

 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Monday, September 08, 2008 8:57 PM
 To: Eduardo M. A. M.Mendes
 Cc: r-help@r-project.org
 Subject: Re: [R] R_USER - in which file should I include it?

 Could you explain more clearly what you mean by the same?
 Do you mean that each time you click on R 2.7.2 icon on your
 desktop that running this from the R console:

 getwd()

 is the same directory on each startup?  Isn't that already the case?
 I don't think you need to set any environment variables at all.  If
 you don't set
 any environment variables then what specifically is happening that
 you don't want to happen?

 On Mon, Sep 8, 2008 at 7:10 PM, Eduardo M. A. M.Mendes
 [EMAIL PROTECTED] wrote:
 Hello

 I am a newbie.  I had my R upgraded from 2.7.1 to 2.7.2 and in doing so I
 decided to install all 2.7 versions under c:\program files\R\2.7 from now
 on (2.7.1 is located under .\2.7.1)

 Although I don't like the idea (I am running Vista), I have edited
 etc\Renviron.site to contain:


 R_USER=c:/Users/eduardo/Documents/R
 R_LIBS_USER=c:/Users/eduardo/Documents/R/win-library/2.7

 As far as R starting always from the same location, that is,
 c:/Users/eduardo/Documents/R, etc\Renviron.site didn't help.  So I wonder
 whether someone from the list could help me to:

 a) force R to start always from the same location
 b) force R to install all new packages in the same location


 Many thanks

 Ed

 PS. Before sending this email, I read windows FAQ and browsed the
 archives
 (too many posts in the subject!).

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] PCA and % variance explained

2008-09-09 Thread pgseye

After doing a PCA using princomp, how do you view how much each component
contributes to variance in the dataset. I'm still quite new to the theory of
PCA - I have a little idea about eigenvectors and eigenvalues (these
determine the variance explained?). Are the eigenvalues related to loadings
in R?

Thanks,

Paul
-- 
View this message in context: 
http://www.nabble.com/PCA-and---variance-explained-tp19388970p19388970.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotting group means

2008-09-09 Thread hadley wickham
On Tue, Sep 9, 2008 at 8:38 AM, Erich Studerus
[EMAIL PROTECTED] wrote:
 Thanks for all the suggestions, but it seems, that all these functions need
 a rearrangement of my data, since in my case, the dependent variables are in
 different columns. The error.bars.by-function seems to be the only plotting
 function, that does not need a rearrangement. Are there other functions,
 which can do that or is there an easy way to rearrange the columns into one?

Try:

library(reshape)
melt(x)
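
A slightly fuller sketch of that route (the wide data frame here is made
up): melt() stacks the separate dependent-variable columns into one value
column, after which the group means can be computed and plotted.

library(reshape)
library(ggplot2)
x <- data.frame(group = rep(c("A", "B"), each = 10),
                v1 = rnorm(20), v2 = rnorm(20), v3 = rnorm(20))
long <- melt(x, id.vars = "group")      # columns: group, variable, value
means <- aggregate(long$value,
                   list(variable = long$variable, group = long$group), mean)
names(means)[3] <- "mean"
ggplot(means, aes(x = variable, y = mean, colour = group, group = group)) +
    geom_line() + geom_point()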

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Vorticity and Divergence

2008-09-09 Thread Ravi Varadhan
Both vorticity and divergence are defined in terms of partial derivatives.
You can compute these derivatives using the `grad' function in numDeriv
package.

U <- function(X) { <your U function> }
V <- function(X) { <your V function> }
# where X = c(x,y)

library(numDeriv)

grU <- function(X) grad(X, func=U)
grV <- function(X) grad(X, func=V)

# For a 2-dimensional vector field

vorticity <- function(X) grV(X)[2] - grU(X)[1]
divergence <- function(X) grU(X)[1] + grV(X)[2]


# Here is an example:

U <- function(X) X[1]^2 + X[1] * X[2]
V <- function(X) X[2]^2 - X[1] * X[2]

 vorticity(c(2,1))
[1] -5
 divergence(c(2,1))
[1] 5


Does this help?

Ravi.

---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Igor Oliveira
Sent: Monday, September 08, 2008 11:37 AM
To: r-help@r-project.org
Subject: [R] Vorticity and Divergence

Hi all,

I have some wind data (U and V components) and I would like to compute
Vorticity and Divergence of these fields. Is there any R function that can
easily do that?

Thanks in advance for any help

Igor Oliveira
CSAG, Dept. Environmental  Geographical Science, University of Cape Town,
Private Bag X3, Rondebosch 7701. Tel.: +27 (0)21 650 5774
South Africa Fax: +27 (0)21 650 5773
  http:///www.csag.uct.ac.za/~igor

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question

2008-09-09 Thread Veronique.Pinard
Hi,
 
I'm trying to verify the assumption of homogeneity of variance of residuals in 
an ANOVA with levene.test. I don't know how to define the groups. I have 3 
factors : A, B and C(AxB).
 
What do I have to change or to add in the command to set that I'm working with 
the residuals and to set the groups?
 
library(car)
attach(anova.sns2)
levene.test(residuals ~ ???)
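
In case it helps, a rough, untested sketch: it assumes the fitted model is
response ~ A * B, that the data live in anova.sns2, and that car's
levene.test() accepts a response vector plus a grouping factor.

library(car)
fit <- aov(response ~ A * B, data = anova.sns2)
grp <- interaction(anova.sns2$A, anova.sns2$B)   # one group per A-by-B cell
levene.test(residuals(fit), grp)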


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R_USER - in which file should I include it?

2008-09-09 Thread Eduardo M. A. M.Mendes
Many thanks. I shall look at it. In case I run into trouble again, I'll try
to clarify the the same.

Ed
 

-Original Message-
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, September 09, 2008 10:46 AM
To: Eduardo M. A. M.Mendes
Cc: r-help@r-project.org
Subject: Re: [R] R_USER - in which file should I include it?

You might look at ?.libPaths
(note the dot) and play around with adding a .libPaths command
to your Rprofile.site and again you may need Administrator rights
when editing it.  If that does not help then you can try clarifying
the problem.   In particular what the same refers to and what
is happening now and what you want to happen.

On Tue, Sep 9, 2008 at 9:14 AM, Eduardo M. A. M.Mendes
[EMAIL PROTECTED] wrote:
 Hello

 Many thanks.  It works just fine.

 How about the packages issue?  That is, same thing for the installation
 path.

 Cheers

 Ed


 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Monday, September 08, 2008 10:01 PM
 To: Eduardo M. A. M.Mendes
 Cc: r-help@r-project.org
 Subject: Re: [R] R_USER - in which file should I include it?

 Try adding this at the end of your etc/Rprofile.site file.  That file
 should already be there so you don't have to create it,
 just edit it.

 cat("Hello from Rprofile.site\n")
 setwd("C:/Users/eduardo/Documents")

 You may need to edit it as Administrator.   You should
 see the Hello message in which case you will know that the
 Rprofile.site file is being run.

 That should work unless Tinn-R runs R in such a way as to
 ignore Rprofile.site.

 On Mon, Sep 8, 2008 at 8:11 PM, Eduardo M. A. M.Mendes
 [EMAIL PROTECTED] wrote:
 Hello

 I am not sure whether R starts from the same dir.  For instance:

 a) if I double-click on R-2.7.2 icon and then issue the command getwd(),
 the
 result is:

 getwd()
 [1] C:/Users/eduardo/Documents

 b) If R starts from within Tinn-R, the result is:

 getwd()
 [1] C:/Program Files/R/R-2/bin

 I want that no matter which calling R method I am using if I issue the
 command getwd() (first command) the result is:

 C:/Users/eduardo/Documents/R


 Moreover all new packages go to
C:/Users/eduardo/Documents/R/win-library


 Thanks

 Ed

 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Monday, September 08, 2008 8:57 PM
 To: Eduardo M. A. M.Mendes
 Cc: r-help@r-project.org
 Subject: Re: [R] R_USER - in which file should I include it?

 Could you explain more clearly what you mean by the same?
 Do you mean that each time you click on R 2.7.2 icon on your
 desktop that running this from the R console:

 getwd()

 is the same directory on each startup?  Isn't that already the case?
 I don't think you need to set any environment variables at all.  If
 you don't set
 any environment variables then what specifically is happening that
 you don't want to happen?

 On Mon, Sep 8, 2008 at 7:10 PM, Eduardo M. A. M.Mendes
 [EMAIL PROTECTED] wrote:
 Hello

 I am a newbie.  I had my R upgraded from 2.7.1 to 2.7.2 and in doing so
I
 decided to install all 2.7 versions under c:\program files\R\2.7 from
now
 on (2.7.1 is located under .\2.7.1)

 Although I don't like the idea (I am running Vista), I have edited
 etc\Renviron.site to contain:


 R_USER=c:/Users/eduardo/Documents/R
 R_LIBS_USER=c:/Users/eduardo/Documents/R/win-library/2.7

 As far as R starting always from the same location, that is,
 c:/Users/eduardo/Documents/R, etc\Renviron.site didn't help.  So I
wonder
 whether someone from the list could help me to:

 a) force R to start always from the same location
 b) force R to install all new packages in the same location


 Many thanks

 Ed

 PS. Before sending this email, I read windows FAQ and browsed the
 archives
 (too many posts in the subject!).

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Gumbell distribution - minimum case

2008-09-09 Thread Aaron Mackey
If you mean you want an EVD with a fat left tail (instead of a fat
right tail), then can't you just multiply all the values by -1 to
reverse the distribution?  A new location parameter could then shift
the distribution wherever you want along the number line ...
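
For instance, a minimal sketch along those lines using the Gumbel(-maximum)
sampler in evd (the parameter values here are arbitrary):

library(evd)
loc <- 2; sc <- 0.5
x <- loc - rgumbel(1000, loc = 0, scale = sc)  # negate a Gumbel-max sample, then shift
hist(x, breaks = 40)                           # heavy left tail, i.e. a Gumbel-minimum shape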

-Aaron

On Mon, Sep 8, 2008 at 5:22 PM, Richard Gwozdz [EMAIL PROTECTED] wrote:
 Hello,

 I would like to sample from a Gumbell (minimum) distribution.  I have
 installed package {evd} but the Gumbell functions there appear to refer to
 the maximum case.  Unfortunately, setting the scale parameter negative does
 not appear to work.

 Is there a separate package for the Gumbell minimum?


 --
 _
 Rich Gwozdz
 Fire and Mountain Ecology Lab
 College of Forest Resources
 University of Washington
 cell: 206-769-6808 office: 206-543-9138
 [EMAIL PROTECTED]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How does predict.lm work?

2008-09-09 Thread Williams, Robin
Hi, 
  Please could someone explain how this element of predict.lm works?
From the help file 
`
newdata   
An optional data frame in which to look for variables with which to
predict. If omitted, the fitted values are used.
' 
  Does this dataframe (newdata) need to have the same variable names as
was used in the original data frame used to fit the model? Or will R
just look across consecutive columns of newdata, and apply them to the
call as appropriate?
  For example, if I have fitted a model with four variables
(x1,x2,x3,x4) in my original dataframe, and then have a second dataframe
which I want to supply to the newdata argument in predict.lm with
variable names (x5, x6, x7, x8), do I need to change the variable names
in my newdata dataframe to match those of the original dataframe? Or
will R treat x5 as x1, x6 as x2, etc, when using predict.lm? 
  I would like to know so that I can design the structure of some
somewhat larger dataframes in a manner which will make using predict.lm
straight forward and quick.
Hope this makes sense.
Many thanks for any help. 
   Robin Williams
Met Office summer intern - Health Forecasting
[EMAIL PROTECTED] 
 

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How does predict.lm work?

2008-09-09 Thread Gabor Grothendieck
Just try it:

> BOD # built in data frame
  Time demand
1    1    8.3
2    2   10.3
3    3   19.0
4    4   16.0
5    5   15.6
6    7   19.8
> BOD.lm <- lm(demand ~ Time, BOD)
> predict(BOD.lm, list(Time = 10))
       1 
25.73571 
> predict(BOD.lm, list(10))
Error in eval(expr, envir, enclos) : object "Time" not found


On Tue, Sep 9, 2008 at 10:59 AM, Williams, Robin
[EMAIL PROTECTED] wrote:
 Hi,
  Please could someone explain how this element of predict.lm works?
 From the help file
 `
 newdata
 An optional data frame in which to look for variables with which to
 predict. If omitted, the fitted values are used.
 '
  Does this dataframe (newdata) need to have the same variable names as
 was used in the original data frame used to fit the model? Or will R
 just look across consecutive columns of newdata, and apply them to the
 call as appropriate?
  For example, if I have fitted a model with four variables
 (x1,x2,x3,x4) in my original dataframe, and then have a second dataframe
 which I want to supply to the newdata argument in predict.lm with
 variable names (x5, x6, x7, x8), do I need to change the variable names
 in my newdata dataframe to match those of the original dataframe? Or
 will R treat x5 as x1, x6 as x2, etc, when using predict.lm?
  I would like to know so that I can design the structure of some
 somewhat larger dataframes in a manner which will make using predict.lm
 straight forward and quick.
 Hope this makes sense.
 Many thanks for any help.
   Robin Williams
 Met Office summer intern - Health Forecasting
 [EMAIL PROTECTED]


[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How does predict.lm work?

2008-09-09 Thread Marc Schwartz
on 09/09/2008 09:59 AM Williams, Robin wrote:
 Hi, 
   Please could someone explain how this element of predict.lm works?
From the help file 
 `
 newdata 
 An optional data frame in which to look for variables with which to
 predict. If omitted, the fitted values are used.
 ' 
   Does this dataframe (newdata) need to have the same variable names as
 was used in the original data frame used to fit the model? 

Yes. Also, see the Note in ?predict.lm:

Variables are first looked for in newdata and then searched for in the
usual way (which will include the environment of the formula used in the
fit). A warning will be given if the variables found are not of the same
length as those in newdata if it was supplied.


It also says "Variables", not "columns".

 Or will R
 just look across consecutive columns of newdata, and apply them to the
 call as appropriate?

No.

   For example, if I have fitted a model with four variables
 (x1,x2,x3,x4) in my original dataframe, and then have a second dataframe
 which I want to supply to the newdata argument in predict.lm with
 variable names (x5, x6, x7, x8), do I need to change the variable names
 in my newdata dataframe to match those of the original dataframe? 

Yes.

 Or
 will R treat x5 as x1, x6 as x2, etc, when using predict.lm? 
   I would like to know so that I can design the structure of some
 somewhat larger dataframes in a manner which will make using predict.lm
 straight forward and quick.
 Hope this makes sense.
 Many thanks for any help. 
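
A minimal sketch of the renaming step (all object and variable names here
are hypothetical):

old.df <- data.frame(x1 = rnorm(20), x2 = rnorm(20))
old.df$y <- 1 + 2 * old.df$x1 - old.df$x2 + rnorm(20)
fit <- lm(y ~ x1 + x2, data = old.df)
new.df <- data.frame(x5 = c(0, 1), x6 = c(1, 2))  # columns named differently
names(new.df) <- c("x1", "x2")                    # rename to match the fitted formula
predict(fit, newdata = new.df)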

HTH,

Marc Schwartz

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] printing all rows

2008-09-09 Thread ANJAN PURKAYASTHA
Hi,
my data table has 38939 rows. R prints the first 11118 rows and then
prints an error message: [ reached getOption("max.print") -- omitted 27821
rows ].
is it possible to set the maxprint parameter so that R prints all the rows?

tia,
anjan

-- 
=
anjan purkayastha, phd
bioinformatics analyst
whitehead institute for biomedical research
nine cambridge center
cambridge, ma 02142

purkayas [at] wi [dot] mit [dot] edu
703.740.6939

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Hardwarefor R cpu 64 vs 32, dual vs quad

2008-09-09 Thread Prof Brian Ripley

On Tue, 9 Sep 2008, Nic Larson wrote:


Need to buy fast computer for running R on. Today we use 2,8 MHz intel D cpu
and the calculations takes around 15 days. Is it possible to get the same
calculations down to minutes/hours by only changing the hardware?


No: you would need to arrange to parallelize the computations.  I'd be 
surprised if you got a computer within your budget that was 3x faster on a 
single CPU than your current one, and R will only use (unaided) one CPU 
for most tasks (the exception being some matrix algebra).
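
If the job does split into independent pieces, a minimal sketch of the kind
of restructuring involved, using the snow package (the workload here is
only a stand-in):

library(snow)
cl <- makeCluster(4, type = "SOCK")      # one worker per core
work <- function(seed) { set.seed(seed); mean(replicate(100, mean(rnorm(1e4)))) }
res <- parLapply(cl, 1:8, work)          # 8 independent pieces run in parallel
stopCluster(cl)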



Should I go for an really fast dual 32 bit cpu and run R over linux or xp or
go for an quad core / 64 bit cpu?
Is it effective to run R on 64 bit (and problem free
(running/installing))???


All answered in the R-admin manual, so please RTFM.


Have around 2000-3000 euro to spend
Thanx for any tip

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] puzzle about contrasts

2008-09-09 Thread Kenneth Knoblauch

Hi,

I'm trying to redefine the contrasts for a linear model.
With a 2 level factor, x, with levels A and B, a two level
factor outputs A and B - A from an lm fit, say
lm(y ~ x). I would like to set the contrasts so that
the coefficients output are -0.5 (A + B) and B - A,
but I can't get the sign correct for the first coefficient
(Intercept).

Here is a toy example,

set.seed(12161952)
y <- rnorm(10)
x <- factor(rep(letters[1:2], each = 5))
##  so  A and B =
tapply(y, x, mean)

a  b
-0.719  0.8323837

## and with treatment contrasts
coef(lm(y ~ x))  ## A and B - A

(Intercept)  xb
 -0.719   1.5522724

Then, I try to redefine the contrasts

### would like contrasts: -0.5 (A + B) and B - A
D1 <- matrix( c(-0.5, -0.5,
-1, 1),
2, 2, byrow = TRUE)
C1 <- solve(D1)
Cnt <- C1[, -1]
contrasts(x) <- Cnt
coef(lm(y ~ x))

(Intercept)  x1
 0.05624745  1.55227241

but note that the desired value is
-0.5 * sum(tapply(y, x, mean))

[1] -0.05624745

I note that the first column of C1 is -1's not +1's
and that working by hand, if I tamper with the model matrix

mm <- model.matrix(y ~ x)
mm[, 1] <- -1

mm
   (Intercept)   x1
1   -1 -0.5
2   -1 -0.5
3   -1 -0.5
4   -1 -0.5
5   -1 -0.5
6   -1  0.5
7   -1  0.5
8   -1  0.5
9   -1  0.5
10  -1  0.5
attr(,"assign")
[1] 0 1
attr(,"contrasts")
attr(,"contrasts")$x
  [,1]
a -0.5
b  0.5

solve(t(mm) %*% mm) %*% t(mm) %*% y  ##Yes, I know. Use QR
   [,1]
(Intercept) -0.05624745
x1   1.55227241

gives the correct sign.

So, I guess my question reduces to how one would set the
contrasts for the model.matrix to be correct
for this to work out correctly?

Thank you.

Ken


--
Ken Knoblauch
Inserm U846
Institut Cellule Souche et Cerveau
Département Neurosciences Intégratives
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] exporting tapply objects to csv-files

2008-09-09 Thread hadley wickham
On Tue, Sep 9, 2008 at 3:48 AM, Kunzler, Andreas [EMAIL PROTECTED] wrote:
 Dear Everyone,

 I try to create a csv-file with different results from the table function.

 Imagine a data-frame with two vectors a and b where b is of the class factor.

 I use the tapply function to count a for the different values of b.

 tapply(a,b,table)

 and I use the table function to have a look of the frequencies as a total

 table(a)

 I would like to put both results together in one txt or csv file that I can 
 import to e.g. Excel.

 The export file should have a layout like

 1,2,3,4,5,6,7 (possible values of a)
 3,6,7,8,8,8,1 (Counts of a total)
 1,2,3,4,5,3,0 (Counts of a where b==A)
 2,4,4,4,3,5,1 (Counts of a where b==B)

 I tried to change the class of the table result to a matrix but I could not 
 find a way to use the results of tapply. I use tapply because b has 15 
 different values.

An alternative would be to use reshape (http://had.co.nz/reshape):

mydf <- data.frame( a = sample(7, 100, rep = T), b =
sample(letters[1:15], 100, rep = T))

library(reshape)
mydf$value <- 1
cast(mydf, b ~ a, sum, margins = "row.major", fill = 0)
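
To finish the round trip into a file Excel can open, the cast() result can
be written out directly (the file name is arbitrary):

out <- cast(mydf, b ~ a, sum, fill = 0)
write.csv(out, file = "counts_by_b.csv", row.names = FALSE)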

Regards,

Hadley

-- 
http://had.co.nz/

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] PCA and % variance explained

2008-09-09 Thread ngottlieb
I did PCA stuff years ago; there is a thing called a scree score
which will give an indication of the number of PCs and the variance
explained.

Might want to web search on scree score and PCA.
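
For example, a minimal sketch on a built-in data set: summary() reports the
proportion of variance per component, and the squared component standard
deviations are the eigenvalues.

pc <- princomp(USArrests, cor = TRUE)
summary(pc)                    # standard deviations and proportion of variance
eig <- pc$sdev^2               # eigenvalues of the correlation matrix
eig / sum(eig)                 # proportion of variance, computed by hand
screeplot(pc, type = "lines")  # the scree plot mentioned above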



-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of pgseye
Sent: Tuesday, September 09, 2008 5:39 AM
To: r-help@r-project.org
Subject: [R] PCA and % variance explained


After doing a PCA using princomp, how do you view how much each
component contributes to variance in the dataset. I'm still quite new to
the theory of PCA - I have a little idea about eigenvectors and
eigenvalues (these determine the variance explained?). Are the
eigenvalues related to loadings in R?

Thanks,

Paul
--
View this message in context:
http://www.nabble.com/PCA-and---variance-explained-tp19388970p19388970.h
tml
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




This information is being sent at the recipient's reques...{{dropped:16}}

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Linear Modeling the best alternative

2008-09-09 Thread stephen sefick
I have a data set of mean velocity, discharge, and mean depth.  I need
to find out which model best fits them out of log linear, linear, some
other kind of model...  Using excel I have found that linear is not
that bad and log10(discharge) vs. the other two variables (I am trying
to predict velocity and depth from discharge) is not that bad either.
How do I test and see which one of these models is better...  better
R-squared...  I know this is a stats question and not particularly an
R question, but I will use R for the models vetting process.
any ideas would be greatly appreciated,
-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] puzzle about contrasts

2008-09-09 Thread Prof Brian Ripley

-0.5*(A+B) is not a contrast, which is the seat of your puzzlement.

All you can get from y ~ x is an intercept (a column of ones) and a single 
'contrast' column for 'x'.


If you use y ~ 0+x you can get two columns for 'x', but R does not give 
you an option of what columns in the case: see the source of contrasts(). 
So you would need to replace contrasts(), which I think will be hard as 
model.matrix.default will look in the 'stats' namespace.  It would 
probably be easier to create the model matrix yourself.
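
A minimal sketch of that last route, reusing the toy y and x from the post
quoted below (the column labels are only chosen to match the desired
parameterization):

mm <- cbind("-(A+B)/2" = -1, "B-A" = ifelse(x == "b", 0.5, -0.5))
lm(y ~ mm - 1)   # coefficients are -(A+B)/2 and B - A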


On Tue, 9 Sep 2008, Kenneth Knoblauch wrote:


Hi,

I'm trying to redefine the contrasts for a linear model.
With a 2 level factor, x, with levels A and B, a two level
factor outputs A and B - A from an lm fit, say
lm(y ~ x). I would like to set the contrasts so that
the coefficients output are -0.5 (A + B) and B - A,
but I can't get the sign correct for the first coefficient
(Intercept).

Here is a toy example,

set.seed(12161952)
y - rnorm(10)
x - factor(rep(letters[1:2], each = 5))
##  so  A and B =
tapply(y, x, mean)

  a  b
-0.719  0.8323837

## and with treatment contrasts
coef(lm(y ~ x))  ## A and B - A

(Intercept)  xb
-0.719   1.5522724

Then, I try to redefine the contrasts

### would like contrasts: -0.5 (A + B) and B - A
D1 - matrix( c(-0.5, -0.5,
-1, 1),
2, 2, byrow = TRUE)
C1 - solve(D1)
Cnt - C1[, -1]
contrasts(x) - Cnt
coef(lm(y ~ x))

(Intercept)  x1
0.05624745  1.55227241

but note that the desired value is
-0.5 * sum(tapply(y, x, mean))

[1] -0.05624745

I note that the first column of C1 is -1's not +1's
and that working by hand, if I tamper with the model matrix

mm - model.matrix(y ~ x)
mm[, 1] - -1

mm
 (Intercept)   x1
1   -1 -0.5
2   -1 -0.5
3   -1 -0.5
4   -1 -0.5
5   -1 -0.5
6   -1  0.5
7   -1  0.5
8   -1  0.5
9   -1  0.5
10  -1  0.5
attr(,assign)
[1] 0 1
attr(,contrasts)
attr(,contrasts)$x
[,1]
a -0.5
b  0.5

solve(t(mm) %*% mm) %*% t(mm) %*% y  ##Yes, I know. Use QR
 [,1]
(Intercept) -0.05624745
x1   1.55227241

gives the correct sign.

So, I guess my question reduces to how one would set the
contrasts for the model.matrix to be correct
for this to work out correctly?

Thank you.

Ken


--
Ken Knoblauch
Inserm U846
Institut Cellule Souche et Cerveau
Département Neurosciences Intégratives
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] passing graph image data from remote Rserve

2008-09-09 Thread Patil, Prasad
Hello,

 

I am using Rserve to create a dedicated computational back-engine. I
generate and pass an array of data to a java application on a separate
server. I was wondering if the same is possible for an image. I believe
that Rserve supports passing certain R objects and JRclient can cast
these objects into their Java counterparts. If I generate a barplot in R
(remotely), can I pass the graph image back to the Java application for
display? Currently, I am reduced to saving the graph as a .pdf locally,
passing the .pdf's filepath to the Java application and allowing the
application access to the file, which is not an ideal structure.

 

Thanks for your help,

 

Prasad


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Modality Test

2008-09-09 Thread Amin W. Mugera

Dear Readers:

I have two issues in nonparametric statistical analysis that i need
help:

First, does R have a package that can implement the multimodality test,
e.g., the Silverman test, DIP test, MAP test or Runt test. I have seen
an earlier thread (sometime in 2003) where someone was trying to write
a code for the Silverman test of multimodality. Is there any other
tests that can enable me to know how many modes are in a distribution?

Second, i would like to test whether two distributions are equal. Does R
have a package that can implement the Li (1996) test of the equality
of two distributions? Is there any other test i can use rather than the
Li test?
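
Two concrete starting points, as a sketch (note that ks.test is a generic
two-sample test, not the Li (1996) test): the diptest package implements
Hartigan and Hartigan's dip statistic for unimodality, and base R's ks.test
compares two empirical distributions.

library(diptest)
x <- c(rnorm(100, 0), rnorm(100, 4))  # a clearly bimodal sample
dip(x)                                # dip statistic; larger values suggest multimodality
y <- rnorm(200, mean = 0.5)
ks.test(x, y)                         # Kolmogorov-Smirnov two-sample test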

Thank you in advance for your help.

Amin Mugera
Graduate Student
AgEcon Dept. Kansas State University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] passing graph image data from remote Rserve

2008-09-09 Thread Patil, Prasad
I believe I have found my solution, so please disregard. Thanks


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linear Modeling the best alternative

2008-09-09 Thread Ben Bolker
stephen sefick ssefick at gmail.com writes:

 
 I have a data set of mean velocity, discharge, and mean depth.  I need
 to find out which model best fits them out of log linear, linear, some
 other kind of model...  Using excel I have found that linear is not
 that bad and log10(discharge) vs. the other two variables (I am trying
 to predict velocity and depth from discharge) is not that bad either.
 How do I test and see which one of these models is better...  better
 R-squared...  I know this is a stats question and not particularly an
 R question, but I will use R for the models vetting process.
 any ideas would be greatly appreciated,


  AIC is not bad, but see
http://www.unc.edu/courses/2006spring/ecol/145/001/docs/lectures/lecture18.htm
for computing AIC to compare models where some have transformed
response variables ...
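
A minimal sketch of that comparison on simulated data (column names are
hypothetical; only the predictor is transformed here, so the response stays
on one scale and AIC is directly comparable):

set.seed(1)
flow <- data.frame(discharge = exp(rnorm(50, 3, 1)))
flow$velocity <- 0.2 * log10(flow$discharge) + rnorm(50, sd = 0.05)
fit.lin <- lm(velocity ~ discharge, data = flow)
fit.log <- lm(velocity ~ log10(discharge), data = flow)
AIC(fit.lin, fit.log)                                     # smaller is better
c(summary(fit.lin)$r.squared, summary(fit.log)$r.squared)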

  Ben Bolker

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] puzzle about contrasts

2008-09-09 Thread Peter Dalgaard
Prof Brian Ripley skrev:
 -0.5*(A+B) is not a contrast, which is the seat of your puzzlement.

 All you can get from y ~ x is an intercept (a column of ones) and a
 single 'contrast' column for 'x'.

 If you use y ~ 0+x you can get two columns for 'x', but R does not
 give you an option of what columns in the case: see the source of
 contrasts(). So you would need to replace contrasts(), which I think
 will be hard as model.matrix.default will look in the 'stats'
 namespace.  It would probably be easier to create the model matrix
 yourself.

Or accept the default and do the parameter transformations yourself.

l <- lm(y~x)
T <- rbind(
c(-1,-.5),
c(0,1))

c2 <- T%*%coef(l)
V2 <- T%*%vcov(l) %*% t(T)

cbind(coef=c(c2), s.e.=sqrt(diag(V2)))


 On Tue, 9 Sep 2008, Kenneth Knoblauch wrote:

 Hi,

 I'm trying to redefine the contrasts for a linear model.
 With a 2 level factor, x, with levels A and B, a two level
 factor outputs A and B - A from an lm fit, say
 lm(y ~ x). I would like to set the contrasts so that
 the coefficients output are -0.5 (A + B) and B - A,
 but I can't get the sign correct for the first coefficient
 (Intercept).

 Here is a toy example,

 set.seed(12161952)
 y - rnorm(10)
 x - factor(rep(letters[1:2], each = 5))
 ##  so  A and B =
 tapply(y, x, mean)

   a  b
 -0.719  0.8323837

 ## and with treatment contrasts
 coef(lm(y ~ x))  ## A and B - A

 (Intercept)  xb
 -0.719   1.5522724

 Then, I try to redefine the contrasts

 ### would like contrasts: -0.5 (A + B) and B - A
 D1 - matrix( c(-0.5, -0.5,
 -1, 1),
 2, 2, byrow = TRUE)
 C1 - solve(D1)
 Cnt - C1[, -1]
 contrasts(x) - Cnt
 coef(lm(y ~ x))

 (Intercept)  x1
 0.05624745  1.55227241

 but note that the desired value is
 -0.5 * sum(tapply(y, x, mean))

 [1] -0.05624745

 I note that the first column of C1 is -1's not +1's
 and that working by hand, if I tamper with the model matrix

 mm - model.matrix(y ~ x)
 mm[, 1] - -1

 mm
  (Intercept)   x1
 1   -1 -0.5
 2   -1 -0.5
 3   -1 -0.5
 4   -1 -0.5
 5   -1 -0.5
 6   -1  0.5
 7   -1  0.5
 8   -1  0.5
 9   -1  0.5
 10  -1  0.5
 attr(,assign)
 [1] 0 1
 attr(,contrasts)
 attr(,contrasts)$x
 [,1]
 a -0.5
 b  0.5

 solve(t(mm) %*% mm) %*% t(mm) %*% y  ##Yes, I know. Use QR
  [,1]
 (Intercept) -0.05624745
 x1   1.55227241

 gives the correct sign.

 So, I guess my question reduces to how one would set the
 contrasts for the model.matrix to be correct
 for this to work out correctly?

 Thank you.

 Ken


 -- 
 Ken Knoblauch
 Inserm U846
 Institut Cellule Souche et Cerveau
 Département Neurosciences Intégratives
 18 avenue du Doyen Lépine
 69500 Bron
 France
 tel: +33 (0)4 72 91 34 77
 fax: +33 (0)4 72 91 34 61
 portable: +33 (0)6 84 10 64 10
 http://www.sbri.fr

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
   


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with 'spectrum'

2008-09-09 Thread rkevinburton
For the command 'spectrum' I read:

The spectrum here is defined with scaling 1/frequency(x), following S-PLUS. 
This makes the spectral density a density over the range (-frequency(x)/2, 
+frequency(x)/2], whereas a more common scaling is 2π and range (-0.5, 0.5] 
(e.g., Bloomfield) or 1 and range (-π, π]. 


Forgive my ignorance but I am having a hard time interpreting this. Does this 
mean that in the spectrum output every element of the $spec array is scaled by 
1/frequency(x)? I am having a hard time determining what is meant by 
'frequency'. Say I define a time series for a year with samples for every day. I
input a 'frequency' of 365 (which in my mind is the period). On the output of 
'spectrum' would this mean that every element of the $spec array is scaled by 
1/365? There is a corresponding frequency array on the output from 'spectrum'. 
If the frequency is 365 and an element in the frequency array output from 
'spectrum' is .1 am I to assume that the period is 36.5 and a corresponding sin 
wave would be sin(2 * pi * 36.5/365)?
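
One way to check the convention empirically is to feed spectrum() a sine
wave with a known period and see where the peak lands (a sketch; plot =
FALSE just suppresses the default plot):

t <- 1:365
x <- ts(sin(2 * pi * t / 36.5), frequency = 365)  # period 36.5 days = 10 cycles per year
s <- spectrum(x, plot = FALSE)
s$freq[which.max(s$spec)]                         # peak lands near 10, i.e. cycles per year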

Thank you in advance for helping me clear up some confusion.

Kevin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] puzzle about contrasts

2008-09-09 Thread Peter Dalgaard
Peter Dalgaard skrev:
 Prof Brian Ripley skrev:
   
 -0.5*(A+B) is not a contrast, which is the seat of your puzzlement.

 All you can get from y ~ x is an intercept (a column of ones) and a
 single 'contrast' column for 'x'.

 If you use y ~ 0+x you can get two columns for 'x', but R does not
 give you an option of what columns in the case: see the source of
 contrasts(). So you would need to replace contrasts(), which I think
 will be hard as model.matrix.default will look in the 'stats'
 namespace.  It would probably be easier to create the model matrix
 yourself.

 
 Or accept the default and do the parameter transformations yourself.

 l - lm(y~x)
 T - rbind(
 c(-1,-.5),
 c(0,1))

 c2 - T%*%coef(l)
 V2 - T%*%vcov(l) %*% t(T)

 cbind(coef=c(c2), s.e.=sqrt(diag(V2)))
   

I forgot: Also have a look at estimable() from the gmodels packages.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] : writeMat

2008-09-09 Thread erola pairo
I write a .mat file using the writeMat() command, but when i try to load it
in Matlab it says that  file may be corrupt. I did it a month ago and it
worked. It exists any option that I can change for making the file readable
to Matlab?

 A <- c(1:10)
 dim(A) <- c(2,5)
 library(R.matlab)
 writeMat('A.mat', A=A)

And what matlab say is:
file may be corrupt

Regards

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] help on wavelet

2008-09-09 Thread giov

Hi,
I have little experience using wavelet and I would like to know if it is
possible,using R wavelet package, to have a plot of frequency versus time. 

thank you

giov
-- 
View this message in context: 
http://www.nabble.com/help-on-wavelet-tp19395583p19395583.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] binomial(link=inverse)

2008-09-09 Thread Ben Bolker

  this may be a better question for r-devel, but ...

  Is there a particular reason (and if so, what is it) that
the inverse link is not in the list of allowable link functions
for the binomial family?  I initially thought this might
have something to do with the properties of canonical
vs non-canonical link functions, but since other link functions
(probit, cloglog, cauchit, log) are allowed, I can't think
of any good reason.  In fact, it's sort of a mystery to me
why the sets of link functions for each family are restricted.
Is this from painful experience that some link functions just
don't work well?

  I can go ahead and hack my own version that allows inverse
link, but it would be nice to know if I'm doing something dumb.

  (The reason I want to do this is that the inverse link
linearizes the Michaelis-Menten function, y = a*x/(b+x) ...)

  cheers
Ben Bolker




signature.asc
Description: OpenPGP digital signature
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] printing all rows

2008-09-09 Thread Adam D. I. Kramer

> options("max.print")

$max.print
[1] 99999

> options(max.print=100000)
> options("max.print")

$max.print
[1] 1e+05

...so check what your max.print is, and figure out whether you need to
set it to nrow, ncol, or nrow*ncol of your data frame...then do so...though
of course, this is a global variable, so everything you print from then on
will just keep printing and printing.

Really, though, you might get more utility out of write.table and then using
a word processor to read the data in your table.
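
For example, a minimal sketch (assuming your data frame is called dat; the file name is arbitrary):

    # write the whole data frame to a tab-delimited text file
    write.table(dat, file = "dat.txt", sep = "\t", quote = FALSE, row.names = FALSE)

You can then browse every row in a text editor or spreadsheet without touching max.print.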

--Adam

On Tue, 9 Sep 2008, ANJAN PURKAYASTHA wrote:


Hi,
my data table has 38939 rows. R prints the first 11118 rows and then
prints a message: [ reached getOption("max.print") -- omitted 27821
rows ].
is it possible to set the max.print parameter so that R prints all the rows?

tia,
anjan

--
=
anjan purkayastha, phd
bioinformatics analyst
whitehead institute for biomedical research
nine cambridge center
cambridge, ma 02142

purkayas [at] wi [dot] mit [dot] edu
703.740.6939

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help on wavelet

2008-09-09 Thread stephen sefick
It depends on what you want to do.  In wavelet speak, frequency is scale.
These are the libraries:
wmtsa - wavCWT (make sure that you pick the wavelet; I suggest morlet
because it is well localized and dies away to zero quickly)
I would also suggest the fields package for the tim.colors function,
which produces the familiar red to blue color scheme.
sowas - more complex stuff here; take a look, very interesting if you are
trying to tell whether two signals are coherent.
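
A minimal sketch of the wmtsa route (hedged: argument names from memory, so check ?wavCWT; the signal is a made-up toy example):

    library(wmtsa)
    x <- sin(2 * pi * (1:512) / 32) + rnorm(512, sd = 0.2)  # toy signal
    w <- wavCWT(x, wavelet = "morlet")                      # continuous wavelet transform
    plot(w)                                                 # time vs. scale image; scale ~ 1/frequency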

hope this helps

stephen

On Tue, Sep 9, 2008 at 12:03 PM, giov [EMAIL PROTECTED] wrote:

 Hi,
 I have little experience using wavelet and I would like to know if it is
 possible,using R wavelet package, to have a plot of frequency versus time.

 thank you

 giov
 --
 View this message in context: 
 http://www.nabble.com/help-on-wavelet-tp19395583p19395583.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] creating table of averages

2008-09-09 Thread Lawrence Hanser
Dear Colleagues,

I have a dataframe with variables:

  [1] ID category   a11a12
a13a21
  [7] a22a23a31a32
b11b12
 [13] b13b21b31b32
b33b41
 [19] b42c11c12c21
c22c23
 [25] c31c32c33d11
d12d13
 [31] d14d21d22d23
d24d25
 [37] d31d32d33e11
e12e13
 [43] e21e22e23e31
e32e33
 [49] f11f12f13f14
f21f22
 [55] f23f24g11g12
g13g14
 [61] g21g22g23g24
g31g32
 [67] g33g41g42g43
h11h12
 [73] h13h21h22h23
C1.Employ  SC11.Ops
 [79] SC12.Unit  SC13.Nonadvers C2.Enterprise  SC21.Structure
SC22.Gov   SC23.Culture
 [85] SC24.Stratcomm C3.Manage  SC31.Resource  SC32.Change
SC33.Continue  C4.Stratthink
 [91] SC41.VisionSC42.Decision  SC43.Adapt C5.Lead
SC51.Develop   SC52.Care
 [97] SC53.Diversity C6.Foster  SC61.Teams SC62.Negotiate
C7.Embody  SC71.Ethical
[103] SC72.Follower  SC73.Warrior   SC74.Develop   C8.Comm
C81.Speak  C82.Listen
[109] OverallImp

The variable category has four values: Regular, CCM, CFM, and Other

I'd like to create a table like this to feed into barplot2:

row.name  C1.Employ C2.Enterprise  C3.Manage  C4.Stratthink  C5.Lead
C6.Foster  C7.Embody  C8.Comm
Regular 3.68  4.27 3.22
etc..
CCM 4.32  4.56  etc.
CFM  etc.
Other etc.

So far, I have been able to get this far:

 
mean(subset(impchiefs08, category=="Regular",
  select=c(C1.Employ, C2.Enterprise, C3.Manage, C4.Stratthink, C5.Lead, C6.Foster, C7.Embody, C8.Comm)))
C1.Employ C2.Enterprise C3.Manage C4.Stratthink   C5.Lead
C6.Foster C7.Embody   C8.Comm
 3.60  3.85  4.48  4.346667  4.608889
4.44  4.60  4.49


But I am stumped as to how to get what I want.

Thanks in advance.

Larry

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] randomForest

2008-09-09 Thread Kate Behrman
I am combining many different random forest objects run on the same data set
using the combine ( ) function. After combining the forests I am not sure
whether the variable importance, local importance, and rsq predictors are
recalculated for the new random forest object  or are calculated
individually for each tree ensemble? Is it  possible to  calculate these
predictors for the new random forest object after calling the combine
function? Any help would be greatly apprecaited. Thanks, Kate

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Compiling date

2008-09-09 Thread Dr Eberhard W Lisse

Is this Month-Day or Day-Month or a mixture of both?

I still think using the Format -> Cell -> Date will work
much better...

el


On 09 Sep 2008, at 11:21 , David Scott wrote:


On Mon, 8 Sep 2008, Megh Dal wrote:


Hi,

I have following kind of dataset (all are dates) in my Excel sheet.

09/08/08
09/05/08
09/04/08
09/02/08
09/01/08
29/08/2008
28/08/2008
27/08/2008
26/08/2008
25/08/2008
22/08/2008
21/08/2008
20/08/2008
18/08/2008
14/08/2008
13/08/2008
08/12/08
08/11/08
08/08/08
08/07/08

However I want to use R to compile those data to make all dates in  
same format. Can anyone please tell me any automated way for doing  
that?




Well you have to read them in as character first. Then use sub to  
make the two digit years into four digits. The following could  
probably be improved by a regular expression whiz, but works:



strngs <- c("06/05/08", "23/11/2008")
sub("([0-9][0-9]/[0-9][0-9]/)([0-9][0-9]$)", "\\120\\2", strngs)

[1] "06/05/2008" "23/11/2008"
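
If the end goal is Date objects rather than strings, something like this could follow (hedged: it assumes every value really is day/month/year, which, as noted elsewhere in the thread, is not certain for this data):

    strngs2 <- sub("([0-9][0-9]/[0-9][0-9]/)([0-9][0-9]$)", "\\120\\2", strngs)
    as.Date(strngs2, format = "%d/%m/%Y")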


David Scott


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating table of averages

2008-09-09 Thread Adam D. I. Kramer

Maybe something like this:

by(df[,c(77,81,86,90,94,98,101,106)],df$category,apply,2,mean)

...which would then need to be reformatted into a data frame (there is
probably an easy way to do this which I don't know).

aggregate seems like a more reasonable choice, but the function for
aggregate must return scalars, not rows...tapply doesn't take data.frame
inputs. Maybe someone else has a suggestion?
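
Two small follow-up sketches (assuming the data frame is df and the columns are the ones indexed above): aggregate() applies the function to each selected column separately, so a plain mean is enough, and the by() result can be bound into a matrix with one row per category:

    cols <- c(77, 81, 86, 90, 94, 98, 101, 106)
    aggregate(df[, cols], by = list(category = df$category), FUN = mean)  # data frame, one row per category
    do.call(rbind, by(df[, cols], df$category, colMeans))                 # matrix, rows = categories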

--Adam

On Tue, 9 Sep 2008, Lawrence Hanser wrote:


Dear Colleagues,

I have a dataframe with variables:

 [1] ID category   a11a12
a13a21
 [7] a22a23a31a32
b11b12
[13] b13b21b31b32
b33b41
[19] b42c11c12c21
c22c23
[25] c31c32c33d11
d12d13
[31] d14d21d22d23
d24d25
[37] d31d32d33e11
e12e13
[43] e21e22e23e31
e32e33
[49] f11f12f13f14
f21f22
[55] f23f24g11g12
g13g14
[61] g21g22g23g24
g31g32
[67] g33g41g42g43
h11h12
[73] h13h21h22h23
C1.Employ  SC11.Ops
[79] SC12.Unit  SC13.Nonadvers C2.Enterprise  SC21.Structure
SC22.Gov   SC23.Culture
[85] SC24.Stratcomm C3.Manage  SC31.Resource  SC32.Change
SC33.Continue  C4.Stratthink
[91] SC41.VisionSC42.Decision  SC43.Adapt C5.Lead
SC51.Develop   SC52.Care
[97] SC53.Diversity C6.Foster  SC61.Teams SC62.Negotiate
C7.Embody  SC71.Ethical
[103] SC72.Follower  SC73.Warrior   SC74.Develop   C8.Comm
C81.Speak  C82.Listen
[109] OverallImp

The variable category has four values: Regular, CCM, CFM, and Other

I'd like to create a table like this to feed into barplot2:

row.name  C1.Employ C2.Enterprise  C3.Manage  C4.Stratthink  C5.Lead
C6.Foster  C7.Embody  C8.Comm
Regular 3.68  4.27 3.22
etc..
CCM 4.32  4.56  etc.
CFM  etc.
Other etc.

So far, I have been able to get this far:


mean(subset(impchiefs08,category==Regular,select=c(C1.Employ,C2.Enterprise,C3.Manage,C4.Stratthink,C5.Lead,C6.Foster,C7.Embody,C8.Comm
)))
   C1.Employ C2.Enterprise C3.Manage C4.Stratthink   C5.Lead
C6.Foster C7.Embody   C8.Comm
3.60  3.85  4.48  4.346667  4.608889
4.44  4.60  4.49




But I am stumped as to how to get what I want.

Thanks in advance.

Larry

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating table of averages

2008-09-09 Thread Duncan Murdoch

On 9/9/2008 2:12 PM, Adam D. I. Kramer wrote:

Maybe something like this:

by(df[,c(77,81,86,90,94,98,101,106)],df$category,apply,2,mean)

...which would then need to be reformatted into a data frame (there is
probably an easy way to do this which I don't know).


sparseby() in the reshape package is more flexible than by(). If the 
function returns a vector with a consistent length, you'll get a 
dataframe with columns corresponding to its entries.


Duncan Murdoch



aggregate seems like a more reasonable choice, but the function for
aggregate must return scalars, not rows...tapply doesn't take data.frame
inputs. Maybe someone else has a suggestion?

--Adam

On Tue, 9 Sep 2008, Lawrence Hanser wrote:


Dear Colleagues,

I have a dataframe with variables:

 [1] ID category   a11a12
a13a21
 [7] a22a23a31a32
b11b12
[13] b13b21b31b32
b33b41
[19] b42c11c12c21
c22c23
[25] c31c32c33d11
d12d13
[31] d14d21d22d23
d24d25
[37] d31d32d33e11
e12e13
[43] e21e22e23e31
e32e33
[49] f11f12f13f14
f21f22
[55] f23f24g11g12
g13g14
[61] g21g22g23g24
g31g32
[67] g33g41g42g43
h11h12
[73] h13h21h22h23
C1.Employ  SC11.Ops
[79] SC12.Unit  SC13.Nonadvers C2.Enterprise  SC21.Structure
SC22.Gov   SC23.Culture
[85] SC24.Stratcomm C3.Manage  SC31.Resource  SC32.Change
SC33.Continue  C4.Stratthink
[91] SC41.VisionSC42.Decision  SC43.Adapt C5.Lead
SC51.Develop   SC52.Care
[97] SC53.Diversity C6.Foster  SC61.Teams SC62.Negotiate
C7.Embody  SC71.Ethical
[103] SC72.Follower  SC73.Warrior   SC74.Develop   C8.Comm
C81.Speak  C82.Listen
[109] OverallImp

The variable category has four values: Regular, CCM, CFM, and Other

I'd like to create a table like this to feed into barplot2:

row.name  C1.Employ C2.Enterprise  C3.Manage  C4.Stratthink  C5.Lead
C6.Foster  C7.Embody  C8.Comm
Regular 3.68  4.27 3.22
etc..
CCM 4.32  4.56  etc.
CFM  etc.
Other etc.

So far, I have been able to get this far:


mean(subset(impchiefs08,category==Regular,select=c(C1.Employ,C2.Enterprise,C3.Manage,C4.Stratthink,C5.Lead,C6.Foster,C7.Embody,C8.Comm
)))
   C1.Employ C2.Enterprise C3.Manage C4.Stratthink   C5.Lead
C6.Foster C7.Embody   C8.Comm
3.60  3.85  4.48  4.346667  4.608889
4.44  4.60  4.49




But I am stumped as to how to get what I want.

Thanks in advance.

Larry

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating table of averages

2008-09-09 Thread Lawrence Hanser
Perfect!

Thanks.


On Tue, Sep 9, 2008 at 11:27 AM, Duncan Murdoch [EMAIL PROTECTED]wrote:

 On 9/9/2008 2:12 PM, Adam D. I. Kramer wrote:

 Maybe something like this:

 by(df[,c(77,81,86,90,94,98,101,106)],df$category,apply,2,mean)

 ...which would then need to be reformatted into a data frame (there is
 probably an easy way to do this which I don't know).


 sparseby() in the reshape package is more flexible than by(). If the
 function returns a vector with a consistent length, you'll get a dataframe
 with columns corresponding to its entries.

 Duncan Murdoch



 aggregate seems like a more reasonable choice, but the function for
 aggregate must return scalars, not rows...tapply doesn't take data.frame
 inputs. Maybe someone else has a suggestion?

 --Adam

 On Tue, 9 Sep 2008, Lawrence Hanser wrote:

  Dear Colleagues,

 I have a dataframe with variables:

  [1] ID category   a11a12
 a13a21
  [7] a22a23a31a32
 b11b12
 [13] b13b21b31b32
 b33b41
 [19] b42c11c12c21
 c22c23
 [25] c31c32c33d11
 d12d13
 [31] d14d21d22d23
 d24d25
 [37] d31d32d33e11
 e12e13
 [43] e21e22e23e31
 e32e33
 [49] f11f12f13f14
 f21f22
 [55] f23f24g11g12
 g13g14
 [61] g21g22g23g24
 g31g32
 [67] g33g41g42g43
 h11h12
 [73] h13h21h22h23
 C1.Employ  SC11.Ops
 [79] SC12.Unit  SC13.Nonadvers C2.Enterprise  SC21.Structure
 SC22.Gov   SC23.Culture
 [85] SC24.Stratcomm C3.Manage  SC31.Resource  SC32.Change
 SC33.Continue  C4.Stratthink
 [91] SC41.VisionSC42.Decision  SC43.Adapt C5.Lead
 SC51.Develop   SC52.Care
 [97] SC53.Diversity C6.Foster  SC61.Teams SC62.Negotiate
 C7.Embody  SC71.Ethical
 [103] SC72.Follower  SC73.Warrior   SC74.Develop   C8.Comm
 C81.Speak  C82.Listen
 [109] OverallImp

 The variable category has four values: Regular, CCM, CFM, and Other

 I'd like to create a table like this to feed into barplot2:

 row.name  C1.Employ C2.Enterprise  C3.Manage  C4.Stratthink  C5.Lead
 C6.Foster  C7.Embody  C8.Comm
 Regular 3.68  4.27 3.22
 etc..
 CCM 4.32  4.56  etc.
 CFM  etc.
 Other etc.

 So far, I have been able to get this far:

 

 mean(subset(impchiefs08,category==Regular,select=c(C1.Employ,C2.Enterprise,C3.Manage,C4.Stratthink,C5.Lead,C6.Foster,C7.Embody,C8.Comm
 )))
   C1.Employ C2.Enterprise C3.Manage C4.Stratthink   C5.Lead
 C6.Foster C7.Embody   C8.Comm
3.60  3.85  4.48  4.346667  4.608889
 4.44  4.60  4.49



 But I am stumped as to how to get what I want.

 Thanks in advance.

 Larry

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Hardwarefor R cpu 64 vs 32, dual vs quad

2008-09-09 Thread Henrik Bengtsson
On Tue, Sep 9, 2008 at 6:31 AM, Nic Larson [EMAIL PROTECTED] wrote:
 Need to buy a fast computer for running R on. Today we use a 2.8 GHz Intel D CPU
  and the calculations take around 15 days. Is it possible to get the same
 calculations down to minutes/hours by only changing the hardware?
 Should I go for an really fast dual 32 bit cpu and run R over linux or xp or
 go for an quad core / 64 bit cpu?
 Is it effective to run R on 64 bit (and problem free
 (running/installing))???
 Have around 2000-3000 euro to spend

Faster machines won't do that much.  Without knowing what methods and
algorithms you are running, I bet you a beer that it can be made twice
as fast by just optimizing the code.  My claim applies recursively.
In other words, by optimizing the algorithms/code you can speed up
things quite a bit.  From experience, it is not unlikely to find
bottlenecks in generic algorithms that can be made 10-100 times
faster.  Here is *one* example illustrating that even when you think
the code is fully optimized you can still squeeze out more:

  http://wiki.r-project.org/rwiki/doku.php?id=tips:programming:code_optim2

So, start profiling your code to narrow down the parts that takes most
of the CPU time.  help(Rprof) is a start.  There is also a Section
'Profiling R code for speed' in 'Writing R Extensions'.  Good old
verbose print out of system.time() also helps.
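
A minimal profiling sketch (assuming the slow part is wrapped in a function; run_model() here is just a made-up placeholder for it):

    Rprof("run.prof")          # start writing timing samples to a file
    run_model()                # the long-running code
    Rprof(NULL)                # stop profiling
    summaryRprof("run.prof")   # summarise where the time went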

My $.02 ...or 2000-3000USD if it was bounty?! ;)

/Henrik

 Thanx for any tip

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cluster/snow question

2008-09-09 Thread tolga . i . uzuner
Hi Markus,

Many thanks. Is the cluster variable you mention below available in the 
environment of the nodes ? Specifically, within that environment, how 
could one identify the rank of that specific node ?

My code would use that information to partition the problem.

Thanks,
Tolga




Markus Schmidberger [EMAIL PROTECTED] 
09/09/2008 07:11
Please respond to
[EMAIL PROTECTED]


To
[EMAIL PROTECTED]
cc
r-help@r-project.org
Subject
Re: [R] cluster/snow question






Hi Tolga,

in SNOW you have to start a cluster with the command

  library(snow)
  cluster <- makeCluster(#nodes)

The object cluster is a list with an object for each node and each 
object again is a list with all informations (rank, comm, tags)
The size of the cluster is the length of the list.

  #nodes == length(cluster)

E.g. the rank for node one you can get by
  cluster[[1]]$rank

Best
Markus

[EMAIL PROTECTED] schrieb:
 Dear R Users,

 I am attempting to use the snow package for clustering. Is there a way 
to 
 identfy, in the environment of each node, a rank for that node and also, 

 the total size of the cluster ? 

 By way of analogy, I am looking for the functions in snow equivalent to 
 mpi.comm.rank() and mpi.comm.size() from RMPI, in case that makes things 

 clearer.

 Thanks in advance,
 Tolga

 Generally, this communication is for informational purposes only
 and it is not intended as an offer or solicitation for the purchase
 or sale of any financial instrument or as an official confirmation
 of any transaction. In the event you are receiving the offering
 materials attached below related to your interest in hedge funds or
 private equity, this communication may be intended as an offer or
 solicitation for the purchase or sale of such fund(s).  All market
 prices, data and other information are not warranted as to
 completeness or accuracy and are subject to change without notice.
 Any comments or statements made herein do not necessarily reflect
 those of JPMorgan Chase  Co., its subsidiaries and affiliates.

 This transmission may contain information that is privileged,
 confidential, legally privileged, and/or exempt from disclosure
 under applicable law. If you are not the intended recipient, you
 are hereby notified that any disclosure, copying, distribution, or
 use of the information contained herein (including any reliance
 thereon) is STRICTLY PROHIBITED. Although this transmission and any
 attachments are believed to be free of any virus or other defect
 that might affect any computer system into which it is received and
 opened, it is the responsibility of the recipient to ensure that it
 is virus free and no responsibility is accepted by JPMorgan Chase 
 Co., its subsidiaries and affiliates, as applicable, for any loss
 or damage arising in any way from its use. If you received this
 transmission in error, please immediately contact the sender and
 destroy the material in its entirety, whether in electronic or hard
 copy format. Thank you.
 Please refer to http://www.jpmorgan.com/pages/disclosures for
 disclosures relating to UK legal entities.
[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


-- 
Dipl.-Tech. Math. Markus Schmidberger

Ludwig-Maximilians-Universität München
IBE - Institut für medizinische Informationsverarbeitung,
Biometrie und Epidemiologie
Marchioninistr. 15, D-81377 Muenchen
URL: http://ibe.web.med.uni-muenchen.de 
Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de
Tel: +49 (089) 7095 - 4599





Generally, this communication is for informational purposes only
and it is not intended as an offer or solicitation for the purchase
or sale of any financial instrument or as an official confirmation
of any transaction. In the event you are receiving the offering
materials attached below related to your interest in hedge funds or
private equity, this communication may be intended as an offer or
solicitation for the purchase or sale of such fund(s).  All market
prices, data and other information are not warranted as to
completeness or accuracy and are subject to change without notice.
Any comments or statements made herein do not necessarily reflect
those of JPMorgan Chase  Co., its subsidiaries and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is 

[R] Information on the number of CPU's

2008-09-09 Thread tolga . i . uzuner
Dear R Users,
I am on Windows XP SP2 platform, using R version 2.7.2 . I was wondering 
if there is a way to find out, within R, the number of CPU's on my machine 
? I would use this information to set the number of nodes in a cluster, 
depending on the machine. Sys.info() and .Platform do not carry this 
information.
Thanks in advance,
Tolga Uzuner

Generally, this communication is for informational purposes only
and it is not intended as an offer or solicitation for the purchase
or sale of any financial instrument or as an official confirmation
of any transaction. In the event you are receiving the offering
materials attached below related to your interest in hedge funds or
private equity, this communication may be intended as an offer or
solicitation for the purchase or sale of such fund(s).  All market
prices, data and other information are not warranted as to
completeness or accuracy and are subject to change without notice.
Any comments or statements made herein do not necessarily reflect
those of JPMorgan Chase  Co., its subsidiaries and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase 
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.
Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to UK legal entities.
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Binning

2008-09-09 Thread Felipe Carrillo
Dear List:
I have a dataset with over 5000 records and I would like to put the Count in 
bins
 based on the ForkLength. e.g.
  Forklength   Count
 32-34?
 35-37?
 38-40?
 and so on...
 and lastly I would like to plot (scatterplot) including the SampleDate
 along the X axis and ForkLength along the Y axis. I recently saw an
  example similar to this one here, but I don't want a histogram; I just
want to see the ForkLength ranges with different colors. For example:
  ForkLength 32-34---green
  ForkLength 35-37---red
  ForkLength 38-40--Orange
  Thanks in advance
 
     SampleDate  ForkLength  Count
  1  12/4/2007   32   2
  2  12/6/2007   33   1
  3  12/7/2007   33   2
  4  12/7/2007   33   2
  5  12/7/2007   34   1
  6  12/9/2007   31   1
  7  12/9/2007   33   2
  8  12/10/2007  33   5
  9  12/10/2007  34   1
 10  12/11/2007  33   2
 11  12/15/2007  34   1
 12  12/16/2007  33   2
 13  12/17/2007  35   1
 14  12/19/2007  33   1
 15  12/19/2007  35   1
 16  12/20/2007  31   1
 17  12/20/2007  32   1
 18  12/20/2007  33   1
 19  12/20/2007  34   3
 20  12/21/2007  31   1
 21  12/21/2007  32   3
 22  12/21/2007  33   4
 23  12/21/2007  34  11
 24  12/21/2007  35  16
 25  12/21/2007  36   3
 26  12/21/2007  37   1
 27  12/22/2007  32   1
 28  12/22/2007  33   3
 29  12/22/2007  34   1
 30  12/22/2007  35   2
 31  12/23/2007  32   1
 32  12/23/2007  35   1
 33  12/25/2007  32   1
 34  12/25/2007  36   1
 35  12/26/2007  34   1
 36  12/26/2007  35   2
 37  12/26/2007  36   1
 38  12/27/2007  34   4
 39  12/27/2007  35   2
 40  12/27/2007  36   2
 41  12/28/2007  32   1
 42  12/28/2007  33   1
 43  12/28/2007  34   1
 44  12/28/2007  35   3
 45  12/28/2007  36   4
 46  12/28/2007  37   6
 47  12/28/2007  38   2
 48  12/28/2007  39   2
 49  12/29/2007  34   1
 50  12/29/2007  35   5
 51  12/29/2007  36   2
 52  12/29/2007  37   1
 53  12/30/2007  33   3
 54  12/30/2007  34  10
 55  12/30/2007  35  10
 56  12/30/2007  36   6
 57  12/30/2007  37  15
 58  12/30/2007  38   3
 59  12/31/2007  33   3
 60  12/31/2007  34   8
 61  12/31/2007  35   9
 62  12/31/2007  36   6
 63  12/31/2007  37   3
 64  12/31/2007  38   1
 65  1/1/2008    34   6
 66  1/1/2008    35   6
 67  1/1/2008    35   1
 68  1/1/2008    36   6
 69  1/1/2008    37   9
 70  1/1/2008    38   1
 71  1/2/2008    34   2
 72  1/2/2008    34   1
 73  1/2/2008    35   2
 74  1/2/2008    36   2
 75  1/2/2008    37   2
 76  1/2/2008    39   1
 77  1/3/2008    34   3
 78  1/3/2008    35   3
 79  1/3/2008    36   2
 80  1/3/2008    37   3
 81  1/8/2008    32   1
 82  1/8/2008    33   7
 83  1/8/2008    34   6
 84  1/8/2008    35  10
 85  1/8/2008    36  16
 86  1/8/2008    37   7
 87  1/8/2008    38   1
 88  1/8/2008    39   1
 89  1/9/2008    33   1
 90  1/9/2008    34  20
 91  1/9/2008    35  49
 92  1/9/2008    36  49
 93  1/9/2008    37  39
 94  1/9/2008    37   1
 95  1/9/2008    38  18
 96  1/9/2008    39   1
 97  1/9/2008    40   1
 98  1/10/2008   32   3
 99  1/10/2008   33  13
100  1/10/2008   34  56
101  1/10/2008   35  33
102  1/10/2008   36  24
103  1/10/2008   37  18
104  1/10/2008   39   1
105  1/11/2008   33   7
106  1/11/2008   34  46
107  1/11/2008   35  41
108  1/11/2008   36  28
109  1/11/2008   37  29
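
One way to do the binning and the colour-coded scatterplot (a hedged sketch, assuming the data above sit in a data frame called fish with columns SampleDate, ForkLength and Count; fork lengths outside 32-40 fall outside these bins and would need wider breaks):

    fish$SampleDate <- as.Date(fish$SampleDate, format = "%m/%d/%Y")
    fish$bin <- cut(fish$ForkLength, breaks = c(31.5, 34.5, 37.5, 40.5),
                    labels = c("32-34", "35-37", "38-40"))
    tapply(fish$Count, fish$bin, sum)                 # total Count per ForkLength bin
    cols <- c("32-34" = "green", "35-37" = "red", "38-40" = "orange")
    plot(fish$SampleDate, fish$ForkLength, pch = 16,
         col = cols[as.character(fish$bin)],
         xlab = "SampleDate", ylab = "ForkLength")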

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish  Wildlife Service
California, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cluster/snow question

2008-09-09 Thread Luke Tierney

On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:


Hi Markus,

Many thanks. Is the cluster variable you mention below available in the
environment of the nodes ? Specifically, within that environment, how
could one identify the rank of that specific node ?


No -- that isn't the way snow works.  With snow the partitioning is
done on the master. If you need a node to know how many other nodes
there are or which index it represents in a clusterApply call then you
need to pass that information in the arguments.
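
A small sketch of that pattern (hedged: the cluster size and the toy worker function are arbitrary):

    library(snow)
    cl <- makeCluster(4, type = "SOCK")
    size <- length(cl)
    # hand each task its own rank, and pass the cluster size along as an extra argument
    res <- clusterApply(cl, 1:size, function(rank, size)
        paste("worker", rank, "of", size), size)
    stopCluster(cl)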

luke



My code would use that information to partition the problem.

Thanks,
Tolga




Markus Schmidberger [EMAIL PROTECTED]
09/09/2008 07:11
Please respond to
[EMAIL PROTECTED]


To
[EMAIL PROTECTED]
cc
r-help@r-project.org
Subject
Re: [R] cluster/snow question






Hi Tolga,

in SNOW you have to start a cluster with the command

 library(snow)
  cluster <- makeCluster(#nodes)

The object cluster is a list with an object for each node and each
object again is a list with all informations (rank, comm, tags)
The size of the cluster is the length of the list.

 #nodes == length(cluster)

E.g. the rank for node one you can get by
 cluster[[1]]$rank

Best
Markus

[EMAIL PROTECTED] schrieb:

Dear R Users,

I am attempting to use the snow package for clustering. Is there a way

to

identfy, in the environment of each node, a rank for that node and also,



the total size of the cluster ?

By way of analogy, I am looking for the functions in snow equivalent to
mpi.comm.rank() and mpi.comm.size() from RMPI, in case that makes things



clearer.

Thanks in advance,
Tolga

Generally, this communication is for informational purposes only
and it is not intended as an offer or solicitation for the purchase
or sale of any financial instrument or as an official confirmation
of any transaction. In the event you are receiving the offering
materials attached below related to your interest in hedge funds or
private equity, this communication may be intended as an offer or
solicitation for the purchase or sale of such fund(s).  All market
prices, data and other information are not warranted as to
completeness or accuracy and are subject to change without notice.
Any comments or statements made herein do not necessarily reflect
those of JPMorgan Chase  Co., its subsidiaries and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase 
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.
Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to UK legal entities.
   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide

http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.




--
Dipl.-Tech. Math. Markus Schmidberger

 Ludwig-Maximilians-Universität München
 IBE - Institut für medizinische Informationsverarbeitung,
Biometrie und Epidemiologie
Marchioninistr. 15, D-81377 Muenchen
URL: http://ibe.web.med.uni-muenchen.de
Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de
Tel: +49 (089) 7095 - 4599





Generally, this communication is for informational purposes only
and it is not intended as an offer or solicitation for the purchase
or sale of any financial instrument or as an official confirmation
of any transaction. In the event you are receiving the offering
materials attached below related to your interest in hedge funds or
private equity, this communication may be intended as an offer or
solicitation for the purchase or sale of such fund(s).  All market
prices, data and other information are not warranted as to
completeness or accuracy and are subject to change without notice.
Any comments or statements made herein do not necessarily reflect
those of JPMorgan Chase  Co., its subsidiaries and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, 

[R] splitting time vector into days

2008-09-09 Thread Alexy Khrabrov
Greetings -- I have a dataframe a with one element a vector, time, of  
POSIXct values.  What's a good way to split the data frame into  
periods of a$time, e.g. days, and apply a function, e.g. mean, to some  
other column of the dataframe, e.g. a$value?


Cheers,
Alexy

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Modality Test

2008-09-09 Thread roger koenker

the diptest package, perhaps?
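
For example (a minimal sketch, assuming a numeric sample x):

    library(diptest)
    dip(x)   # Hartigans' dip statistic; larger values point away from unimodality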


url:   www.econ.uiuc.edu/~roger        Roger Koenker
email: [EMAIL PROTECTED]               Department of Economics
vox:   217-333-4558                    University of Illinois
fax:   217-244-6678                    Champaign, IL 61820



On Sep 9, 2008, at 11:23 AM, Amin W. Mugera wrote:



Dear Readers:

I have two issues in nonparametric statistical analysis that i need
help:

First, does R have a package that can implement the multimodality  
test,

e.g., the Silverman test, DIP test, MAP test or Runt test. I have seen
an earlier thread (sometime in 2003) where someone was trying to write
a code for the Silverman test of multimodality. Is there any other
tests that can enable me to know how many modes are in a distribution?

Second, i would like to test whether two distributions are equal.  
Does R

have a  package than can implement the Li (1996) test of the equality
of two distributions? Is there any other test i can use rather than  
the

Li test?

Thank you in advance for your help.

Amin Mugera
Graduate Student
AgEcon Dept. Kansas State University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cluster/snow question

2008-09-09 Thread tolga . i . uzuner
Understood, that's what I'll do. I'm thinking of exporting the number of 
nodes to all nodes and passing in the node rank as 1:nonodes through 
clusterApply.
Thanks all,
Tolga




Luke Tierney [EMAIL PROTECTED] 
09/09/2008 20:11

To
[EMAIL PROTECTED]
cc
[EMAIL PROTECTED], r-help@r-project.org
Subject
Re: [R] cluster/snow question






On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:

 Hi Markus,

 Many thanks. Is the cluster variable you mention below available in 
the
 environment of the nodes ? Specifically, within that environment, how
 could one identify the rank of that specific node ?

No -- that isn't the way snow works.  With snow the partitioning is
done on the master. If you need a node to know how many other nodes
there are or which index it represents in a clusterApply call then you
need to pass that information in the arguments.

luke


 My code would use that information to partition the problem.

 Thanks,
 Tolga




 Markus Schmidberger [EMAIL PROTECTED]
 09/09/2008 07:11
 Please respond to
 [EMAIL PROTECTED]


 To
 [EMAIL PROTECTED]
 cc
 r-help@r-project.org
 Subject
 Re: [R] cluster/snow question






 Hi Tolga,

 in SNOW you have to start a cluster with the command

  library(snow)
   cluster <- makeCluster(#nodes)

 The object cluster is a list with an object for each node and each
 object again is a list with all informations (rank, comm, tags)
 The size of the cluster is the length of the list.

  #nodes == length(cluster)

 E.g. the rank for node one you can get by
  cluster[[1]]$rank

 Best
 Markus

 [EMAIL PROTECTED] schrieb:
 Dear R Users,

 I am attempting to use the snow package for clustering. Is there a way
 to
 identfy, in the environment of each node, a rank for that node and 
also,

 the total size of the cluster ?

 By way of analogy, I am looking for the functions in snow equivalent to
 mpi.comm.rank() and mpi.comm.size() from RMPI, in case that makes 
things

 clearer.

 Thanks in advance,
 Tolga

 Generally, this communication is for informational purposes only
 and it is not intended as an offer or solicitation for the purchase
 or sale of any financial instrument or as an official confirmation
 of any transaction. In the event you are receiving the offering
 materials attached below related to your interest in hedge funds or
 private equity, this communication may be intended as an offer or
 solicitation for the purchase or sale of such fund(s).  All market
 prices, data and other information are not warranted as to
 completeness or accuracy and are subject to change without notice.
 Any comments or statements made herein do not necessarily reflect
 those of JPMorgan Chase  Co., its subsidiaries and affiliates.

 This transmission may contain information that is privileged,
 confidential, legally privileged, and/or exempt from disclosure
 under applicable law. If you are not the intended recipient, you
 are hereby notified that any disclosure, copying, distribution, or
 use of the information contained herein (including any reliance
 thereon) is STRICTLY PROHIBITED. Although this transmission and any
 attachments are believed to be free of any virus or other defect
 that might affect any computer system into which it is received and
 opened, it is the responsibility of the recipient to ensure that it
 is virus free and no responsibility is accepted by JPMorgan Chase 
 Co., its subsidiaries and affiliates, as applicable, for any loss
 or damage arising in any way from its use. If you received this
 transmission in error, please immediately contact the sender and
 destroy the material in its entirety, whether in electronic or hard
 copy format. Thank you.
 Please refer to http://www.jpmorgan.com/pages/disclosures for
 disclosures relating to UK legal entities.
[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Dipl.-Tech. Math. Markus Schmidberger

 Ludwig-Maximilians-Universität München
 IBE - Institut für medizinische Informationsverarbeitung,
 Biometrie und Epidemiologie
 Marchioninistr. 15, D-81377 Muenchen
 URL: http://ibe.web.med.uni-muenchen.de
 Mail: Markus.Schmidberger [at] ibe.med.uni-muenchen.de
 Tel: +49 (089) 7095 - 4599





 Generally, this communication is for informational purposes only
 and it is not intended as an offer or solicitation for the purchase
 or sale of any financial instrument or as an official confirmation
 of any transaction. In the event you are receiving the offering
 materials attached below related to your interest in hedge funds or
 private equity, this communication may be intended as an offer or
 solicitation for the purchase or sale of such fund(s).  All market
 prices, data and other information are not warranted as to
 completeness or accuracy and 

Re: [R] Information on the number of CPU's

2008-09-09 Thread Prof Brian Ripley

On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:


Dear R Users,
I am on Windows XP SP2 platform, using R version 2.7.2 . I was wondering
if there is a way to find out, within R, the number of CPU's on my machine
? I would use this information to set the number of nodes in a cluster,
depending on the machine. Sys.info() and .Platform do not carry this
information.


Correct, since

a) R does not make use of more than 1.

b) It is really not portable, and not even well-defined.  (How many CPUs 
has a hyperthreaded dual Xeon?  Some say 2, some say 4.  Do you want 
CPUs or cores?  If this is a virtualized OS, is it the physical number or the 
logical number?)


In the case of Windows, how depends on the Windows version.  The w32api 
(XP or later) call GetNativeSystemInfo will tell you the number of CPUs, 
for some (unstated) definition of 'CPU'.  Later versions have 
GetLogicalProcessorInformation, which can give the number of cores.
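
One pragmatic, Windows-only workaround (hedged: this reads an environment variable set by the OS, not anything R itself guarantees, and it reports logical processors):

    as.integer(Sys.getenv("NUMBER_OF_PROCESSORS"))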



Thanks in advance,
Tolga Uzuner


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Information on the number of CPU's

2008-09-09 Thread tolga . i . uzuner

Many thanks, that's very helpful.
Regards,
Tolga


- Original Message -
From: Prof Brian Ripley [EMAIL PROTECTED]
Sent: 09/09/2008 20:57 CET
To: Tolga Uzuner
Cc: r-help@r-project.org
Subject: Re: [R] Information on the number of CPU's



On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:


Dear R Users,
I am on Windows XP SP2 platform, using R version 2.7.2 . I was wondering
if there is a way to find out, within R, the number of CPU's on my machine
? I would use this information to set the number of nodes in a cluster,
depending on the machine. Sys.info() and .Platform do not carry this
information.


Correct, since

a) R does not make use of more than 1.

b) It is really not portable, and not even well-defined.  (How many CPUs 
has a hyperthreaded dual Xeon?  Some say 2, some say 4.  Do you want 
CPUs or cores?  If this is a virtualized OS, is the physical number or the 
logical number?)


In the case of Windows, how depends on the Windows version.  The w32api 
(XP or later) call GetNativeSystemInfo will tell you the number of CPUs, 
for some (unstated) definition of 'CPU'.  Later versions have 
GetLogicalProcessorInformation, which can give the number of cores.



Thanks in advance,
Tolga Uzuner


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
Generally, this communication is for informational purposes only
and it is not intended as an offer or solicitation for the purchase
or sale of any financial instrument or as an official confirmation
of any transaction. In the event you are receiving the offering
materials attached below related to your interest in hedge funds or
private equity, this communication may be intended as an offer or
solicitation for the purchase or sale of such fund(s).  All market
prices, data and other information are not warranted as to
completeness or accuracy and are subject to change without notice.
Any comments or statements made herein do not necessarily reflect
those of JPMorgan Chase  Co., its subsidiaries and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase 
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.
Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to UK legal entities.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Modality Test

2008-09-09 Thread Mark Difford

Hi Amin,

 First, does R have a package that can implement the multimodality test, 
 e.g., the Silverman test, DIP test, MAP test or Runt test.

Jeremy Tantrum (a Ph.D. student of Werner Steutzle's, c. 2003/04) did some
work on this. There is some useful code on Steutzle's website:

http://www.stat.washington.edu/wxs/Stat593-s03/Code/jeremy-unimodality.R

I used it last year when I was trying to solve the problem of how best to
compare lots of density curves (age distributions of 3 spp. of tree
euphorbias from about very different 35 sites). In particular I had to
ensure that I wasn't creating spurious bimodality at a particular age range
when combining sites.

You might find it useful. Feel free to contact me off list if the code has
gone, as I think I still have it (somewhere).

Regards, Mark.


Amin W. Mugera wrote:
 
 
 Dear Readers:
 
 I have two issues in nonparametric statistical analysis that i need
 help:
 
 First, does R have a package that can implement the multimodality test,
 e.g., the Silverman test, DIP test, MAP test or Runt test. I have seen
 an earlier thread (sometime in 2003) where someone was trying to write
 a code for the Silverman test of multimodality. Is there any other
 tests that can enable me to know how many modes are in a distribution?
 
 Second, i would like to test whether two distributions are equal. Does R
 have a  package than can implement the Li (1996) test of the equality
 of two distributions? Is there any other test i can use rather than the
 Li test?
 
 Thank you in advance for your help.
 
 Amin Mugera
 Graduate Student
 AgEcon Dept. Kansas State University
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Modality-Test-tp19396085p19400095.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Modality Test

2008-09-09 Thread Mark Difford

Whoops! I think that should be Stuetzle --- though I very much doubt that he
reads the list.


Mark Difford wrote:
 
 Hi Amin,
 
 First, does R have a package that can implement the multimodality test, 
 e.g., the Silverman test, DIP test, MAP test or Runt test.
 
 Jeremy Tantrum (a Ph.D. student of Werner Steutzle's, c. 2003/04) did some
 work on this. There is some useful code on Steutzle's website:
 
 http://www.stat.washington.edu/wxs/Stat593-s03/Code/jeremy-unimodality.R
 
 I used it last year when I was trying to solve the problem of how best to
 compare lots of density curves (age distributions of 3 spp. of tree
 euphorbias from about very different 35 sites). In particular I had to
 ensure that I wasn't creating spurious bimodality at a particular age
 range when combining sites.
 
 You might find it useful. Feel free to contact me off list if the code has
 gone, as I think I still have it (somewhere).
 
 Regards, Mark.
 
 
 Amin W. Mugera wrote:
 
 
 Dear Readers:
 
 I have two issues in nonparametric statistical analysis that i need
 help:
 
 First, does R have a package that can implement the multimodality test,
 e.g., the Silverman test, DIP test, MAP test or Runt test. I have seen
 an earlier thread (sometime in 2003) where someone was trying to write
 a code for the Silverman test of multimodality. Is there any other
 tests that can enable me to know how many modes are in a distribution?
 
 Second, i would like to test whether two distributions are equal. Does R
 have a  package than can implement the Li (1996) test of the equality
 of two distributions? Is there any other test i can use rather than the
 Li test?
 
 Thank you in advance for your help.
 
 Amin Mugera
 Graduate Student
 AgEcon Dept. Kansas State University
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Modality-Test-tp19396085p19400138.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Information on the number of CPU's

2008-09-09 Thread Luke Tierney

The wmic command line utility can also be used to query this; on a
dual-core Vista laptop I get

C:\Users\lukewmic cpu get NumberOfCores,NumberOfLogicalProcessors
NumberOfCores  NumberOfLogicalProcessors
2  2

luke

--

Luke Tierney
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:  [EMAIL PROTECTED]
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:


Many thanks, that's very helpful.
Regards,
Tolga


- Original Message -
From: Prof Brian Ripley [EMAIL PROTECTED]
Sent: 09/09/2008 20:57 CET
To: Tolga Uzuner
Cc: r-help@r-project.org
Subject: Re: [R] Information on the number of CPU's



On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:


Dear R Users,
I am on Windows XP SP2 platform, using R version 2.7.2 . I was wondering
if there is a way to find out, within R, the number of CPU's on my machine
? I would use this information to set the number of nodes in a cluster,
depending on the machine. Sys.info() and .Platform do not carry this
information.


Correct, since

a) R does not make use of more than 1.

b) It is really not portable, and not even well-defined.  (How many CPUs has 
a hyperthreaded dual Xeon?  Some say 2, some say 4.  Do you want CPUs or 
cores?  If this is a virtualized OS, is the physical number or the logical 
number?)


In the case of Windows, how depends on the Windows version.  The w32api (XP 
or later) call GetNativeSystemInfo will tell you the number of CPUs, for some 
(unstated) definition of 'CPU'.  Later versions have 
GetLogicalProcessorInformation, which can give the number of cores.



Thanks in advance,
Tolga Uzuner


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
Generally, this communication is for informational purposes only
and it is not intended as an offer or solicitation for the purchase
or sale of any financial instrument or as an official confirmation
of any transaction. In the event you are receiving the offering
materials attached below related to your interest in hedge funds or
private equity, this communication may be intended as an offer or
solicitation for the purchase or sale of such fund(s).  All market
prices, data and other information are not warranted as to
completeness or accuracy and are subject to change without notice.
Any comments or statements made herein do not necessarily reflect
those of JPMorgan Chase  Co., its subsidiaries and affiliates.

This transmission may contain information that is privileged,
confidential, legally privileged, and/or exempt from disclosure
under applicable law. If you are not the intended recipient, you
are hereby notified that any disclosure, copying, distribution, or
use of the information contained herein (including any reliance
thereon) is STRICTLY PROHIBITED. Although this transmission and any
attachments are believed to be free of any virus or other defect
that might affect any computer system into which it is received and
opened, it is the responsibility of the recipient to ensure that it
is virus free and no responsibility is accepted by JPMorgan Chase 
Co., its subsidiaries and affiliates, as applicable, for any loss
or damage arising in any way from its use. If you received this
transmission in error, please immediately contact the sender and
destroy the material in its entirety, whether in electronic or hard
copy format. Thank you.
Please refer to http://www.jpmorgan.com/pages/disclosures for
disclosures relating to UK legal entities.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Modality Test

2008-09-09 Thread Mark Difford

Hi Amin,

And I have just remembered that there is a function called curveRep in Frank
Harrell's Hmisc package that might be useful, even if not quite in the
channel of your enquiry. curveRep was added to the package after my
struggles, so I never used it and so don't know how well it performs (quite
well, I would think).

Regards, Mark.


Amin W. Mugera wrote:
 
 
 Dear Readers:
 
 I have two issues in nonparametric statistical analysis that i need
 help:
 
 First, does R have a package that can implement the multimodality test,
 e.g., the Silverman test, DIP test, MAP test or Runt test. I have seen
 an earlier thread (sometime in 2003) where someone was trying to write
 a code for the Silverman test of multimodality. Is there any other
 tests that can enable me to know how many modes are in a distribution?
 
 Second, i would like to test whether two distributions are equal. Does R
 have a  package than can implement the Li (1996) test of the equality
 of two distributions? Is there any other test i can use rather than the
 Li test?
 
 Thank you in advance for your help.
 
 Amin Mugera
 Graduate Student
 AgEcon Dept. Kansas State University
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Modality-Test-tp19396085p19400426.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] NMDS and varimax rotation

2008-09-09 Thread Bernd Panassiti
hello,

Following an NMDS analysis (performed with metaMDS or isoMDS), is it
possible to rotate the axes with a varimax rotation?

Thanks in advance.

Bernd Panassiti

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] csaps in R?

2008-09-09 Thread Dr Carbon
Is there a function in R equivalent to Matlab's csaps? I need a
spline function with the same calculation of the smoothing parameter
as csaps so that I can compare some results. AFAIK, spar in smooth.spline is
related but not the same.
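
For what it's worth, a minimal smooth.spline sketch for comparison (the data are made up, and spar is a different parameterization from csaps' p, so the results are not directly comparable):

x <- seq(0, 1, length = 101)
y <- sin(2 * pi * x) + rnorm(101, sd = 0.1)   # noisy test signal
fit <- smooth.spline(x, y, spar = 0.6)        # spar chosen arbitrarily here
plot(x, y); lines(fit, col = "red")           # fitted smoothing spline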

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] tsdiag error

2008-09-09 Thread rkevinburton
Does anyone know why I get the following error when trying tsdiag?

Error in UseMethod("tsdiag") : no applicable method for "tsdiag"

I am invoking it as: tsdiag(mar).

Thank you.

Kevin
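
(Note that tsdiag() is a generic with methods only for certain classes of fitted model, e.g. 'Arima' from arima(), 'arima0' and 'StructTS'; if 'mar' was produced by another fitting function, such as ar(), there is no applicable method. A minimal sketch of a case that does work, with lh a data set shipped with R:)

fit <- arima(lh, order = c(1, 0, 0))   # AR(1) fit to the built-in lh series
tsdiag(fit)                            # standard residual diagnostic plots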

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splitting time vector into days

2008-09-09 Thread stephen sefick
?aggregate
?window.zoo
?rollapply

anyway have a look at package zoo
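
For instance, a minimal sketch with zoo (assuming the data frame is a with columns time and value, as in the question):

library(zoo)
z <- zoo(a$value, a$time)                  # zoo series indexed by the POSIXct times
daily.mean <- aggregate(z, as.Date, mean)  # mean of 'value' for each day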

On Tue, Sep 9, 2008 at 3:25 PM, Alexy Khrabrov [EMAIL PROTECTED] wrote:
 Greetings -- I have a dataframe a with one element a vector, time, of
 POSIXct values.  What's a good way to split the data frame into periods of
 a$time, e.g. days, and apply a function, e.g. mean, to some other column of
 the dataframe, e.g. a$value?

 Cheers,
 Alexy

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] NMDS and varimax rotation

2008-09-09 Thread stephen sefick
Have you looked at the vegan vignette? I know there is a procrustes rotation.
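
If you do want to try varimax on the NMDS scores, here is a rough sketch (dune is an example data set shipped with vegan; whether a varimax rotation of NMDS axes is statistically sensible is a separate question):

library(vegan)
data(dune)
fit <- metaMDS(dune, k = 2)            # two-dimensional NMDS
sc <- scores(fit, display = "sites")   # site scores
vr <- varimax(sc)                      # varimax() from the stats package
rotated <- sc %*% vr$rotmat            # rotated site scores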

On Tue, Sep 9, 2008 at 3:54 PM, Bernd Panassiti
[EMAIL PROTECTED] wrote:
 hello,

 subsequently to a NMDS analysis (performed with metaMDS or isoMDS) is
 it possible to
 rotate the axis through a varimax-rotation?

 Thanks in advance.

 Bernd Panassiti

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Stephen Sefick
Research Scientist
Southeastern Natural Sciences Academy

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods. We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] building a package that contains S4 classes and methods

2008-09-09 Thread Marie Pierre Sylvestre
Hello R users,

I am trying to make my first package and I get an error that I cannot
understand. The package is built out of three files (one for functions, one
for S4 classes and one for S4 methods).

Once I source them I run

package.skeleton( name="TDC" )

within an R session and I get

Creating directories ...
Creating DESCRIPTION ...
Creating Read-and-delete-me ...
Saving functions and data ...
Making help files ...
Done.
Further steps are described in './TDC/Read-and-delete-me'.
Warning messages:
1: In dump(internalObjs, file = file.path(code_dir,
sprintf("%s-internal.R",  :
  deparse of an S4 object will not be source()able
2: In dump(internalObjs, file = file.path(code_dir,
sprintf("%s-internal.R",  :
  deparse of an S4 object will not be source()able
3: In dump(internalObjs, file = file.path(code_dir,
sprintf("%s-internal.R",  :
  deparse of an S4 object will not be source()able
4: In dump(internalObjs, file = file.path(code_dir,
sprintf("%s-internal.R",  :
  deparse may be incomplete


I keep going in spite of the warnings with 
R CMD check --no-examples TDC

and I get 
* checking for working pdflatex ... OK
* using log directory
'/home/mariepierre/Packages/PermAlgo/PermAlgo/PermAlgo2/TDC.Rcheck'
* using R version 2.7.1 (2008-06-23)
* using session charset: UTF-8
* checking for file 'TDC/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'TDC' version '1.0'
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking whether package 'TDC' can be installed ... ERROR
Installation failed.

The error file says:
 
* Installing *source* package 'TDC' ...
** R
** preparing package for lazy loading
Error in parse(n = -1, file = file) : unexpected '<' at
102: `.__C__BindArgs` <-
103: 
Calls: <Anonymous> -> code2LazyLoadDB -> sys.source -> parse
Execution halted
ERROR: lazy loading failed for package 'TDC'
** Removing
'/home/mariepierre/Packages/PermAlgo/PermAlgo/PermAlgo2/TDC.Rcheck/TDC'

The problem is with my classes and methods. The respective files contain:

setClass("BindArgs",  signature( "function" ))
setClass("BindArgs2", signature( "function" ))

and

setMethod("initialize", "BindArgs", function( .Object, f, ... )
  callNextMethod( .Object, function( x ) f( x, ... ) ))

setMethod("initialize", "BindArgs2", function( .Object, f, ...)
  callNextMethod( .Object, function( x, y ) f( x, y, ... ) ))

Everything works well within an R session, but I cannot build the package.

If I look at the internal R file that this created I get

`.__C__BindArgs` <-
<S4 object of class structure("classRepresentation", package = "methods")>
`.__C__BindArgs2` <-
<S4 object of class structure("classRepresentation", package = "methods")>
`.__M__initialize:methods` <-
<S4 object of class structure("MethodsList", package = "methods")>
`.__T__initialize:methods` <-
<environment>

Well, let's just say that I am new to classes, so this confuses me greatly. I
have checked the documentation and tried a few things, but I have reached my
personal limits!

I am using R 2.7.1 on Linux Fedora 8.

Any comments on what is happening and/or help would be greatly appreciated.

MP

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] splitting time vector into days

2008-09-09 Thread jim holtman
Here is one way of doing it:

> x <- data.frame(dates=seq(as.POSIXct('2008-09-08'), by='7 hours', length=10),
+ values=1:10)
> # split into days
> x.s <- split(x, format(x$dates, "%Y%m%d"))
> x.s
$`20080908`
dates values
1 2008-09-08 00:00:00  1
2 2008-09-08 07:00:00  2
3 2008-09-08 14:00:00  3
4 2008-09-08 21:00:00  4

$`20080909`
dates values
5 2008-09-09 04:00:00  5
6 2008-09-09 11:00:00  6
7 2008-09-09 18:00:00  7

$`20080910`
 dates values
8  2008-09-10 01:00:00  8
9  2008-09-10 08:00:00  9
10 2008-09-10 15:00:00 10

> lapply(x.s, function(.df) mean(.df$values))
$`20080908`
[1] 2.5

$`20080909`
[1] 6

$`20080910`
[1] 9




On Tue, Sep 9, 2008 at 3:25 PM, Alexy Khrabrov [EMAIL PROTECTED] wrote:
 Greetings -- I have a dataframe a with one element a vector, time, of
 POSIXct values.  What's a good way to split the data frame into periods of
 a$time, e.g. days, and apply a function, e.g. mean, to some other column of
 the dataframe, e.g. a$value?

 Cheers,
 Alexy

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with 'spectrum'

2008-09-09 Thread Prof Brian Ripley
This is why some help pages have references: please use them (Venables &
Ripley explain the exact formulae used in R).


On Tue, 9 Sep 2008, [EMAIL PROTECTED] wrote:


For the command 'spectrum' I read:

The spectrum here is defined with scaling 1/frequency(x), following 
S-PLUS. This makes the spectral density a density over the range 
(-frequency(x)/2, +frequency(x)/2], whereas a more common scaling is 2π 
and range (-0.5, 0.5] (e.g., Bloomfield) or 1 and range (-π, π].



Forgive my ignorance but I am having a hard time interpreting this. Does 
this mean that in the spectrum output every element of the $spec array 
is scaled by 1/frequency(x)? I am having a hard time determining what is 
meant by 'frequency'.


So please do look up the help for frequency().

 Say I define a time series for a year with samples 
for every day. I input a 'frequency' of 365 (which in my mind is the 
period).


The point is that your time unit is 1 year, and your measurements are 
every 1/365 year.  That is unrelated to the 'period' (no one mentioned 
periodicity yet).
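
In code terms, a minimal illustration (made-up data):

x <- ts(rnorm(365), frequency = 365)   # one year of daily observations, time unit = 1 year
frequency(x)                           # 365: observations per unit of time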


On the output of 'spectrum' would this mean that every element 
of the $spec array is scaled by 1/365? There is a corresponding 
frequency array on the output from 'spectrum'. If the frequency is 365 
and an element in the frequency array output from 'spectrum' is .1 am I 
to assume that the period is 36.5 and a corresponding sin wave would be 
sin(2 * pi * 36.5/365)?


Hmm, you need a 't' in there (and a phase).  The issue is the units for t. 
A frequency in the 'freq' element of the output of 0.1 corresponds to 10 
cycles per unit of time, and in your example the unit of time is 365 
observations.  So the sine (sic) wave is sin(2*pi*0.1*t + phi), where the 
increments in 't' are 1/365: that gives 10 complete cycles in observations 
at, say, c(1990, 1) ... c(1990, 365), the days of 1990 (not a leap year).



Thank you in advance for helping me clear up some confusion.

Kevin


--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Binning

2008-09-09 Thread jim holtman
This should do what you want.

#--x <- read.table('clipboard', header=TRUE, as.is=TRUE)
# convert dates
x$date <- as.POSIXct(strptime(x$SampleDate, "%m/%d/%Y"))
# put ForkLength into bins
x$bins <- cut(x$ForkLength, breaks=c(32, 34, 37, 40), include.lowest=TRUE)
# count the bins
tapply(x$Count, x$bins, sum)
# plot the data
plot(x$date, x$ForkLength, col=c('green', 'red', 'orange')[x$bins])
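
A legend for the three bins could be added along these lines (colours matched to the order used above):

legend("topleft", legend = levels(x$bins),
       col = c('green', 'red', 'orange'), pch = 1)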






On Tue, Sep 9, 2008 at 3:12 PM, Felipe Carrillo
[EMAIL PROTECTED] wrote:
 Dear List:
 I have a dataset with over 5000 records and I would like to put the Count in 
 bins
  based on the ForkLength. e.g.
  Forklength   Count
 32-34?
 35-37?
 38-40?
 and so on...
 and lastly I would like to plot (scatterplot) the data with SampleDate
 along the X axis and ForkLength along the Y axis. I recently saw a
 similar example here, but I don't want a histogram; I just want to see
 the ForkLength ranges in different colors. For example:
  ForkLength 32-34---green
  ForkLength 35-37---red
  ForkLength 38-40---orange
  Thanks in advance

       SampleDate  ForkLength  Count
 1     12/4/2007       32        2
 2     12/6/2007       33        1
 3     12/7/2007       33        2
 4     12/7/2007       33        2
 5     12/7/2007       34        1
 6     12/9/2007       31        1
 7     12/9/2007       33        2
 8     12/10/2007      33        5
 9     12/10/2007      34        1
 10    12/11/2007      33        2
 11    12/15/2007      34        1
 12    12/16/2007      33        2
 13    12/17/2007      35        1
 14    12/19/2007      33        1
 15    12/19/2007      35        1
 16    12/20/2007      31        1
 17    12/20/2007      32        1
 18    12/20/2007      33        1
 19    12/20/2007      34        3
 20    12/21/2007      31        1
 21    12/21/2007      32        3
 22    12/21/2007      33        4
 23    12/21/2007      34       11
 24    12/21/2007      35       16
 25    12/21/2007      36        3
 26    12/21/2007      37        1
 27    12/22/2007      32        1
 28    12/22/2007      33        3
 29    12/22/2007      34        1
 30    12/22/2007      35        2
 31    12/23/2007      32        1
 32    12/23/2007      35        1
 33    12/25/2007      32        1
 34    12/25/2007      36        1
 35    12/26/2007      34        1
 36    12/26/2007      35        2
 37    12/26/2007      36        1
 38    12/27/2007      34        4
 39    12/27/2007      35        2
 40    12/27/2007      36        2
 41    12/28/2007      32        1
 42    12/28/2007      33        1
 43    12/28/2007      34        1
 44    12/28/2007      35        3
 45    12/28/2007      36        4
 46    12/28/2007      37        6
 47    12/28/2007      38        2
 48    12/28/2007      39        2
 49    12/29/2007      34        1
 50    12/29/2007      35        5
 51    12/29/2007      36        2
 52    12/29/2007      37        1
 53    12/30/2007      33        3
 54    12/30/2007      34       10
 55    12/30/2007      35       10
 56    12/30/2007      36        6
 57    12/30/2007      37       15
 58    12/30/2007      38        3
 59    12/31/2007      33        3
 60    12/31/2007      34        8
 61    12/31/2007      35        9
 62    12/31/2007      36        6
 63    12/31/2007      37        3
 64    12/31/2007      38        1
 65    1/1/2008        34        6
 66    1/1/2008        35        6
 67    1/1/2008        35        1
 68    1/1/2008        36        6
 69    1/1/2008        37        9
 70    1/1/2008        38        1
 71    1/2/2008        34        2
 72    1/2/2008        34        1
 73    1/2/2008        35        2
 74    1/2/2008        36        2
 75    1/2/2008        37        2
 76    1/2/2008        39        1
 77    1/3/2008        34        3
 78    1/3/2008        35        3
 79    1/3/2008        36        2
 80    1/3/2008        37        3
 81    1/8/2008        32        1
 82    1/8/2008        33        7
 83    1/8/2008        34        6
 84    1/8/2008        35       10
 85    1/8/2008        36       16
 86    1/8/2008        37        7
 87    1/8/2008        38        1
 88    1/8/2008        39        1
 89    1/9/2008        33        1
 90    1/9/2008        34       20
 91    1/9/2008        35       49
 92    1/9/2008        36       49
 93    1/9/2008        37       39
 94    1/9/2008        37        1
 95    1/9/2008        38       18
 96    1/9/2008        39        1
 97    1/9/2008        40        1
 98    1/10/2008       32        3
 99    1/10/2008       33       13
 100   1/10/2008       34       56
 101   1/10/2008       35       33
 102   1/10/2008       36       24
 103   1/10/2008       37       18
 104   1/10/2008       39        1
 105   1/11/2008       33        7
 106   1/11/2008       34       46
 107   1/11/2008       35       41
 108   1/11/2008       36       28
 109   1/11/2008       37       29

 Felipe D. Carrillo
 Supervisory Fishery Biologist
 Department of the Interior
 US Fish & Wildlife Service
 California, USA

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] naive variance in GEE

2008-09-09 Thread Thomas Lumley

On Mon, 8 Sep 2008, Qiong Yang wrote:


Hi,

The standard error from logistic regression is slightly different from the 
naive SE from GEE under independence working correlation structure.


Yes


Shouldn't they be identical? Anyone has insight about this?


No, they shouldn't. They are different estimators of the same quantity, 
like the mean and median of a symmetric distribution.


-thomas





Thanks,
Qiong

a <- rbinom(1000,1)
b <- rbinom(1000,2,0.1)
c <- rbinom(1000,10,0.5)
summary(gee(a~b, id=c, family=binomial, corstr="independence"))$coef
summary(glm(a~b, family=binomial))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] naive variance in GEE

2008-09-09 Thread Thomas Lumley


Sorry, I misread your message. Prof Ripley is right, as usual -- the 
estimates use different stopping criteria and so are just numerically 
different.


-thomas

On Tue, 9 Sep 2008, Thomas Lumley wrote:


On Mon, 8 Sep 2008, Qiong Yang wrote:


Hi,

The standard error from logistic regression is slightly different from the 
naive SE from GEE under independence working correlation structure.


Yes


Shouldn't they be identical? Anyone has insight about this?


No, they shouldn't. They are different estimators of the same quantity, like 
the mean and median of a symmetric distribution.


-thomas





Thanks,
Qiong

a <- rbinom(1000,1)
b <- rbinom(1000,2,0.1)
c <- rbinom(1000,10,0.5)
summary(gee(a~b, id=c, family=binomial, corstr="independence"))$coef
summary(glm(a~b, family=binomial))

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] survey package

2008-09-09 Thread Thomas Lumley


Version 3.9 of the survey package is now on CRAN.  Since the last 
announcement (version 3.6-11, about a year ago) the main changes are
 - Database-backed survey objects: the data can live in a SQLite (or other 
DBI-compatible) database and be loaded as needed.

 - Ordinal logistic regression
 - Support for the 'mitools' package and multiply-imputed data
 - Conditioning plots, transparent scatterplots, survival and CDF plots.
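
For example, a database-backed design can be set up along these lines (the table, file, and variable names here are invented for illustration; see the package documentation for details):

library(survey)
dclus1 <- svydesign(id = ~dnum, weights = ~pw, fpc = ~fpc,
                    data = "mytable",                 # a table in the database
                    dbtype = "SQLite", dbname = "survey.db")
svymean(~api00, dclus1)   # variables are loaded from the database as needed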

There is more information on the package web page at
http://faculty.washington.edu/tlumley/survey/

-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

___
R-packages mailing list
[EMAIL PROTECTED]
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

