Re: [R] Change the text size of the title in a legend of a R plot.

2011-04-29 Thread Remko Duursma
Hi Victor,

looking at the code for legend, it looks like the same 'cex' value is
used for the text in the legend as the title.

Here is a trick though: draw the legend twice, with different cex
values, and omitting title or text:


plot(1)
legend("topright", c("a","b"), title=" ")
legend("topright", c(" "," "), title="Legend", cex=0.6, bty='n', title.adj=0.15)


A bit of a hack, but it works.
If you want the title larger, it will probably not fit the box, which
you can omit by setting bty='n' (as in the second line).
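
A variant of the same idea, sketched here under some assumptions: instead of
calling legend() a second time, use the geometry that legend() returns
invisibly and write the title with text() at its own cex. The vertical
position is only approximate (midway between the top of the box and the first
entry), and the labels and cex value are placeholders.

```r
# Null PDF device so the sketch runs without opening a window
pdf(NULL)
plot(1:10, type = "n")

# Pass 1: draw box and entries; the blank title reserves a title row.
# legend() invisibly returns the box geometry (rect) and text positions (text).
lg <- legend("topright", legend = c("a", "b"), lty = 1:2, title = " ")

# Pass 2: write the title ourselves at its own size, centred in the box
# and roughly midway between the box top and the first entry.
title_x <- lg$rect$left + lg$rect$w / 2
title_y <- (lg$rect$top + lg$text$y[1]) / 2
text(title_x, title_y, labels = "Legend", cex = 0.6)

invisible(dev.off())
```

Because the box is sized from the entries, a title much larger than the
entries may still overflow it.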

good luck,

Remko



-
Remko Duursma
Research Lecturer

Centre for Plants and the Environment
University of Western Sydney
Hawkesbury Campus
Richmond NSW 2753

Mobile: +61 (0)422 096908
www.remkoduursma.com



On Fri, Apr 29, 2011 at 3:15 PM, Steven McKinney smckin...@bccrc.ca wrote:



 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf Of Victor Gabillon
 Sent: April-28-11 8:22 PM
 To: r-help@r-project.org
 Subject: [R] Change the text size of the title in a legend of a R plot.

 Hello,

 Is it possible to change the text size of the title in a legend of a R plot?

 I tried to directly change the title.cex argument but it seems not to work.

 Trying :

 Horizo <- c(1,2,6,10,20)
 legtext <- paste(Horizo, sep="")
 legend("topleft", legend=legtext, col=col, text.col=col, lwd=lwd,
 lty=lty, cex=1.1, ncol=3, title="Horizons", title.col="black", title.cex=1.4)

 I haven't found any cex argument that works for just the legend title, but you
 can get some modification of the title with the expression argument:

 legend("topleft", legend=legtext, col=col, text.col=col, lwd=lwd,
 lty=lty, cex=1.1, ncol=3,
 title = expression(bold(Horizons)), title.col="black")

 Does that help?

 Otherwise, you can of course figure out which functions do the legend 
 plotting,
 copy and modify those to get a title cex in place.

 Steve McKinney




 gives the following error (translated from French):
 Error in legend("topleft", legend = legtext, col = col, text.col =
 col,  :
    unused argument(s) (title.cex = 1.4)

 saying that the title.cex argument has been ignored.

 Thank you for helping.

 Victor

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





[R] Still confused about classes

2011-04-29 Thread Russ Abbott
Hi,

I'm still confused about how to find out what methods are defined for a
given class.  For example, I know that

 today <- Sys.Date()

will produce an object of type Date. But I'm not sure what I can do with
Date objects or how I can find out.

 ?Date


refers me to the Date documentation page. But it doesn't tell me how, for
example, to extract the current year from a date object.

I tried

 year(today)
Error: could not find function "year"


Is there some other function that does the job? I want a function f such
that f(today) will return 2011. Perhaps there is no such function.
But in general I don't have any confidence that I would know how to find it
if it existed or that I would know how to assure myself that there was no
such function.

Thanks.

-- Russ



Re: [R] Still confused about classes

2011-04-29 Thread Joshua Wiley
Hi Russ,

One tool that might help could be ?methods and ?showMethods

For example:
## for S3
methods(class = "Date")
## for S4
showMethods(classes = "Date")

Regarding getting the actual year, I would use (though there may be
better ways):

format.Date(as.Date("2010-01-01"), format = "%Y")

HTH,

Josh





-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



Re: [R] Still confused about classes

2011-04-29 Thread Prof Brian Ripley

See ?months.

methods(class = "Date")

would have got you there.

(Date is not an S4 class, so people should not be setting S4 methods 
on it without very good reason, and there are none in R itself.)


On Thu, 28 Apr 2011, Joshua Wiley wrote:


Hi Russ,

One tool that might help could be ?methods and ?showMethods

For example:
## for S3
methods(class = "Date")
## for S4
showMethods(classes = "Date")

regarding getting the actual year, I would use (though there may be
better ways):

format.Date(as.Date("2010-01-01"), format = "%Y")


Call format(), not its methods, or use strftime() directly.
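
A minimal sketch of that advice, using a fixed date so the result is
reproducible (the variable names are placeholders): the generic format()
dispatches to the Date method by itself, and as.POSIXlt() gives the year as a
number counted from 1900.

```r
today <- as.Date("2011-04-29")   # fixed date instead of Sys.Date()

# Call the generic; R picks the Date method
year_chr <- format(today, "%Y")          # character
year_int <- as.integer(year_chr)         # integer

# Alternative: POSIXlt stores years since 1900
year_lt <- as.POSIXlt(today)$year + 1900
```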



HTH,

Josh




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


[R] bootstrapping problem

2011-04-29 Thread mimato
I want to classify bipolar neurons in human cochleas and have data of the
following structure:

Vol_Nuc Vol_Soma
1   186.23   731.96
2   204.58  4370.96
3   539.98  7344.86
4   477.71  6939.28
5   421.22  5588.53
6   276.61  1017.05
7   392.28  6392.32
8   424.43  6190.13
9   256.41  3850.51
10  249.17  3118.14
11  276.97  3037.29
12  295.30  3703.76
13  314.43  5265.97
14  301.15  5781.73

I have already worked with Matlab (I'm not a programmer) and created nice
colour-coded dendrograms and also made some verifications of them. I have now
started working with R and bootstrapped data with a library named pvclust. It
worked and R computed ...

here is the code:

library(pvclust)

data =
data.frame(Vol_Nuk=c(186.23,204.58,539.98,477.71,421.22,276.61,392.28,424.43,256.41,249.17,276.97,295.3,314.43,301.15),
Vol_Soma=c(731.96,4370.96,7344.86,6939.28,5588.53,1017.05,6392.32,6190.13,3850.51,3118.14,3037.29,3703.76,5265.97,5781.73))

plot(data)
result <- pvclust(data, nboot=100)
plot(result)

It is also not working using following commands:

cluster.bootstrap <- pvclust(Raw, nboot=1000, method.dist="abscor")
plot(cluster.bootstrap)
pvrect(cluster.bootstrap)

I always get the following problem:

Error in plot.hclust(x$hclust, main = main, sub = sub, xlab = xlab, col =
col,  :
invalid input for dendrogram

Does anyone have an idea what's wrong?
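
One thing that may be worth checking, offered only as a guess: as far as I
understand the pvclust documentation, pvclust() clusters the columns of its
input, so with this data frame it would be clustering the two variables rather
than the 14 neurons. The base-R sketch below (no pvclust call, Euclidean
distance chosen only for illustration) shows the difference in the number of
objects being clustered.

```r
data <- data.frame(
  Vol_Nuc  = c(186.23, 204.58, 539.98, 477.71, 421.22, 276.61, 392.28,
               424.43, 256.41, 249.17, 276.97, 295.30, 314.43, 301.15),
  Vol_Soma = c(731.96, 4370.96, 7344.86, 6939.28, 5588.53, 1017.05, 6392.32,
               6190.13, 3850.51, 3118.14, 3037.29, 3703.76, 5265.97, 5781.73))

z <- scale(data)                 # put both volumes on a comparable scale

hc_cols <- hclust(dist(t(z)))    # objects = columns (the two variables)
hc_rows <- hclust(dist(z))       # objects = rows (the neurons)

n_cols <- length(hc_cols$order)  # number of leaves when clustering columns
n_rows <- length(hc_rows$order)  # number of leaves when clustering rows
```

If that is indeed the cause, passing the transposed data to pvclust() with a
distance that makes sense for two measurements per neuron might be the thing
to try.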

Thanks a lot!! 



Re: [R] Change the text size of the title in a legend of a R plot.

2011-04-29 Thread Jannis

On 04/29/2011 05:21 AM, Victor Gabillon wrote:

Horizo <- c(1,2,6,10,20)
legtext <- paste(Horizo, sep="")
legend("topleft", legend=legtext, col=col, text.col=col, lwd=lwd,
lty=lty, cex=1.1, ncol=3, title="Horizons", title.col="black",
title.cex=1.4)


I am not sure, but the manual regarding legend seems to be incorrect
(or at least misleading). There is no title.cex argument for legend
(even though the help page mentions it). Either you set cex > 1, but this
will resize the labels as well, or you modify the code of legend as follows:


change the following (near the end of the code):

   text2(left + w/2, top - ymax, labels = title, adj = c(0.5,
0), cex = cex, col = title.col)

to:

   text2(left + w/2, top - ymax, labels = title, adj = c(0.5,
0), cex = title.cex, col = title.col)

and add title.cex to the arguments of legend. It's probably easiest if 
you copy the code of legend and save its modified version within a 
different function.


Not sure on whom to contact regarding correcting the documentation of 
legend(). Perhaps even I am wrong, but I could not find any reference to 
title.cex in the code.
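
A quick way to check this for whatever R version is installed, as a small
sketch: look at the formal arguments of legend() rather than at the help page.
The variable name is a placeholder, and the result depends on the R version
in use.

```r
# TRUE if the installed graphics::legend() has a title.cex argument
has_title_cex <- "title.cex" %in% names(formals(graphics::legend))
has_title_cex
```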


HTH
Jannis



Re: [R] EM vs Bayesian

2011-04-29 Thread Mario Valle

Maybe it is better to ask this question on:

http://stats.stackexchange.com/

The question is not R specific.
mario

On 24-Apr-11 16:16, Jim Silverton wrote:

Hello,
Is there any literature out there that says that EM is better/worse than a
Bayesian model when it comes to differentiating univariate mixtures of normal
distributions?



--
Ing. Mario Valle
Data Analysis and Visualization Group| http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS)  | Tel:  +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax:  +41 (91) 610.82.82



[R] matrix evaluation using if function

2011-04-29 Thread ivan
Hi All,

I am trying to create a function which evaluates whether the values (which
are equal to one) of a matrix are the same as their mirror values. Consider
the following matrix:

 n <- matrix(cbind(c(0,1,1),c(1,0,0),c(0,1,0)),3,3)
 colnames(n) <- cbind("A","B","C"); rownames(n) <- cbind("A","B","C")
 n
  A B C
A 0 1 0
B 1 0 1
C 1 0 0

Hence, since n[2,1] and n[1,2] are 1 and the same, the function should
return the name of the row of n[2,1]. I used the following function:

for (i in length(rownames(n))) {

for (j in length(colnames(n))){

if(n[i,j]==n[j,i]){

rownames(n)[[i]]-output} else {}

}

}

 output
NULL

The right answer would have been "B", though. I simply do not see my
mistake. I am very grateful for suggestions.

Thank you.



Re: [R] Still confused about classes

2011-04-29 Thread Kenn Konstabel
The function for getting the year from a date is there in package
lubridate (as well as many other convenient functions to work with
dates).

More generally, finding all methods for a given class may be a
little tricky. If "all" means everything you have installed and
currently attached to your search path, then methods(class="Date") will
do it (for S3 classes). (But "The functions listed are those which
_are named like methods_ and may not actually be methods (known
exceptions are discarded in the code).") The result depends on which
packages you have loaded: in my currently open R session,
methods(class="Date") lists 36 possible methods, but after library(zoo) I
get two more (as.yearmon.Date and as.yearqtr.Date).
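
A small sketch of how that listing can be inspected programmatically (base R
only; the count will differ between sessions and R versions, so no number is
claimed here):

```r
# S3 methods visible for class "Date" in the current session
m <- methods(class = "Date")

n_methods  <- length(m)                 # grows as more packages are attached
has_format <- any(grepl("^format", m))  # is a format method among them?

n_methods
has_format
```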

Regards,
Kenn




Re: [R] using lme4 with three nested random effects

2011-04-29 Thread ONKELINX, Thierry
Dear Ben,

Are site, transect and plot factors? And do they have unique id's? 

You could try this

rws30.UL$site <- factor(rws30.UL$site)
rws30.UL$transect <- interaction(rws30.UL$site, rws30.UL$transect, drop = TRUE)
rws30.UL$plot <- interaction(rws30.UL$site, rws30.UL$transect, rws30.UL$plot, 
drop = TRUE)
modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
 +(1|site/transect/plot),
 data=rws30.UL, family=gaussian, na.action=na.omit)

Or 

modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
 +(1|site) + (1|transect) + (1|plot),
 data=rws30.UL, family=gaussian, na.action=na.omit)
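
To illustrate why the interaction() step matters, here is a toy sketch with
made-up data (the data frame d and the *_id columns are hypothetical, not
Ben's data): when transects and plots are numbered 1, 2, ... within each
site, the raw labels repeat across sites, and interaction() turns them into
unique ids.

```r
# Toy data: transects are numbered 1..2 within each site,
# plots are numbered 1..2 within each transect
d <- data.frame(site     = rep(c("s1", "s2"), each = 4),
                transect = rep(rep(1:2, each = 2), times = 2),
                plot     = rep(1:2, times = 4))

d$site        <- factor(d$site)
d$transect_id <- interaction(d$site, d$transect, drop = TRUE)
d$plot_id     <- interaction(d$site, d$transect, d$plot, drop = TRUE)

n_raw      <- nlevels(factor(d$transect))  # 2 labels as recorded
n_transect <- nlevels(d$transect_id)       # 4 distinct transects
n_plot     <- nlevels(d$plot_id)           # 8 distinct plots
```

With unique ids like these, the nested and the separate random-effect
specifications describe the same grouping.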

Best regards,

Thierry


ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek
team Biometrie & Kwaliteitszorg
Gaverstraat 4
9500 Geraardsbergen
Belgium

Research Institute for Nature and Forest
team Biometrics & Quality Assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium

tel. + 32 54/436 185
thierry.onkel...@inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than 
asking him to perform a post-mortem examination: he may be able to say what the 
experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure 
that a reasonable answer can be extracted from a given body of data.
~ John Tukey
  

 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of Benjamin Caldwell
 Sent: Friday 29 April 2011 0:37
 To: r-help
 Subject: [R] using lme4 with three nested random effects
 
 Hi all,
 I'm trying to fit models for data with three levels of nested random
 effects: site/transect/plot. For example,
 
 modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+
 bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
 +(1|site/transect/plot),
 data=rws30.UL, family=gaussian, na.action=na.omit)
 
 but I get the following error:
 
 Error: length(f1) == length(f2) is not TRUE In addition: 
 Warning messages:
 1: In plot:(transect:site) :
   numerical expression has 92 elements: only the first used
 2: In plot:(transect:site) :
   numerical expression has 92 elements: only the first used
 
 The formulation works for two nested effects (e.g. 1|site/transect)
 
 I can get it to run in lme
 modelincrBS <- lme(l.ru.ba.incr~shigo.av+pre.f.crwn.length+
 bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num,
 data=rws30.UL, random=(~1| site/transect/plot), na.action=na.omit)
 
 but I can't specify a distribution family in that package.
 
 Any help much appreciated.
 
 Ben Caldwell
 


[R] Reference variables by string in for loop

2011-04-29 Thread Michael Bach
Dear R Users,

I am trying to get the following to work better:

namevec <- c("one", "two", "three")
for (name in namevec) {
namedf <- eval(parse(text=paste(name, "_df", sep="")))
...
...
}

The rationale behind it being that I created variables with names
one_df, two_df and three_df earlier in the same script which I want to
reference inside the for loop.  Is there a more elegant way to do this?

Best Regards,
Michael Bach



[R] plot several histograms with same y-axes scaling using hist()

2011-04-29 Thread hck
Dear all

Problem: hist()-function, scale = "percent"

I want to generate histograms for changing underlying data. In order to make
them comparable, I want to fix the y-axis (vertical-axis) to, e.g., 0%, 10%,
20%, 30% as well as to fix the spaces, too. So the y-axis in each histogram
should be identical. Currently, I have 100 histograms and the y-axis scale
changes in each. 

Here is my code:

Hist(na.exclude(AA3), breaks=50, col="seashell3",
scale="percent", xlim=c(-1, 1), xlab="Bewertungsfehler",
ylab="Haeufigkeit (in %)", main="KBV", border="white")

I tried the ylim=c(…), but unfortunately it does not work.

Thanks for your help in advance!
Regards,
Hans




Re: [R] Reference variables by string in for loop

2011-04-29 Thread Kenn Konstabel
On Fri, Apr 29, 2011 at 1:03 PM, Michael Bach pha...@gmail.com wrote:
 Dear R Users,

 I am trying to get the following to work better:

 namevec <- c("one", "two", "three")
 for (name in namevec) {
    namedf <- eval(parse(text=paste(name, "_df", sep="")))
    ...
    ...
 }

 The rationale behind it being that I created variables with names
 one_df, two_df and three_df earlier in the same script which I want to
 reference inside the for loop.  Is there a more elegant way to do this?

Yes, one elegant way to do it would be using a named list instead of
separate variables.

X <- list()
X$one_df <- something
X[["two_df"]] <- something else
NAME <- "one_df"
X[[NAME]]
NAME <- "two_df"
X[[NAME]]
#etc

# the for loop could then be: for(name in names(X)) ... or for(element in X)

Another way (not elegant but better and shorter than the eval-parse
way) is to use get. ?get
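
A runnable sketch of both suggestions, with three small made-up data frames
standing in for one_df, two_df and three_df (their contents are purely
illustrative): mget() gathers the existing variables into a named list once,
and get() looks up a single object by name.

```r
# Stand-ins for the data frames created earlier in the script
one_df   <- data.frame(x = 1:3)
two_df   <- data.frame(x = 1:2)
three_df <- data.frame(x = 1L)

namevec <- c("one", "two", "three")

# Collect the existing objects into one named list, once
dfs <- mget(paste0(namevec, "_df"), envir = environment())

# Loop over the list instead of building code strings
for (nm in names(dfs)) {
  namedf <- dfs[[nm]]
  cat(nm, "has", nrow(namedf), "rows\n")
}

# get() fetches a single object by its name
same <- identical(get("two_df"), dfs[["two_df"]])
```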

Best regards,

Kenn



Re: [R] Reference variables by string in for loop

2011-04-29 Thread Nick Sabbe
Hi Michael.
This is a classic :-)

ObjectsOfInterest <- list(one_df, two_df, three_df)
for(namedf in ObjectsOfInterest){...}

or probably even better
sapply(ObjectsOfInterest, function(namedf){...})
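
For instance (a sketch with made-up data frames; naming the list elements is
my addition so that the result is labelled):

```r
one_df   <- data.frame(x = 1:3)
two_df   <- data.frame(x = 1:2)
three_df <- data.frame(x = 1L)

# Naming the elements makes the output self-describing
ObjectsOfInterest <- list(one = one_df, two = two_df, three = three_df)

# sapply() simplifies the per-element results to a named vector
n_rows <- sapply(ObjectsOfInterest, function(namedf) nrow(namedf))
n_rows
```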

hth.


Nick Sabbe
--
ping: nick.sa...@ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36

-- Do Not Disapprove






[R] is there a way/library for generating colorful noise in R ??

2011-04-29 Thread Ubuntu Diego
I would like to generate some noisy time series. I know that it is possible to 
classify noise by looking at the exponent (beta) of the relationship between 
the spectrum of the time series and the frequencies (i.e. spectrum ~ frequency 
^ beta ).  
Is there a way to generate White (beta=0), Pink (beta=-1), Brown (beta=-2), 
Blue (beta=1) and Violet (beta=2) noise in R?
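
One possible approach, sketched in base R under the definition given above
(spectrum ~ frequency ^ beta): take white noise, scale the amplitude of each
Fourier component by frequency^(beta/2), and transform back. The function
name colored_noise and its arguments are made up for this illustration, and
this is not a reference to any particular package.

```r
colored_noise <- function(n, beta) {
  white <- rnorm(n)
  # Symmetric frequency index for each FFT bin, so the result stays real
  k <- pmin(0:(n - 1), n - 0:(n - 1))
  # Amplitude gain k^(beta/2) gives power proportional to k^beta
  gain <- k^(beta / 2)
  gain[1] <- 0            # drop the zero-frequency (mean) component
  Re(fft(fft(white) * gain, inverse = TRUE)) / n
}

set.seed(42)
pink   <- colored_noise(4096, beta = -1)
brown  <- colored_noise(4096, beta = -2)
violet <- colored_noise(4096, beta =  2)
```

The series are not rescaled, so their variances differ a lot between beta
values; dividing by sd() would put them on a common scale.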

Thanks.



[R] Summer student internship placement at University of York / YCCSA / SEI (paid)

2011-04-29 Thread Corrado Topi
Dear R-lings,

I did not know which list to post to, because it is a studentship, so not 
really a job, so it did not fit the r-sig-jobs list, and it is about 
developing an extension package interfaced with R. I hope I did not upset 
anyone. If so, apologies.

The Centre For Complex systems Analysis at the University of York (YCCSA) in 
UK in collaboration with Stockholm Environment Institute is looking for a 
highly motivated student in Computer Science, Applied Mathematics, Applied 
Statistics or related fields for a 10-week paid student internship over the 
summer of 2011, starting in July, to collaborate in the development of an R package. 
The student will participate in research projects to develop prototypes for 
toolkits for statistical predictions of diversity and dissimilarity and the 
generation of spatial landscapes, with applications in the biological and 
environmental sciences. We require excellent development skills and experience 
in CUDA/OpenCL, and a strong foundation in Computing, Statistics / Applied 
Mathematics and Computer Graphics. We need an excellent problem solver, able 
to innovate, find solutions and work independently.

For further information on the project please contact ct...@york.ac.uk or go 
to http://www.york.ac.u...2011/201107.pdf

For further information on the studentship programme please look at 
http://www.york.ac.u...olarships.html.

Please send your application no later than the 13th of May to 
scholarsh...@yccsa.org as one single pdf document including:

1. Your CV (max 2 pages)
2. A brief personal statement (max 1 page) including:
* Which project(s) you are interested in (as many as you like but in 
preference order)
* Your reasons for applying
* Your academic interest
* Your future aspirations
3. A full written academic reference (not just contact details). Your 
application will not be accepted without this reference (max 1 page). 

Best,
-- 
Corrado Topi

Stockholm Environment Institute

Mob: +44 (0) 7769 601784
Tel: +44 (0) 1904 322893
Skype: corrado-eeos
Website:  sei-international.org

University of York
York YO10 5DD
UK

Fax: +44 (0) 1904 322898

EMAIL DISCLAIMER: http://www.york.ac.uk/docs/disclaimer/email.htm



Re: [R] plot several histograms with same y-axes scaling using hist()

2011-04-29 Thread Philipp Pagel

On Fri, Apr 29, 2011 at 03:35:41AM -0700, hck wrote:
 Problem: hist()-function, scale = "percent"
[...]
 Hist(na.exclude(AA3), breaks=50, col="seashell3",
 scale="percent", xlim=c(-1, 1), xlab="Bewertungsfehler",
 ylab="Haeufigkeit (in %)", main="KBV", border="white")

Before anyone can really help you'll need to let us know where your
Hist() function came from. 

hist() from package graphics does not have a scale parameter and
honours ylim without a problem.
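
As a sketch of how a fixed percentage axis could be done with hist() from
graphics (random data stand in for AA3, and the 0-30% range is just the
example from the original post): compute the histogram without plotting,
convert the counts to percentages, then plot with the same breaks and ylim
every time.

```r
pdf(NULL)                                  # null device, no window needed
set.seed(1)
x <- runif(200, -1, 1)                     # stand-in for na.exclude(AA3)

breaks <- seq(-1, 1, length.out = 51)      # identical 50 bins for every plot
h <- hist(x, breaks = breaks, plot = FALSE)
h$counts <- 100 * h$counts / sum(h$counts) # counts -> percent

# Same ylim in every call keeps the y-axes comparable across histograms
plot(h, freq = TRUE, ylim = c(0, 30), xlim = c(-1, 1),
     col = "seashell3", border = "white", main = "KBV",
     xlab = "Bewertungsfehler", ylab = "Haeufigkeit (in %)")
invisible(dev.off())
```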

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/



Re: [R] plot several histograms with same y-axes scaling using hist()

2011-04-29 Thread hck
Thanks for the note: Indeed, the function is the hist() function not Hist()
with capital letter.

I use the standard R hist()-function with the lower case only. Nevertheless,
the ylim does not work as supposed to. 



Re: [R] Putting x-axis in opposite order

2011-04-29 Thread Jim Lemon

On 04/29/2011 04:09 AM, Bogaso Christofer wrote:

Hi all, please consider this plot:



xx <- seq(4, 0.01, by = -0.04)

yy <- rnorm(xx)

plot(xx, yy, type="l")



Here you see my original 'xx' was in decreasing order, however R puts it in
increasing order. I understand that in any plot the x and y axes grow in
increasing order, however I am wondering whether I can manipulate this to
suit my particular problem above, so that the numbers displayed on the x-axis
would be in the given order.


Hi Bogaso,
If all else fails, have a look at rev.axis in the plotrix package.
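
Before reaching for a package, it may be enough to hand plot() a reversed
xlim; a minimal sketch in base graphics (the seed is only there for
reproducibility):

```r
pdf(NULL)                       # null device, no window needed
set.seed(1)
xx <- seq(4, 0.01, by = -0.04)
yy <- rnorm(length(xx))

# xlim given as c(max, min) makes the x-axis run from high to low
plot(xx, yy, type = "l", xlim = rev(range(xx)))

usr <- par("usr")               # plot region limits: x1, x2, y1, y2
invisible(dev.off())
```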

Jim



Re: [R] Reference variables by string in for loop

2011-04-29 Thread Michael Bach
Nick Sabbe nick.sa...@ugent.be writes:

 ObjectsOfInterest <- list(one_df, two_df, three_df)
 for(namedf in ObjectsOfInterest){...}

I see. This is also more readable and traceable for others.

 or probably even better
 sapply(ObjectsOfInterest, function(namedf){...})

I like this one for its functional style.

 hth.

It did, thanks.

Kind Regards,
Michael Bach



Re: [R] Reference variables by string in for loop

2011-04-29 Thread Michael Bach
Kenn Konstabel lebats...@gmail.com writes:

 Another way (not elegant but better and shorter than the eval-parse
 way) is to use get. ?get

This one is handy for interactive use, thanks for the hint.

Kind Regards,
Michael Bach



Re: [R] matrix evaluation using if function

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 4:27 AM, ivan wrote:


Hi All,

I am trying to create a function which evaluates whether the values  
(which
are equal to one) of a matrix are the same as their mirror values.  
Consider

the following matrix:


n <- matrix(cbind(c(0,1,1),c(1,0,0),c(0,1,0)),3,3)
colnames(n) <- cbind("A","B","C"); rownames(n) <- cbind("A","B","C")
n

 A B C
A 0 1 0
B 1 0 1
C 1 0 0

Hence, since n[2,1] and n[1,2] are 1 and the same, the function should
return the name of the row of n[2,1]. I used the following function:

for (i in length(rownames(n))) {

for (j in length(colnames(n))){

if(n[i,j]==n[j,i]){

rownames(n)[[i]]-output} else {}

}

}


output

NULL

The right answer would have been B, though.


Can you explain why A would not be an equally good answer to satisfy
your problem set up?


> which(n == t(n) & col(n) != row(n), arr.ind=TRUE)
  row col
B   2   1
A   1   2
> rownames(which(n == t(n) & col(n) != row(n), arr.ind=TRUE))
[1] "B" "A"

# Which would seem to be the correct answer, but
# this adds an additional constraint and also ensures no diagonal elements


> rownames(which(n == t(n) & col(n) != row(n) & lower.tri(n), arr.ind=TRUE))
[1] "B"





I simply do not see my
mistake.


I would rather program a problem correctly than hash through errors in
loop logic.
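
For anyone who does want to see what was wrong with the loop, here is a minimal sketch of a repaired version. It assumes the intent is "both mirrored cells equal 1, reported once per pair", which is an interpretation of the question rather than something stated in it.

```r
n <- matrix(c(0,1,1, 1,0,0, 0,1,0), 3, 3,
            dimnames = list(c("A","B","C"), c("A","B","C")))

output <- character(0)
for (i in seq_len(nrow(n))) {      # seq_len() visits 1, 2, 3; length() alone yields only 3
  for (j in seq_len(i - 1)) {      # strictly below the diagonal, i.e. j < i
    if (n[i, j] == 1 && n[j, i] == 1) {
      output <- c(output, rownames(n)[i])   # collect the row name instead of overwriting
    }
  }
}

# Vectorised equivalent under the same assumption.
idx <- which(n == 1 & t(n) == 1 & lower.tri(n), arr.ind = TRUE)
vectorised <- rownames(n)[idx[, "row"]]
```

The original loop ran over the single value length(rownames(n)), so only the cell n[3,3] was ever inspected.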

--

David Winsemius, MD
West Hartford, CT



Re: [R] plot several histograms with same y-axes scaling using hist()

2011-04-29 Thread Jim Lemon

On 04/29/2011 08:35 PM, hck wrote:

Dear all

Problem: hist()-function, scale = “percent”

I want to generate histograms for changing underlying data. In order to make
them comparable, I want to fix the y-axis (vertical axis) to, e.g., 0%, 10%,
20%, 30% and to fix the spacing, too, so that the y-axis in each histogram
is identical. Currently, I have 100 histograms and the y-axis scale
changes in each.

Here is my code:

Hist(na.exclude(AA3), breaks=50, col="seashell3",
scale="percent", xlim=c(-1, 1), xlab="Bewertungsfehler",
ylab="Haeufigkeit (in %)", main="KBV", border="white")

I tried the ylim=c(…), but unfortunately it does not work.


Hi Hans,
The barp function in plotrix can plot histograms (see the last example 
on the help page) and may be flexible enough to do what you want.
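
As an alternative that needs no extra package, the following is a rough sketch using base hist(): compute the histogram without plotting, rescale the counts to percent, and draw every plot with the same breaks, ylim and axis ticks. The two samples are invented, and the 0 to 30% range is taken from the question.

```r
set.seed(1)
# Hypothetical stand-ins for two of the 100 data sets.
samples <- list(a = rnorm(200, 0, 0.3), b = rnorm(200, 0.2, 0.5))
breaks  <- seq(-1, 1, length.out = 51)          # 50 equal-width bins on [-1, 1]
ticks   <- seq(0, 30, by = 10)

pdf(NULL)                                       # null device; drop this for on-screen plots
for (nm in names(samples)) {
  x <- samples[[nm]]
  x <- x[!is.na(x) & x >= -1 & x <= 1]          # hist() requires the breaks to span the data
  h <- hist(x, breaks = breaks, plot = FALSE)
  h$counts <- 100 * h$counts / sum(h$counts)    # counts -> percent of observations
  plot(h, axes = FALSE, ylim = range(ticks), col = "seashell3", border = "white",
       main = nm, xlab = "Bewertungsfehler", ylab = "Haeufigkeit (in %)")
  axis(1)
  axis(2, at = ticks, labels = paste0(ticks, "%"))  # identical y-axis in every plot
}
invisible(dev.off())
```

Because breaks, ylim and the tick positions are fixed outside the loop, every histogram shares the same vertical scale.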


Jim



Re: [R] Nomograms from rms' fastbw output objects

2011-04-29 Thread Frank Harrell
Hi Rob,

fastbw does not try to produce a full fit object.  You have to re-run the
fit manually based on what you (sometimes dangerously) learn from fastbw. 
If I can find a way to add a 'formula' component to the fastbw result then
you could do something like lrm(fastbw(fit)$formula, ...).

Frank


Rob James wrote:
 
 There is both a technical and a theoretical element to my question... 
 Should I be able to use the outputs which arise from the fastbw function 
 as inputs to nomogram().  I seem to be failing at this, -- I obtain a 
 subscript out of range error.
 
 That I can't do this may speak to technical failings, but I suspect it 
 is because Prof Harrell thinks/knows it injudicious. However,  I can't 
 invent a reason why nomograms should be restricted to the full models, 
 if the purpose of fastbw is to generate parsimonious models with 
 appropriate standard errors.
 
 I'd welcome comments on either the technical or the theoretical issues.
 
 Many thanks in advance,
 
 Rob James
 
 


-
Frank Harrell
Department of Biostatistics, Vanderbilt University


Re: [R] Element by Element addition of the columns of a Matrix

2011-04-29 Thread Pete Brecknock
... is the apply function what you are looking for?

A=matrix(1,2,4)

apply(A,1,sum)
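
A small sketch of equivalent ways to add the columns element by element, using the same toy matrix; rowSums() is the usual shortcut, and the Reduce() line spells out the column-by-column addition.

```r
A <- matrix(1, 2, 4)

by_apply   <- apply(A, 1, sum)    # sum across the columns, one row at a time
by_rowSums <- rowSums(A)          # same result, typically faster on large matrices
# Explicit element-by-element addition of column 1 + column 2 + ...
by_reduce  <- Reduce(`+`, lapply(seq_len(ncol(A)), function(j) A[, j]))
```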

HTH

Pete






[R] abline outside of plot region

2011-04-29 Thread Nick Sabbe
Hi R people.

 

I ran into this problem: I created a plot with errbars, like this:

 errbar(x=c(1,2,3,4), y=c(2,1,3,3), yminus=c(1.5,0.5,2.5,2.5),
yplus=c(2.5,1.5,3.5,3.5))

Next, I wanted to accentuate some x value with an abline, like this:

 abline(v=2)

 

In one of my R sessions (which admittedly I have had open for quite a while
now), the abline draws outside of the plotting region of errbars (till the
edge of my plotting window at least).

I tested for the cause by opening another session (clean) of the same
version of R (2.13), and running the same set of commands. In this session,
I do not have this behavior. Conclusion: I must have changed some graphical
parameter in my original session, but I don't know which one. Do you?

 

As an addendum: I also want to add a few specific axis ticks besides the
standard ones in my graph. I used axis for this, and it works. I set
col.ticks to match the color of my abline (in the nonsimplified code), and
this works too, but unfortunately, the label below the tick is not in this
color, and a parameter for this is not present in axis.

 

Suggestions for either? Note: I'm on windows 7 with R 2.13.
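
One graphical parameter that produces exactly this symptom is xpd, which controls clipping; whether it is the culprit in this particular session is only a guess. The sketch below checks and resets it, and also shows col.axis, which axis() accepts through its "..." argument to colour the tick label. Plain arrows() stand in for errbar() so that the example needs no add-on package.

```r
pdf(NULL)                             # null device; drop this for on-screen plots
par(xpd = NA)                         # simulate a session in which clipping was switched off
par("xpd")                            # anything other than FALSE lets abline() leave the plot region
par(xpd = FALSE)                      # restore the default: clip to the plot region

x <- 1:4
y <- c(2, 1, 3, 3)
plot(x, y, ylim = c(0.5, 3.5))
arrows(x, y - 0.5, x, y + 0.5, angle = 90, code = 3, length = 0.05)  # simple error bars
abline(v = 2, col = "red")
axis(1, at = 2.5, col.ticks = "red", col.axis = "red")   # coloured tick and coloured label
xpd_now <- par("xpd")
invisible(dev.off())
```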

 

Nick Sabbe

--

ping: nick.sa...@ugent.be

link:  http://biomath.ugent.be/ http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36

 

-- Do Not Disapprove

 




[R] 3-way contingency table

2011-04-29 Thread Mathias Walter
Hi,

I have a large data frame with many columns. A short example is given below:

> dataH
     host ms01 ms31 ms33 ms34
1  cattle    4   20    9    6
2   sheep    4    3    4    5
3  cattle    4    3    4    5
4  cattle    4    3    4    5
5   sheep    4    3    5    5
6    goat    4    3    4    5
7   sheep    4    3    5    5
8    goat    4    3    4    5
9    goat    4    3    4    5
10 cattle    4    3    4    5

Now I want to determine the frequencies of every unique value in
every column depending on the host column.

It is quite easy to determine the frequencies in total with the
following command:

> dataH2 <- dataH[,c(2,3,4,5)]
> table(as.matrix(dataH2), colnames(dataH2)[col(dataH2)], useNA="ifany")

     ms01 ms31 ms33 ms34
  3     0    9    0    0
  4    10    0    7    0
  5     0    0    2    9
  6     0    0    0    1
  9     0    0    1    0
  20    0    1    0    0

But I cannot manage to get it dependent on the host.

I tried

> xtabs(cbind(ms01, ms31, ms33, ms34) ~ ., dataH)

and many other ways but I was not successful.

I can get it for each column individually with

 with(dataH, table(host, ms33))

   ms33
host 4 5 9
 cattle 3 0 1
 deer   0 0 0
 goat   3 0 0
 human  0 0 0
 sheep  1 2 0
 tick   0 0 0

But I do not want to repeat the command for every column. I need a
single table which can be plotted as a balloon plot, for instance.

Does anybody know how to achieve this?

--
Kind regards,
Mathias



[R] question of VECM restricted regression

2011-04-29 Thread Meilan Yan
Dear Colleague

  I am trying to figure out how to use R to do OLS restricted VECM regression. 
However, there is some notation I cannot understand.

Please tell me what 'ect', 'sd' and 'LRM.dl1' are in the following example:

# OLS restricted VECM regression
library(urca)
data(denmark)
sjd <- denmark[, c("LRM", "LRY", "IBO", "IDE")]
sjd.vecm <- ca.jo(sjd, ecdet = "const", type="eigen", K=2, spec="longrun",
season=4)
sjd.vecm.rls <- cajorls(sjd.vecm, r=1)
summary(sjd.vecm.rls$rlm)
sjd.vecm.rls$beta

Response LRM.d :
Call:
lm(formula = substitute(LRM.d), data = data.mat)

Residuals:
      Min        1Q    Median        3Q       Max
-0.027598 -0.012836 -0.003395  0.015523  0.056034

Coefficients:
         Estimate Std. Error t value Pr(>|t|)
ect1    -0.212955   0.064354  -3.309  0.00185 **
sd1     -0.057653   0.010269  -5.614 1.16e-06 ***
sd2     -0.016305   0.009177  -1.777  0.08238 .
sd3     -0.040859   0.008767  -4.660 2.82e-05 ***
LRM.dl1  0.049816   0.191992   0.259  0.79646
LRY.dl1  0.075717   0.157902   0.480  0.63389
IBO.dl1 -1.148954   0.372745  -3.082  0.00350 **
IDE.dl1  0.227094   0.546271   0.416  0.67959

> sjd.vecm.rls$beta
            ect1
LRM.l2  1.000000
LRY.l2 -1.032949
IBO.l2  5.206919
IDE.l2 -4.215879


Many thanks
Meilan







[R] Plot multiple ctrees in the same figure

2011-04-29 Thread tudor
Dear all:

Is there a way one could plot two conditional inference trees (party
package, ctree) in a figure specified by layout?  My attempts failed as
plot.party seemed to take over the layout functionality and forced a single
ctree plot to be displayed.  A brief (non reproducible) example together
with the intended behavior follows below.  I hope I am not missing something
obvious.  My system: R 2.12.2 on a Windows machine with party 0.9-1 and
partykit 0.1-0.

Thanks.

Tudor


# CREATE ctrees
...
layout(matrix(c(1,2,0,2), 2, 2, byrow=TRUE), widths=c(1,2), heights=c(1,2))
plot(ctree1)  # plot first ctree
plot(ctree2)  # plot second ctree
...

   



Re: [R] Change the text size of the title in a legend of a R plot.

2011-04-29 Thread Victor Gabillon

thanks everyone for the help.

I ended up copying and pasting the legend function from the R source files.
I changed it so that the title.cex is not set by default to cex and so 
that this title.cex can be given as a parameter.


It works fine for me.
Note that if you make the title too big it goes out of the border as the 
borders were not designed for the case of a big title.
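
For readers who would rather not patch legend(), here is a rough alternative sketch: reserve a blank title line, then place the title with text() at its own size, using the layout that legend() returns invisibly. The entries and sizes are only illustrative, and a large title can still overflow the box, as noted above.

```r
pdf(NULL)                                         # null device; drop this for on-screen plots
plot(1:10)
items <- c("1", "2", "6", "10", "20")

# A blank title reserves one line at the top; legend() returns its box and text positions.
lg <- legend("topleft", legend = items, lty = 1, cex = 1.1, title = " ")

line_height <- lg$text$y[1] - lg$text$y[2]        # vertical spacing between legend rows
text(x = lg$rect$left + lg$rect$w / 2,            # horizontal centre of the legend box
     y = lg$text$y[1] + line_height,              # one row above the first entry
     labels = "Horizons", cex = 1.4, col = "black")
invisible(dev.off())
```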


Thanks again!!

Victor

On 29/04/2011 10:03, Jannis wrote:

On 04/29/2011 05:21 AM, Victor Gabillon wrote:

Horizo <- c(1,2,6,10,20)
legtext <- paste(Horizo, sep="")
legend("topleft", legend=legtext, col=col, text.col=col, lwd=lwd,
lty=lty, cex=1.1, ncol=3, title="Horizons", title.col="black",
title.cex=1.4)


I am not sure, but the manual page for legend seems to be incorrect
(or at least misleading). There is no title.cex argument for legend
(even though the help page mentions it). Either you set cex > 1, but
this will resize the labels as well, or you modify the code of legend
as follows:


change the following (near the end of the code):

   text2(left + w/2, top - ymax, labels = title, adj = c(0.5,
0), cex = cex, col = title.col)

to:

   text2(left + w/2, top - ymax, labels = title, adj = c(0.5,
0), cex = title.cex, col = title.col)

and add title.cex to the arguments of legend. It's probably easiest if
you copy the code of legend and save the modified version as a
different function.


I am not sure whom to contact about correcting the documentation of
legend(). Perhaps I am even wrong, but I could not find any reference
to title.cex in the code.


HTH
Jannis




[R] replace non numeric with NA

2011-04-29 Thread Nandini B

 Hello,
I have a sample data frame which looks like this
  day      od month
1   1     0.1     2
2   3 #VALUE!     1
3   5     0.4    12
4   7     0.8    10
5  11       -     3
6  14       s     7
7  18      --    12
8  27      19     7


Now I wish to filter out all the non-numeric values and replace them with NA. The
data frame is actually huge and the non-numeric entries vary from - to a
string to absolutely anything!
Can anyone please help?




Thank you,
Warm Regards,

Nandini 


  


Re: [R] Problem installing package sp in R 2.13.0

2011-04-29 Thread Roger Bivand
The rgdal package is not a dependency of sp, only suggested. In addition, you
are trying to install source packages, but should (probably) be installing
binaries, with type="mac.binary.leopard" the most likely. If you need the
OSX rgdal binary, make sure that CRAN extras is on your repository path
(see ?setRepositories), for example by calling setRepositories(ind=1:2).
If you really want to install source packages under OSX, be sure to read
up on this at

http://cran.r-project.org/bin/macosx/

looking for the FAQ, and links to tools. If you can manage with binary
packages, stay with them.

Roger


Arnaud Catherine wrote:
 
 Hi,
 
 I am having trouble installing package sp in R (2.13.0) on Mac
 OS X.
 I have tried installing the package using the GUI and the function
 install.packages, but it didn't work.
 
 Here is the error message I get:
 
 
 also installing the dependency ‘rgdal’
 
 trying URL 'http://cran.univ-lyon1.fr/src/contrib/rgdal_0.6-33.tar.gz'
 Content type 'application/x-gzip' length 1422992 bytes (1.4 Mb)
 opened URL
 ==
 downloaded 1.4 Mb
 
 trying URL 'http://cran.univ-lyon1.fr/src/contrib/sp_0.9-80.tar.gz'
 Content type 'application/x-gzip' length 738569 bytes (721 Kb)
 opened URL
 ==
 downloaded 721 Kb
 
 * installing *source* package ‘sp’ ...
 ** libs
 *** arch - i386
 sh: make: command not found
 ERROR: compilation failed for package ‘sp’
 * removing
 ‘/Library/Frameworks/R.framework/Versions/2.13/Resources/library/sp’
 ERROR: dependency ‘sp’ is not available for package ‘rgdal’
 
 The downloaded packages are in
 
 ‘/private/var/folders/8P/8P9oV0FHFI83GKIm2cPUOk+++TM/-Tmp-/RtmppsxaRa/downloaded_packages’
 * removing
 ‘/Library/Frameworks/R.framework/Versions/2.13/Resources/library/rgdal’
 
 
 
 Any help would be much appreciated!
 
 
 Best regards.
 
 
 
 
 
 Dr. Arnaud CATHERINE
 Post-Doctorant
 
 UMR 7245 CNRS/MNHN Molécules de Communication et Adaptation des
 Micro-organismes
 Equipe Cyanobactéries, Cyanotoxines et Environnement
 Muséum National d'Histoire Naturelle
 12, rue Buffon , Case 39
 75231 Paris Cedex 05
 
 Tel : + 33 (0)1 40 79 31 79
 Fax : +33 (0)1 40 79 35 94
 Email : arno...@mnhn.fr
 Site du Muséum National d'Histoire Naturelle : http://www.mnhn.fr
 
 
 
 
 
 
 
 
   [[alternative HTML version deleted]]
 
 
 


-
Roger Bivand
Economic Geography Section
Department of Economics
Norwegian School of Economics and Business Administration
Helleveien 30
N-5045 Bergen, Norway



Re: [R] How to define specially nested functions

2011-04-29 Thread Chee Chen
Hi, Jerome and Phil,
Thank you for your solutions. I have studied your code carefully, but I have
further questions, since I suspect these simple lines of code may not do the
real job I am about to describe. (Please forgive me for my shallowness!)

I guess I over-simplified my question. Basically, I need such a function as the
integrand for estimating an expectation by Monte Carlo methods.

Please allow me to state the problem in more detail:

I have to define a function for the Monte Carlo computation of a conditional
expectation and solve for the argument at which the expectation equals a
pre-specified value. Say the integrand is f(x,y,z), where x and z are
deterministic and y is random with distribution F.
I first feed x=x0 to f, then sample y from F, evaluate f(x0,y,z), and use
Monte Carlo to estimate the expectation. The expectation is then a function
of z only, say E(z). Finally, I solve for z such that, for example, E(z) = 0.5.
The function f itself is very complicated and has high-dimensional vectors as
arguments, except for z, which is a real number.

I am new to R and unexpectedly ran into this apparent symbolic limitation of R
just as I had almost finished programming all the major computations in it. I am
proficient in Matlab and Mathematica (where this is very easy to do), but as I am
now working in statistics I would like to continue in R unless it really cannot
do this (in which case I will have to recode in Mathematica).

Any further help is much appreciated!
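
The workflow described above does not need symbolic manipulation; a numerical sketch along the following lines may be enough. Everything concrete here is an assumption made for illustration: the integrand f, the value x0, the standard normal choice for F, and the search interval.

```r
set.seed(42)
# Hypothetical integrand; the real f in the question is far more complicated.
f <- function(x, y, z) plogis(x + y - z)

x0 <- 1
y  <- rnorm(1e4)            # one fixed sample from F (assumed standard normal here)

# Monte Carlo estimate of E[f(x0, Y, z)], an ordinary R function of z alone.
E_hat <- function(z) mean(f(x0, y, z))

# Solve E_hat(z) = 0.5; reusing the same sample y keeps E_hat smooth in z.
sol <- uniroot(function(z) E_hat(z) - 0.5, interval = c(-10, 10), tol = 1e-10)
sol$root
```

Drawing y once and reusing it for every z matters: resampling inside E_hat would make the function noisy and could stop uniroot() from converging.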
Best regards,
-Chee

 
From: Jerome Asselin 
Sent: Friday, April 29, 2011 12:25 AM
To: Chee Chen 
Cc: R -Help 
Subject: Re: [R] How to define specially nested functions


On Thu, 2011-04-28 at 23:08 -0400, Chee Chen wrote:
 Dear All,
 I would like to define a function: f(x,y,z) with three arguments x,y,z, such 
 that: given values for x,y,  f(x,y,z) is still a function of z and that I am 
 still allowed to find the root in terms of z when x,y are given.
 For example: f(x,y,z) =  x+y + (x^2-z),  given x=1,y=3, f(1,3,z)= 1+3+1-z is 
 a function of z, and then I can use R to find the root z=5.
 
 Thank you.
 -Chee

Interesting exercise.

I've got this function, which I think does what you're asking.

f <- function(x,y,z)
{
    fcall <- match.call()
    fargs <- NULL
    # Collect the arguments that were passed as bare symbols
    if(fcall$x == "x")
        fargs <- c(fargs, "x")
    if(fcall$y == "y")
        fargs <- c(fargs, "y")
    if(fcall$z == "z")
        fargs <- c(fargs, "z")

    ffunargs <- as.list(fargs)
    names(ffunargs) <- fargs

    # Build a new function whose body has the supplied values substituted in
    argslist <- list(fcall)
    ffun <- append(argslist, substitute( x+y + (x^2-z) ), after=0)[[1]]
    as.function(append(ffunargs, ffun))
}

This yields:

> f(3, 2, z)
function (z = "z") 
3 + 2 + (3^2 - z)
<environment: 0x132fdb8>
> f(3, 2, z)(3)
[1] 11

I haven't figured out how to get rid of the default argument value, shown
here as 'z = "z"'. That doesn't prevent it from working, but it's less
pretty.  If you find a better way, let me know.
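
A simpler route to the same result, offered here only as a sketch, is a closure: a function of x and y that returns a function of z. The name make_f and the search interval are assumptions made for illustration.

```r
make_f <- function(x, y) {
  function(z) x + y + (x^2 - z)   # x and y are captured from the enclosing call
}

g <- make_f(1, 3)                 # g(z) = 1 + 3 + (1^2 - z)
root <- uniroot(g, interval = c(-100, 100), tol = 1e-9)$root   # root at z = 5
value_at_3 <- make_f(3, 2)(3)     # same case as f(3, 2, z)(3) above, which gives 11
```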

HTH,
Jerome





Re: [R] Questions about lrm, validate, pentrace

2011-04-29 Thread Frank Harrell
Yes I would select that as the final model.  The difference you saw is caused
by different treatment of penalization of factor variables, related to the
use of the sum squared differences between the estimate at one category from
the average over all categories.  I think that as long as you code it one
way consistently and pick the penalty using that coding you are OK.  But if
the coefficients of the non-factor variables depend on how the binary
predictor is coded, there is a bit more concern.

Frank


細田弘吉 wrote:
 
 Thank you for your quick reply, Prof. Harrell.
 According to your advice, I ran pentrace using a very wide range.
 
   pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 100, by=0.5))
   plot(pentrace.x6factor)
 
 I attached this figure. Then,
 
   pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 10, by=0.05))
 
 It seems reasonable that the best penalty is 2.55.
 
   x6factor.lrm.pen <- update(x6factor.lrm, penalty=2.55)
   cbind(coef(x6factor.lrm), coef(x6factor.lrm.pen),
         abs(coef(x6factor.lrm)-coef(x6factor.lrm.pen)))
                       [,1]        [,2]        [,3]
 Intercept      -4.32434556 -3.86816460 0.456180958
 stenosis       -0.01496757 -0.01091755 0.004050025
 T1              3.04248257  2.42443034 0.618052225
 T2             -0.75335619 -0.57194342 0.181412767
 procedure      -1.20847252 -0.82589263 0.382579892
 ClinicalScore   0.37623189  0.30524628 0.070985611
 
   validate(x6factor.lrm, bw=F, B=200)
           index.orig training    test optimism index.corrected   n
 Dxy           0.6324   0.6849  0.5955   0.0894          0.5430 200
 R2            0.3668   0.4220  0.3231   0.0989          0.2679 200
 Intercept     0.0000   0.0000 -0.1924   0.1924         -0.1924 200
 Slope         1.0000   1.0000  0.7796   0.2204          0.7796 200
 Emax          0.0000   0.0000  0.0915   0.0915          0.0915 200
 D             0.2716   0.3229  0.2339   0.0890          0.1826 200
 U            -0.0192  -0.0192  0.0243  -0.0436          0.0243 200
 Q             0.2908   0.3422  0.2096   0.1325          0.1582 200
 B             0.1272   0.1171  0.1357  -0.0186          0.1457 200
 g             1.6328   1.9879  1.4940   0.4939          1.1389 200
 gp            0.2367   0.2502  0.2216   0.0286          0.2080 200
 
 
   validate(x6factor.lrm.pen, bw=F, B=200)
           index.orig training    test optimism index.corrected   n
 Dxy           0.6375   0.6857  0.6024   0.0833          0.5542 200
 R2            0.3145   0.3488  0.3267   0.0221          0.2924 200
 Intercept     0.0000   0.0000  0.0882  -0.0882          0.0882 200
 Slope         1.0000   1.0000  1.0923  -0.0923          1.0923 200
 Emax          0.0000   0.0000  0.0340   0.0340          0.0340 200
 D             0.2612   0.2571  0.2370   0.0201          0.2411 200
 U            -0.0192  -0.0192 -0.0047  -0.0145         -0.0047 200
 Q             0.2805   0.2763  0.2417   0.0346          0.2458 200
 B             0.1292   0.1224  0.1355  -0.0132          0.1423 200
 g             1.2704   1.3917  1.5019  -0.1102          1.3805 200
 gp            0.2020   0.2091  0.2229  -0.0138          0.2158 200
 
 In the penalized model (x6factor.lrm.pen), the apparent Dxy is 0.64, and 
 bias-corrected Dxy is 0.55. The maximum absolute error is estimated to 
 be 0.034, smaller than non-penalized model (0.0915 in x6factor.lrm) The 
 changes in slope and intercept are substantially reduced in penalized 
 model. I think overfitting is improved at least to some extent. Should I 
 select this as a final model?
 
 I have one more question. The procedure variable was defined as 0/1 
 value in the previous mail. For some graphical reason, I redefined it as 
 treat1/treat2 value. Then, the best penalty value was changed from 3.05 
 to 2.55. I guess change from numeric to factorial caused this reduction 
 in penalty. Which set up should I select?
 
 I appreciate your help in advance.
 
 -- 
 KH
 
 (11/04/26 0:21), Frank Harrell wrote:
 You've done a lot of good work on this.  Yes I would say you have moderate
 overfitting with the first model.  The only thing that saved you from having
 severe overfitting is that there seems to be a signal present [I am assuming
 this model is truly pre-specified and was not developed at all by looking at
 patterns of responses Y.]

 The use of backwards stepdown demonstrated much worse overfitting.  This is
 in line with what we know about the damage of stepwise selection methods
 that do not incorporate shrinkage.  I would throw away the stepwise
 regression model.  You'll find that the model selected is entirely
 arbitrary.  And you can't use the selected variables in any re-fit of the
 model, i.e., you can't use lrm pretending that the two remaining variables
 were pre-specified.  Stepwise regression methods only seem to help.  When
 assessed properly we see that is an illusion.

 You are using penalization properly but you did not print the full table of
 penalties vs. effective AIC.  We don't have faith that 

[R] threshold matrix

2011-04-29 Thread Alaios
Dear all,
I have quite a big matrix which I would like to threshold.
If the value is below the threshold the cell should be zero,
and if the value is over the threshold the cell should be one.

One really simple way to do that is to have a nested loop and check cell by
cell.

The problem is that this seems to be really time-consuming and inefficient.

What do you suggest I try?
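
A vectorised sketch of the thresholding follows: the comparison is applied to the whole matrix at once and the resulting logical matrix is turned into 0/1. The matrix and the threshold are invented, and cells exactly equal to the threshold are mapped to 0 here, since the question leaves that case open.

```r
set.seed(1)
m <- matrix(runif(12), 3, 4)   # stand-in for the large matrix
threshold <- 0.5

# One comparison over all cells; multiplying by 1 converts TRUE/FALSE to 1/0.
b <- (m > threshold) * 1
```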

I would like to thank you in advance for your help


Best Regards
Alex



Re: [R] replace non numeric with NA

2011-04-29 Thread James W. MacDonald

Hi Nandini,

On 4/29/2011 6:45 AM, Nandini B wrote:


  Hello,
I have a sample data frame which looks like this
  day      od month
1   1     0.1     2
2   3 #VALUE!     1
3   5     0.4    12
4   7     0.8    10
5  11       -     3
6  14       s     7
7  18      --    12
8  27      19     7



> x <- data.frame(day=1:8, od =
+ c("0.1","#VALUE!","0.4","0.8","-","s","--","19"), month = c(2,1,12,10,3,7,12,7))

> x
  day      od month
1   1     0.1     2
2   2 #VALUE!     1
3   3     0.4    12
4   4     0.8    10
5   5       -     3
6   6       s     7
7   7      --    12
8   8      19     7
> x$od <- as.numeric(as.character(x$od))
Warning message:
NAs introduced by coercion
> x
  day   od month
1   1  0.1     2
2   2   NA     1
3   3  0.4    12
4   4  0.8    10
5   5   NA     3
6   6   NA     7
7   7   NA    12
8   8 19.0     7


Best,

Jim




Now I wish to filter out all the non-numeric values and replace them with NA. The data frame
is actually huge and the non-numeric entries vary from - to a string to absolutely
anything!
Can anyone please help?




Thank you,
Warm Regards,

Nandini





--
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826
**
Electronic Mail is not secure, may not be read every day, and should not be used for urgent or sensitive issues 




Re: [R] 3-way contingency table

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 6:47 AM, Mathias Walter wrote:


Hi,

I have a large data frame with many columns. A short example is given
below:

> dataH
     host ms01 ms31 ms33 ms34
1  cattle    4   20    9    6
2   sheep    4    3    4    5
3  cattle    4    3    4    5
4  cattle    4    3    4    5
5   sheep    4    3    5    5
6    goat    4    3    4    5
7   sheep    4    3    5    5
8    goat    4    3    4    5
9    goat    4    3    4    5
10 cattle    4    3    4    5

Now I want to determine the frequencies of every unique value in
every column depending on the host column.

It is quite easy to determine the frequencies in total with the
following command:

> dataH2 <- dataH[,c(2,3,4,5)]
> table(as.matrix(dataH2), colnames(dataH2)[col(dataH2)], useNA="ifany")

     ms01 ms31 ms33 ms34
  3     0    9    0    0
  4    10    0    7    0
  5     0    0    2    9
  6     0    0    0    1
  9     0    0    1    0
  20    0    1    0    0

But I cannot manage to get it dependent on the host.

I tried

> xtabs(cbind(ms01, ms31, ms33, ms34) ~ ., dataH)

and many other ways but I was not successful.

I can get it for each column individually with


with(dataH, table(host, ms33))


  ms33
host 4 5 9
cattle 3 0 1
deer   0 0 0
goat   3 0 0
human  0 0 0
sheep  1 2 0
tick   0 0 0

But I do not want to repeat the command for every column. I need a
single table which can be plotted as a balloon plot, for instance.


You have obviously not given us the full data from which your correct
answer was drawn, but see if this is going in the right direction:


> require(reshape)
> dataHm <- melt(dataH)
Using host as id variables
> xtabs(~host+value+variable, dataHm)
, , variable = ms01

value
host 3 4 5 6 9 20
  cattle 0 4 0 0 0  0
  goat   0 3 0 0 0  0
  sheep  0 3 0 0 0  0

, , variable = ms31

value
host 3 4 5 6 9 20
  cattle 3 0 0 0 0  1
  goat   3 0 0 0 0  0
  sheep  3 0 0 0 0  0

, , variable = ms33

value
host 3 4 5 6 9 20
  cattle 0 3 0 0 1  0
  goat   0 3 0 0 0  0
  sheep  0 1 2 0 0  0

, , variable = ms34

value
host 3 4 5 6 9 20
  cattle 0 0 3 1 0  0
  goat   0 0 3 0 0  0
  sheep  0 0 3 0 0  0
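
If the reshape package is not at hand, the same table can be built in base R. The sketch below rebuilds dataH from the ten rows shown in the question (so the unused host levels deer, human and tick are absent), stacks the marker columns into long form, and then flattens the 3-way table into a single 2-d table with ftable().

```r
dataH <- data.frame(
  host = c("cattle","sheep","cattle","cattle","sheep","goat","sheep","goat","goat","cattle"),
  ms01 = 4,
  ms31 = c(20, rep(3, 9)),
  ms33 = c(9, 4, 4, 4, 5, 4, 5, 4, 4, 4),
  ms34 = c(6, rep(5, 9))
)

# Long format: one row per (host, marker column, value), like melt() would give.
marker_cols <- setdiff(names(dataH), "host")
long <- data.frame(
  host     = rep(dataH$host, times = length(marker_cols)),
  variable = rep(marker_cols, each = nrow(dataH)),
  value    = unlist(dataH[marker_cols], use.names = FALSE)
)

tab3 <- xtabs(~ host + value + variable, data = long)    # same 3-way table as above
flat <- ftable(tab3, row.vars = c("variable", "host"))   # one flat table for plotting
```

The flat table has one row per marker/host pair and one column per observed value, which may be a convenient starting point for a balloon plot.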



Does anybody know how to achieve this?

--
Kind regards,
Mathias



David Winsemius, MD
West Hartford, CT



Re: [R] replace non numeric with NA

2011-04-29 Thread Duncan Murdoch

On 29/04/2011 6:45 AM, Nandini B wrote:

  Hello,
I have a sample data frame which looks like this
  day      od month
1   1     0.1     2
2   3 #VALUE!     1
3   5     0.4    12
4   7     0.8    10
5  11       -     3
6  14       s     7
7  18      --    12
8  27      19     7


Now I wish to filter out all the non-numeric values and replace them with NA. The data frame
is actually huge and the non-numeric entries vary from - to a string to absolutely
anything!
Can anyone please help?


You don't tell us the types of the columns, so I'll assume they are
factors.  If so, call


as.numeric(as.character())

on each of them to convert the number-like values to numbers and the
others to NA.  For example,


df$day <- as.numeric(as.character(df$day))
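
To apply the same conversion to every column of a large data frame in one step, something like the following sketch may help. The data frame is rebuilt from the example in the question, and to_num is a hypothetical helper name.

```r
df <- data.frame(
  day   = c(1, 3, 5, 7, 11, 14, 18, 27),
  od    = c("0.1", "#VALUE!", "0.4", "0.8", "-", "s", "--", "19"),
  month = c(2, 1, 12, 10, 3, 7, 12, 7),
  stringsAsFactors = TRUE          # mimic columns read in as factors
)

# Convert via character so factor codes are not mistaken for values;
# the expected "NAs introduced by coercion" warning is silenced.
to_num <- function(col) suppressWarnings(as.numeric(as.character(col)))

df[] <- lapply(df, to_num)         # df[] keeps the data frame structure
```

Note that strings such as "1e5" or "Inf" are valid numbers to as.numeric(), so they are converted rather than replaced by NA.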

Duncan Murdoch



Re: [R] Speed up plotting to MSWindows graphics window

2011-04-29 Thread jim holtman
If you are plotting that many data points, you  might want to look at
'hexbin' as a way of aggregating the values to a different
presentation.  It is especially nice if you are doing a scatter plot
with a lot of data points and trying to make sense out of it.
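
For line plots of long time series, a related way to reduce the number of plotted points is to keep only the minimum and maximum sample in each time bin, so that spikes survive the thinning. This is a base R sketch under made-up assumptions (the signal, the bin count, and the helper name decimate_minmax), not something taken from the hexbin package.

```r
decimate_minmax <- function(tt, y, n_bins = 1000) {
  bin <- cut(tt, breaks = n_bins, labels = FALSE)          # assign each sample to a time bin
  idx <- seq_along(y)
  lo  <- tapply(idx, bin, function(i) i[which.min(y[i])])  # index of the minimum in each bin
  hi  <- tapply(idx, bin, function(i) i[which.max(y[i])])  # index of the maximum in each bin
  keep <- sort(unique(c(lo, hi)))                          # restore time order
  list(tt = tt[keep], y = y[keep])
}

set.seed(1)
tt <- seq(0, 100, by = 0.001)                # 100 s sampled at 1 kHz, as in the question
y  <- sin(tt) + rnorm(length(tt), sd = 0.1)  # invented signal
d  <- decimate_minmax(tt, y)

pdf(NULL)                                    # null device; drop this for on-screen plots
plot(d$tt, d$y, type = "l")
invisible(dev.off())
```

With about 1000 bins the plot draws at most 2000 points per series regardless of the log length, and the bin count can be tied to the pixel width of the window.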

On Wed, Apr 27, 2011 at 5:16 AM, Jonathan Gabris jonat...@k-m-p.nl wrote:
 Hello,

 I am working on a project analysing the performance of motor-vehicles
 through messages logged over a CAN bus.

 I am using R 2.12 on Windows XP and 7

 I am currently plotting the data in R, overlaying 5 or more plots of data,
 logged at 1kHz, (using plot.ts() and par(new = TRUE)).
 The aim is to be able to pan, zoom in and out and get values from the
 plotted graph using a custom Qt interface that is used as a front end to
 R.exe (all this works).
 The plot is drawn by R directly to the windows graphic device.

 The data is imported from a .csv file (typically around 100MB) to a matrix.
 (timestamp, message ID, byte0, byte1, ..., byte7)
 I then separate this matrix into several by message ID (dimensions are in
 the order of 8cols, 10^6 rows)

 The panning is done by redrawing the plots, shifted by a small amount. So as
 to view a window of data from a second to a minute long that can travel the
 length of the logged data.

 My problem is that, the redrawing of the plots whilst panning is too slow
 when dealing with this much data.
 i.e.: I can see the last graphs being drawn to the screen in the half-second
 following the view change.
 I need a fluid change from one view to the next.

 My question is this:
 Are there ways to speed up the plotting on the MSWindows display?
 By reducing plotted point densities to *sensible* values?
 Using something other than plot.ts() - is the lattice package faster?
 I don't need publication quality plots, they can be rougher...

 I have tried:
 -Using matrices instead of dataframes - (works for calculations but not
 enough for plots)
 -increasing the max usable memory (max-mem-size) - (no change)
 -increasing the size of the pointer protection stack (max-ppsize) - (no
 change)
 -deleting the unnecessary leftover matrices - (no change)
 -I can't use lines() instead of plot() because of the very  different scales
 (rpm-1, flags -1to3)

 I am going to do some resampling of the logged data to reduce the vector
 sizes.
 (removal of *less* important data and use of window.ts())

 But I am currently running out of ideas...
 So if somebody could point out something, I would be grateful.

 Thanks,

 Jonathan Gabris

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?



Re: [R] abline outside of plot region

2011-04-29 Thread Peter Ehlers

On 2011-04-29 06:14, Nick Sabbe wrote:

Hi R people.



I ran into this problem: I created a plot with errbars, like this:


errbar(x=c(1,2,3,4), y=c(2,1,3,3), yminus=c(1.5,0.5,2.5,2.5),

yplus=c(2.5,1.5,3.5,3.5))

Next, I wanted to accentuate some x value with an abline, like this:


abline(v=2)




In one of my R sessions (which admittedly I have had open for quite a while
now), the abline draws outside of the plotting region of errbars (till the
edge of my plotting window at least).

I tested for the cause by opening another session (clean) of the same
version of R (2.13), and running the same set of commands. In this session,
I do not have this behavior. Conclusion: I must have changed some graphical
parameter in my original session, but I don't know which one. Do you?



As an addendum: I also want to add a few specific axis ticks besides the
standard ones in my graph. I used axis for this, and it works. I set
col.ticks to match the color of my abline (in the nonsimplified code), and
this works too, but unfortunately, the label below the tick is not in this
color, and a parameter for this is not present in axis.



Suggestions for either? Note: I'm on windows 7 with R 2.13.


  plot(1:4, xaxt='n')
  axis(1, at=2:3, lab=c('a', 'b'),
   col.ticks=3, col.axis=2, lwd=0, lwd.ticks=1)
  par(xpd = TRUE)
  abline(v = 4)
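
A hedged sketch of how the same effect can be confined to a single call, assuming (as the example above suggests) that a leftover par(xpd = TRUE) is what lets abline() draw outside the plot region in the long-running session. The null PDF device is only there so the sketch runs without opening a window.

```r
pdf(NULL)                       # null device; drop this line to draw on screen
plot(1:4, xaxt = "n")
axis(1, at = 2:3, labels = c("a", "b"),
     col.ticks = 3, col.axis = 2, lwd = 0, lwd.ticks = 1)

old <- par(xpd = TRUE)          # clipping to the plot region is switched off
abline(v = 4)                   # this line may extend into the margins
par(old)                        # restore the previous setting
abline(v = 2)                   # clipped to the plot region again

xpd_after <- par("xpd")
invisible(dev.off())
```

In an affected session, par("xpd") shows the current value, and par(xpd = FALSE) resets it.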

Peter Ehlers





Nick Sabbe

--

ping: nick.sa...@ugent.be

link:http://biomath.ugent.be/  http://biomath.ugent.be

wink: A1.056, Coupure Links 653, 9000 Gent

ring: 09/264.59.36



-- Do Not Disapprove






Re: [R] threshold matrix

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 9:37 AM, Alaios wrote:


Dear all,
I have a quite big matrix which I would like to threshold.
If the value is below threshold the cell should be zero
and
if the value is over threshold the cell should be one


M2 <- M
M2[M < thresh] <- 0
M2[M >= thresh] <- 1

or perhaps simply:

M2 <- as.numeric( M[] >= thresh )


One really simple way to do that is to have a nested loop and check
cell by cell.


The problem is that this seems to be really time consuming and
inefficient.


What do you suggest me to try out?


--

David Winsemius, MD
West Hartford, CT



Re: [R] matrix evaluation using if function

2011-04-29 Thread Berend Hasselman

David Winsemius wrote:
 
 On Apr 29, 2011, at 4:27 AM, ivan wrote:
 
 Hi All,

 I am trying to create a function which evaluates whether the values  
 (which
 are equal to one) of a matrix are the same as their mirror values.  
 Consider
 the following matrix:

 n <- matrix(cbind(c(0,1,1),c(1,0,0),c(0,1,0)),3,3)
 colnames(n) <- cbind("A","B","C"); rownames(n) <- cbind("A","B","C")
 n
  A B C
 A 0 1 0
 B 1 0 1
 C 1 0 0

 Hence, since n[2,1] and n[1,2] are 1 and the same, the function should
 return the name of the row of n[2,1]. I used the following function:

 for (i in length(rownames(n))) {

 for (j in length(colnames(n))){

 if(n[i,j]==n[j,i]){

 rownames(n)[[i]] -> output} else {}

 }

 }

 output
 NULL

 The right answer would have been B, though.
 
 Can you explain why A would not be an equally good answer to satisfy  
 your problem set up?
 
 > which(n == t(n) & col(n) != row(n), arr.ind=TRUE)
   row col
 B   2   1
 A   1   2
 > rownames(which(n == t(n) & col(n) != row(n), arr.ind=TRUE))
 [1] "B" "A"
 
 # Which would seem to be the correct answer, but
 # this adds an additional constraint and also ensures no diagonal
 # elements
 
 > rownames(which(n == t(n) & col(n) != row(n) & lower.tri(n),
                  arr.ind=TRUE))
 [1] "B"
 

Wouldn't this do it too (since the diagonal is set to FALSE by lower.tri)?

rownames(which(n == t(n) & lower.tri(n), arr.ind=TRUE))

Berend
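
A self-contained sketch of the lower-triangle approach, for checking it against the matrix from the question. The extra condition n == 1 is an assumption added here because the question only asks about cells equal to one; the expressions quoted above would also match mirrored zeros.

```r
n <- matrix(c(0, 1, 1,
              1, 0, 0,
              0, 1, 0), 3, 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# Cells that equal 1, equal their mirror cell, and lie below the diagonal
hits <- which(n == 1 & n == t(n) & lower.tri(n), arr.ind = TRUE)
rownames(hits)
```

Because lower.tri() excludes the diagonal and the upper triangle, each mirrored pair is reported once, under the row name of its lower-triangle cell.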



--
View this message in context: 
http://r.789695.n4.nabble.com/matrix-evaluation-using-if-function-tp3483188p3483785.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

2011-04-29 Thread Giovanni Petris
Well, but the original poster also refers to 0.2 and 0.8 as expected
 min and max, in which case we are back to a joke...

Giovanni


On Thu, 2011-04-28 at 13:06 -0400, David Winsemius wrote:
 On Apr 28, 2011, at 12:09 PM, Ravi Varadhan wrote:
 
  Surely you must be joking, Mr. Jianfeng.
 
 
 Perhaps not joking and perhaps not with correct statistical  
 specification.
 
 A truncated Normal could be simulated with:
 
 set.seed(567)
 x <- rnorm(n=5, m=1, sd=1)
 xtrunc <- x[x >= 0.2 & x <= 0.8]
 require(logspline)
 plot(logspline(xtrunc, lbound=0.2, ubound=0.8, nknots=7))
 
 -- 
 David.
 
  ---
  Ravi Varadhan, Ph.D.
  Assistant Professor,
  Division of Geriatric Medicine and Gerontology School of Medicine  
  Johns Hopkins University
 
  Ph. (410) 502-2619
  email: rvarad...@jhmi.edu
 
 
  -Original Message-
  From:
 r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
  ] On Behalf Of Mao Jianfeng
  Sent: Thursday, April 28, 2011 12:02 PM
  To: r-help@r-project.org
  Subject: [R] how to generate a normal distribution with mean=1,  
  min=0.2, max=0.8
 
  Dear all,
 
  This is a simple probability problem. I want to know how to generate a
  normal distribution with mean=1, min=0.2 and max=0.8.
 
  I know how to generate a normal distribution with mean = 1 and sd = 1
  and with 500 data points.
 
  rnorm(n=500, m=1, sd=1)
 
  But I am confused about how to generate a normal distribution with an
  expected min and max. I look forward to your directions.
 
  Thanks in advance.
 
  Best,
  Jian-Feng,
 
 
 David Winsemius, MD
 West Hartford, CT
 
 
 
-- 

Giovanni Petris  gpet...@uark.edu
Associate Professor
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/



Re: [R] threshold matrix

2011-04-29 Thread Alaios
Thanks a lot.
I finally used

M2 <- M
M2[M < thresh] <- 0
M2[M >= thresh] <- 1

as I noticed that this one line

M2 <- as.numeric( M[] >= thresh )
vectorizes my matrix.

One more question: I have two matrices that only differ slightly. What will be
the easiest way to compare them and find the cells that are not the same?

Best Regards
Alex

--- On Fri, 4/29/11, David Winsemius dwinsem...@comcast.net wrote:

 From: David Winsemius dwinsem...@comcast.net
 Subject: Re: [R] threshold matrix
 To: Alaios ala...@yahoo.com
 Cc: R-help@r-project.org
 Date: Friday, April 29, 2011, 2:57 PM
 
 On Apr 29, 2011, at 9:37 AM, Alaios wrote:
 
  Dear all,
  I have a quite big matrix which I would like to threshold.
  If the value is below threshold the cell should be zero
  and
  if the value is over threshold the cell should be one
 
 M2 <- M
 M2[M < thresh] <- 0
 M2[M >= thresh] <- 1
 
 or perhaps simply:
 
 M2 <- as.numeric( M[] >= thresh )
  
  One really simple way to do that is to have a nested loop and check
  cell by cell.
  
  The problem is that this seems to be really time consuming and
  inefficient.
  
  What do you suggest me to try out?
 
 --
 David Winsemius, MD
 West Hartford, CT
 




Re: [R] Using Java methods in R

2011-04-29 Thread hill0093
How do I obtain a strictly rectangular type-double array (converted to an
R 2-dimensional array) from a Java class?
I can obtain a 1-dimensional type-double array (vector) or a scalar,
but I cannot figure out the two-dimensional case from the instructions.
Is .jevalArray also involved?
My simple Java test class and R test code follow:

import java.lang.reflect.Array;
public class RJavTest {
  public static void main(String[] args) { RJavTest rJavTest = new RJavTest(); }
  public final static String conStg = "testString";
  public final static double con0dbl = 10001;
  public final static double[] con1Arr = new double[] {
    10001,10002,10003,10004,10005,10006 };
  public final static double[][] con2Arr = new double[][] {
    { 10001,10002,10003,10004 },
    { 20001,20002,20003,20004 },
    { 30001,30002,30003,30004 } };
  public final static String retConStg() { return(conStg); }
  public final static double retCon0dbl() { return(con0dbl); }
  public final static double[] retCon1Arr() { return(con1Arr); }
  public final static double[][] retCon2Arr() { return(con2Arr); }
}

library(rJava)
.jinit()
.jaddClassPath("C:/ad/j")
print(.jclassPath())
rJavaTst <- .jnew("RJavTest")
conn1Arr <- .jfield(rJavaTst,sig="[D","con1Arr")
print(conn1Arr)
print(conn1Arr[2])
conn1ArrRet <- .jcall(rJavaTst,returnSig="[D","retCon1Arr")
print(conn1ArrRet)
print(conn1ArrRet[2])
conn0dbl <- .jfield(rJavaTst,sig="D","con0dbl")
print(conn0dbl)
##The above works, but not the following
conn2Arr <- .jfield(rJavaTst,sig="[[D","con2Arr")
print(conn2Arr[2])
print(conn2Arr[2,3])
print(conn2Arr)
arj34Ret <- .jcall(rJavaTst,returnSig="[[D","arReturnTEST")
print(arj34Ret)

The latter 2-dim stuff doesn't work.



--
View this message in context: 
http://r.789695.n4.nabble.com/Using-Java-methods-in-R-tp3469299p3483862.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] threshold matrix

2011-04-29 Thread Petr Savicky
On Fri, Apr 29, 2011 at 07:44:59AM -0700, Alaios wrote:
 Thanks a lot.
 I finally used
 
 M2 <- M
 M2[M < thresh] <- 0
 M2[M >= thresh] <- 1
 
 as I noticed that this one line
 
 M2 <- as.numeric( M[] >= thresh )
 vectorizes my matrix.

Hi.

This may be avoided, for example

  M2 <- M
  M2[, ] <- as.numeric(M >= thresh)

or

  array(as.numeric(M >= thresh), dim=dim(M))

 One more question: I have two matrices that only differ slightly. What will be
 the easiest way to compare them and find the cells that are not the same?

If A and B are matrices of the same dimension, then

  A == B

is a logical matrix with TRUE entries for positions where
A and B match exactly.

  abs(A - B) <= eps

is a logical matrix with TRUE entries for positions where
A and B differ at most by eps.

If you want to get only one logical result, then use

  all(A == B)

for exact equality and

  all(abs(A - B) <= eps)

for approximate equality of all entries.

See also ?all.equal, which uses the relative error, not absolute
difference.
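
The comparisons above can be turned into the positions of the differing cells with which(..., arr.ind = TRUE). The following is a small sketch under assumed example data; the matrices A and B and the tolerance eps are made up for illustration.

```r
A <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)
B <- A
B[1, 1] <- 0            # a clear difference
B[2, 3] <- 6 + 1e-3     # a tiny difference

# Row/column indices of cells that are not exactly equal
diff_exact <- which(A != B, arr.ind = TRUE)

# Row/column indices of cells that differ by more than a tolerance
eps <- 1e-2
diff_tol <- which(abs(A - B) > eps, arr.ind = TRUE)

# Single logical answer, using all.equal's default tolerance
same <- isTRUE(all.equal(A, B))
```

Here diff_exact lists both changed cells, while diff_tol lists only the cell whose difference exceeds eps.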

Hope this helps.

Petr Savicky.



[R] Specify custom par(mfrow()) layout for defined plot()

2011-04-29 Thread Michael Bach
Dear R Users,

I am doing stats::decompose() on 4 different time series.  When I issue

csdA - decompose(tsA)
plot(csdA)

I get a summary plot for the observed, trend, seasonal and random components
of the decomposed time series tsA.  As I understand it, the object returned
by decompose() has its own plot method where par(mfrow = c(4, 1)) etc. is
defined.  Now suppose I wanted to wrap those four-panel plots into my own
par(mfrow = c(2, 2)) layout.  How could I achieve this?  Is there a general way to
handle these cases?  Something like a meta par(mfrow())?

Best Regards,
Michael Bach
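
One possible workaround, offered only as a sketch and not as an answer from the thread: rather than nesting the built-in four-panel plot, pull individual components out of each decompose() result and draw them in a layout of your own. The four simulated series below are hypothetical stand-ins for the poster's data.

```r
set.seed(1)
# Four made-up monthly series with a trend and a seasonal cycle
series <- lapply(1:4, function(i) {
  t <- 1:48
  ts(t / 10 + sin(2 * pi * t / 12) + rnorm(48, sd = 0.3), frequency = 12)
})
comps <- lapply(series, decompose)

pdf(NULL)                          # null device; drop this line to draw on screen
old <- par(mfrow = c(2, 2))        # our own 2 x 2 layout
for (i in seq_along(comps)) {
  plot(comps[[i]]$trend, main = paste("Trend, series", i), ylab = "trend")
}
par(old)
invisible(dev.off())
```

The same loop works for the $seasonal and $random components; each component is an ordinary ts object, so it obeys whatever layout is currently set.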



Re: [R] Still confused about classes

2011-04-29 Thread Russ Abbott
Thanks to all. I didn't know about either methods() or the package
lubridate, which seems like a very nice Date package.

-- Russ



On Fri, Apr 29, 2011 at 1:35 AM, Kenn Konstabel lebats...@gmail.com wrote:

 The function for getting the year from date  is there in package
 lubridate (as well as many other convenient functions to work with
 dates).

 More generally, finding all methods for a given class may be a
 little tricky. If "all" means everything you have installed and
 currently attached to your search path, then methods(class="Date") will
 do it (for S3 classes). (But "The functions listed are those which
 _are named like methods_ and may not actually be methods (known
 exceptions are discarded in the code).") The result depends on which
 packages you have loaded: in my currently open R session,
 methods(class="Date") lists 36 possible methods but after library(zoo) I
 get two more (as.yearmon.Date and as.yearqtr.Date).

 Regards,
 Kenn
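
For readers who want the year without an extra package, a short base-R sketch follows. The fixed date is an assumption used to keep the result reproducible; Sys.Date() works the same way.

```r
d <- as.Date("2011-04-29")

# Two base-R ways to extract the year from a Date
yr_format  <- as.integer(format(d, "%Y"))
yr_posixlt <- as.POSIXlt(d)$year + 1900   # POSIXlt stores years since 1900

# S3 methods whose names end in ".Date" in the currently loaded packages
date_methods <- methods(class = "Date")
```

Printing date_methods shows entries such as format.Date; the exact list depends on which packages are attached.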


 On Fri, Apr 29, 2011 at 9:05 AM, Russ Abbott russ.abb...@gmail.com
 wrote:
  Hi,
 
  I'm still confused about how to find out what methods are defined for a
  given class.  For example, I know that
 
  today <- Sys.Date()
 
  will produce an object of type Date. But I'm not sure what I can do with
  Date objects or how I can find out.
 
  ?Date
 
 
  refers me to the Date documentation page. But it doesn't tell me how, for
  example, to extract the current year from a date object.
 
  I tried
 
  > year(today)
  Error: could not find function "year"
 
 
  Is there some other function that does the job? I want a function f such
  that f(today) will return 2011. Perhaps there is no such function.
  But in general I don't have any confidence that I would know how to find
  it if it existed or that I would know how to assure myself that there was
  no such function.
 
  Thanks.
 
  -- Russ
 
 




[R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Tal Galili
Hello all,
I wish to use read.csv to read a google doc spreadsheet.

I try using the following code:

data_url <- "
http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
"

read.csv(data_url)

Which results in the following error:

Error in file(file, "rt") : cannot open the connection


I'm on windows 7.  And the code was tried on R 2.12 and 2.13

I remember trying this a few months ago and it worked fine.
Any suggestion what might be causing this or how to solve it?


Thanks.



Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--



Re: [R] replace non numeric with NA

2011-04-29 Thread Nandini B

Thanks a lot Jim, this is perfect!!

Thank you,
Nandini Badarinarayan




 Date: Fri, 29 Apr 2011 09:49:26 -0400
 From: jmac...@med.umich.edu
 To: nandini...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] replace non numeric with NA
 
 Hi Nandini,
 
 On 4/29/2011 6:45 AM, Nandini B wrote:
 
Hello,
  I have a sample data frame which looks like this
    day      od month
  1   1     0.1     2
  2   3 #VALUE!     1
  3   5     0.4    12
  4   7     0.8    10
  5  11       -     3
  6  14       s     7
  7  18      --    12
  8  27      19     7
 
 
 > x <- data.frame(day=1:8, od =
 + c("0.1","#VALUE!","0.4","0.8","-","s","--","19"), month = c(2,1,12,10,3,7,12,7))
 > x
   day      od month
 1   1     0.1     2
 2   2 #VALUE!     1
 3   3     0.4    12
 4   4     0.8    10
 5   5       -     3
 6   6       s     7
 7   7      --    12
 8   8      19     7
 > x$od <- as.numeric(as.character(x$od))
 Warning message:
 NAs introduced by coercion
 > x
   day   od month
 1   1  0.1     2
 2   2   NA     1
 3   3  0.4    12
 4   4  0.8    10
 5   5   NA     3
 6   6   NA     7
 7   7   NA    12
 8   8 19.0     7
 
 
 Best,
 
 Jim
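
A compact restatement of the conversion above as a runnable sketch. The vector is the example data from this thread; wrapping the call in suppressWarnings() is an addition here, and is only appropriate when the coercion to NA is intended.

```r
od <- c("0.1", "#VALUE!", "0.4", "0.8", "-", "s", "--", "19")

# Anything that cannot be parsed as a number becomes NA;
# as.character() first guards against 'od' being a factor.
od_num <- suppressWarnings(as.numeric(as.character(od)))
```

The as.character() step matters when the column was read in as a factor: as.numeric() on a factor returns the internal level codes, not the values.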
 
 
 
  Now i wish to filter all the non numeric values and replace it with NA. 
  The data frame is actually huge and the non numeric characters vary from 
  - to a string to absolutely anything!!!
  Can anyone please help ?
 
 
 
 
  Thank you,
  Warm Regards,
 
  Nandini
 
 
  
 
 -- 
 James W. MacDonald, M.S.
 Biostatistician
 Douglas Lab
 University of Michigan
 Department of Human Genetics
 5912 Buhl
 1241 E. Catherine St.
 Ann Arbor MI 48109-5618
 734-615-7826
 **
 Electronic Mail is not secure, may not be read every day, and should not be 
 used for urgent or sensitive issues 
 
  


Re: [R] setting options only inside functions

2011-04-29 Thread luke-tierney

The Python solution does not extend, at least not cleanly, to things
like dev on/ dev off or to Hadley's locale example.  In any case if I
am reading the Python source correctly on how they handle user
interrupts this solution has the same non-robustness to user interrupts
issue that Bill's initial solution had.

As a basis I believe what we need is a mechanism that handles a
setup, an action, and a cleanup, with setup and cleanup occurring with
interrupts disabled and the action with interrupts enabled. Scheme's
dynamic wind is similar, though I don't believe the scheme standard
addresses interrupts and we don't need to worry about continuations,
but some of the issues are similar.  Probably we would want two
flavors, one in which the action has to be a function that takes as a
single argument the result produced by the setup code, and one in
which the action can be an argument expression that is then evaluated
at the appropriate place by lazy evaluation.

This can be done at the R level except for the controlling of
interrupts (and possibly other asynchronous stuff)-- that would need a
new pair of primitives (suspendInterrupts/enableInterrupts or something
like that).  There is something in the Haskell literature on this that
I have looked at a while back -- probably time to have another look.
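
A rough R-level sketch of the first flavor described above, in which the action is a function that receives the result of the setup code. The name with_connection and its argument names are hypothetical. It relies on on.exit() for cleanup and does not attempt the interrupt suspension discussed above, so it would share the same non-robustness to user interrupts during setup and cleanup.

```r
# Hypothetical helper: run action(con), then always close the connection.
with_connection <- function(con, action) {
  on.exit(close(con))      # cleanup runs even if action() signals an error
  action(con)
}

tmp <- tempfile()
writeLines(c("first line", "second line"), tmp)

# The action refers to the connection through its argument
first <- with_connection(file(tmp, "r"), function(con) readLines(con, n = 1))

# The connection is also closed when the action fails
res <- tryCatch(
  with_connection(file(tmp, "r"), function(con) stop("boom")),
  error = function(e) conditionMessage(e)
)
unlink(tmp)
```

Passing the connection as a function argument sidesteps the problem of having no name for it inside a bare expression block.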


On Thu, 28 Apr 2011, Jonathan Daily wrote:


I would also love to see this implemented in R, as my current solution
to the issue of doing tons of open/close, dev/dev.off, etc. is to use
snippets in my IDE, and in the end I feel like it is a hack job. A
pythonic "with" function would also solve most of the situations where
I have had to use awkward try or tryCatch calls. I would be willing to
help with this project, even if it is just testing.

On Wed, Apr 27, 2011 at 5:43 PM, Barry Rowlingson
b.rowling...@lancaster.ac.uk wrote:

but it's a little clumsy, because

with_connection(file("myfile.txt"), {do stuff...})

isn't very useful because you have no way to reference the connection
that you're using. Ruby's blocks have arguments which would require
big changes to R's syntax.  One option would be to use pronouns:


 Looking very much like python 'with' statements:

http://effbot.org/zone/python-with-statement.htm

 Implemented via the 'with' statement which can operate on anything
that has a __enter__ and an __exit__ method. Very neat.

Barry









--
Luke Tierney
Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:  l...@stat.uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu


Re: [R] threshold matrix

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 10:44 AM, Alaios wrote:


Thanks a lot.
I finally used

M2 <- M
M2[M < thresh] <- 0
M2[M >= thresh] <- 1

as I noticed that this one line

M2 <- as.numeric( M[] >= thresh )
vectorizes my matrix.

One more question I have two matrices that only differ slightly.  
What will be the easiest way to compare and find the cells that are  
not the same?


M[!M==N]
N[!M==N]




Best Regards
Alex

--- On Fri, 4/29/11, David Winsemius dwinsem...@comcast.net wrote:


From: David Winsemius dwinsem...@comcast.net
Subject: Re: [R] threshold matrix
To: Alaios ala...@yahoo.com
Cc: R-help@r-project.org
Date: Friday, April 29, 2011, 2:57 PM

On Apr 29, 2011, at 9:37 AM, Alaios wrote:


Dear all,
I have a quite big matrix which I would like to

threshold.

If the value is below threshold the cell should be

zero

and
if the value is over threshold the cell should be one


M2 <- M
M2[M < thresh] <- 0
M2[M >= thresh] <- 1

or perhaps simply:

M2 <- as.numeric( M[] >= thresh )


One really simple way to do that is to have a nested

loop and check cell by cell.


The problem is that this seems to be really time

consuming and inefficient.


What do you suggest me to try out?


--
David Winsemius, MD
West Hartford, CT




David Winsemius, MD
West Hartford, CT



Re: [R] setting options only inside functions

2011-04-29 Thread Jonathan Daily
In Python, opening a connection using "with" allows for a temporary
assignment using "as". So:

with file("/path/to/file") as con:
    permanent_object = function(con)

would provide the return of function(con) globally, but close con. If
function(con) causes an error, con is still closed.

I agree with your description of what the function would need to do.
Would it make sense to make it generic and define default methods for
different setups? e.g. Using the current with/within when it is a
data.frame/environment, evaluating it when it is a function, etc.

On Fri, Apr 29, 2011 at 12:34 PM,  luke-tier...@uiowa.edu wrote:
 The Python solution does not extend, at least not cleanly, to things
 like dev on/ dev off or to Hadley's locale example.  In any case if I
 am reading the Python source correctly on how they handle user
 interrupts this solution has the same non-robustness to user interrupts
 issue that Bill's initial solution had.

 As a basis I believe what we need is a mechanism that handles a
 setup, an action, and a cleanup, with setup and cleanup occurring with
 interrupts disabled and the action with interrupts enabled. Scheme's
 dynamic wind is similar, though I don't believe the scheme standard
 addresses interrupts and we don't need to worry about continuations,
 but some of the issues are similar.  Probably we would want two
 flavors, one in which the action has to be a function that takes as a
 single argument the result produced by the setup code, and one in
 which the action can be an argument expression that is then evaluated
 at the appropriate place by lazy evaluation.

 This can be done at the R level except for the controlling of
 interrupts (and possibly other asynchronous stuff)-- that would need a
 new pair of primitives (suspendInterrupts/enableInterrupts or something
 like that).  There is something in the Haskell literature on this that
 I have looked at a while back -- probably time to have another look.


 On Thu, 28 Apr 2011, Jonathan Daily wrote:

 I would also love to see this implemented in R, as my current solution
 to the issue of doing tons of open/close, dev/dev.off, etc. is to use
 snippets in my IDE, and in the end I feel like it is a hack job. A
 pythonic "with" function would also solve most of the situations where
 I have had to use awkward try or tryCatch calls. I would be willing to
 help with this project, even if it is just testing.

 On Wed, Apr 27, 2011 at 5:43 PM, Barry Rowlingson
 b.rowling...@lancaster.ac.uk wrote:

 but it's a little clumsy, because

 with_connection(file("myfile.txt"), {do stuff...})

 isn't very useful because you have no way to reference the connection
 that you're using. Ruby's blocks have arguments which would require
 big changes to R's syntax.  One option would be to use pronouns:

  Looking very much like python 'with' statements:

 http://effbot.org/zone/python-with-statement.htm

  Implemented via the 'with' statement which can operate on anything
 that has a __enter__ and an __exit__ method. Very neat.

 Barry







 --
 Luke Tierney
 Statistics and Actuarial Science
 Ralph E. Wareham Professor of Mathematical Sciences
 University of Iowa                  Phone:             319-335-3386
 Department of Statistics and        Fax:               319-335-3017
   Actuarial Science
 241 Schaeffer Hall                  email:      l...@stat.uiowa.edu
 Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu



-- 
===
Jon Daily
Technician
===
#!/usr/bin/env outside
# It's great, trust me.



Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

2011-04-29 Thread Mike Miller

On Fri, 29 Apr 2011, Giovanni Petris wrote:

Well, but the original poster also refers to 0.2 and 0.8 as expected 
min and max, in which case we are back to a joke...


Well, he is a lot better with English than I am with Mandarin.  He seemed 
to like the truncated normal answers, so we'll let those be his answers.
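
For reference, a minimal sketch of one common way to draw from a truncated normal, using the inverse-CDF method rather than the filter-after-sampling approach shown earlier in the thread. The function name rtruncnorm_sketch is hypothetical, and the parameters simply mirror the question.

```r
# Hypothetical helper: sample a normal(mean, sd) restricted to [lower, upper]
rtruncnorm_sketch <- function(n, mean = 1, sd = 1, lower = 0.2, upper = 0.8) {
  p_lo <- pnorm(lower, mean, sd)
  p_hi <- pnorm(upper, mean, sd)
  # Uniform draws on the CDF scale, mapped back through the quantile function
  qnorm(runif(n, p_lo, p_hi), mean, sd)
}

set.seed(567)
x <- rtruncnorm_sketch(500)
range(x)
```

Every draw is kept, so exactly n values come back. Note that the mean of the truncated sample lies between 0.2 and 0.8; it cannot equal the untruncated mean of 1.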


It is possible to choose parameters for a normal distribution with 500 
observations such that the expected value of the maximum is .8 and the 
expected value of the minimum is .2.  Obviously, the mean would be .5, not 
1, but what would the variance then have to be to provide the correct 
expected max and min?  That's another legitimate question.
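
One way to approach that question numerically, as a sketch under assumptions: by symmetry the mean must be 0.5, and for a normal sample E[max] = mu + sigma * E[max of standard normals], so sigma follows from a Monte Carlo estimate of the expected maximum of 500 standard normal draws. The number of replications is an arbitrary choice.

```r
set.seed(42)
n_obs <- 500
n_rep <- 2000

# Monte Carlo estimate of E[max] for n_obs standard normal draws
e_max_std <- mean(replicate(n_rep, max(rnorm(n_obs))))

# Solve mu + sigma * e_max_std = 0.8 and mu - sigma * e_max_std = 0.2
mu    <- (0.8 + 0.2) / 2
sigma <- (0.8 - 0.2) / (2 * e_max_std)
c(mu = mu, sigma = sigma)
```

The resulting standard deviation comes out near 0.1, so the implied variance is roughly 0.01; more replications tighten the estimate.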


Mike



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Mao Jianfeng
Sent: Thursday, April 28, 2011 12:02 PM
To: r-help@r-project.org
Subject: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

Dear all,

This is a simple probability problem. I want to know how to generate
a normal distribution with mean=1, min=0.2 and max=0.8.


I know how to generate a normal distribution with mean = 1 and sd = 1
and with 500 data points.


rnorm(n=500, m=1, sd=1)

But I am confused about how to generate a normal distribution with an
expected min and max. I look forward to your directions.


Thanks in advance.

Best,
Jian-Feng,




Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:


Hello all,
I wish to use read.csv to read a google doc spreadsheet.

I try using the following code:

data_url <- "
http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
"

read.csv(data_url)

Which results in the following error:

Error in file(file, "rt") : cannot open the connection


I'm on windows 7.  And the code was tried on R 2.12 and 2.13

I remember trying this a few months ago and it worked fine.


I am always amused at such claims. Occasionally they are correct, but  
more often a crucial step has been omitted. In this case you have at a  
minimum embedded line-feeds in your URL string and have not  
established a connection, so it could not possibly have succeeded as  
presented.


But now it's time to admit I do not know why it is not succeeding when  
I correct those flaws.


> closeAllConnections()
> data_url <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv")

> read.csv(data_url)
Error in open.connection(file, "rt") : cannot open the connection

> closeAllConnections()
> dd <- read.csv(con <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"))

Error in open.connection(file, "rt") : cannot open the connection


So, I guess I'm not reading the help pages for `url` and `read.csv` as  
well as I thought I was.




Any suggestion what might be causing this or how to solve it?



--
David Winsemius, MD
West Hartford, CT



Re: [R] sub-matrix block size

2011-04-29 Thread Santosh
Dear Rxperts,

Can Jordan decomposition of submatrices be useful to determine the size of
sub-blocks? See http://en.wikipedia.org/wiki/Jordan_normal_form

Thanks for the ideas/suggestions.
I have another similar situation, where at least one of the off diagonal
elements of the lower triangle submatrices (as mentioned in the previous
example) may be zero.. and based on the visual inspection, the block size of
those square submatrices should be the same as in the previous example. How
do I resolve this one?

m1 <- structure(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), .Dim = c(11L,
11L))
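
One possible way to handle zeros inside a block, sketched under the assumption that the matrix is block-diagonal with every non-zero entry on or below the diagonal: walk through the columns, track the furthest row reached by any non-zero entry so far, and close a block wherever that running maximum equals the current column index. The function name `block_sizes_lower` and the small `toy` matrix are made up for illustration; this is not a method proposed in the thread.

```r
# Hypothetical helper: sizes of the diagonal blocks of a lower-triangular,
# block-diagonal matrix, tolerating zeros inside a block.
block_sizes_lower <- function(m) {
  n <- nrow(m)
  # For each column, the last row with a non-zero entry (at least the diagonal).
  reach <- vapply(seq_len(n), function(j) max(j, which(m[, j] != 0)), integer(1))
  # A block ends at column j when nothing seen so far reaches below row j.
  ends <- which(cummax(reach) == seq_len(n))
  diff(c(0L, ends))
}

# Toy example: a 3 x 3 block whose [3, 1] entry is zero,
# followed by a 1 x 1 block with a zero on the diagonal.
toy <- matrix(0, 4, 4)
toy[cbind(c(1, 2, 2, 3, 3), c(1, 1, 2, 2, 3))] <- 1
block_sizes_lower(toy)
```

Because the rule only needs a chain of non-zero entries linking the rows of a block, a single missing off-diagonal element does not split the block. It would give a different answer from visual inspection if a block had no such chain at all.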

Also, in the vector below is there a simple way to separate out contiguous
blocks (for identification purposes)? Please see the inserted 0 in the
vector below to identify the next block  ...

 > rowSums(m) + colSums(m) - 1
  [1]  2  2  1 -1  3  3  3  0  3  3  3 -1
 # The elements in this vector are the TRUE sizes of the submatrices (a zero
 # is inserted to separate contiguous blocks of the same size).
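
A minimal sketch of one way to split such a vector into blocks without inserting zeros by hand, assuming every positive value equals the true size of its block (so a run of six 3's must be two 3 x 3 blocks) and that -1 marks a 1 x 1 block with a zero diagonal. The variable names are illustrative only.

```r
# Sizes vector from the thread (without the manually inserted 0).
v <- c(2, 2, 1, -1, 3, 3, 3, 3, 3, 3, -1)

r <- rle(v)
# A value of -1 marks a 1 x 1 block with a zero diagonal, so it counts as size 1.
size <- ifelse(r$values > 0, r$values, 1)
# A run of length L made of blocks of size k contains L / k blocks.
sizes <- rep(size, times = r$lengths / size)
# Block label for every row/column index.
block_id <- rep(seq_along(sizes), times = sizes)

sizes
block_id
```

This relies on the values being exact block sizes, so it would not apply as-is to a matrix such as `m1`, where zeros inside a block change the row and column sums.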

Regards,
Santosh

On Wed, Apr 27, 2011 at 6:41 AM, Santosh santosh2...@gmail.com wrote:

 Thanks, David! That is another interesting perspective to (sub/super)
 diagonal story! For now I was looking only at block sizes of lower triangle
 submatrices as Dennis suggested.

 Regards,
 Santosh


 On Wed, Apr 27, 2011 at 5:57 AM, David Winsemius dwinsem...@comcast.net wrote:


 On Apr 27, 2011, at 12:07 AM, Dennis Murphy wrote:

  Hi:

 Maybe this can help get you started. Reading your data into a matrix m,

 m <- structure(c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0,
 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0,
 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), .Dim =
 c(11L,
 11L))

 rowSums(m) + colSums(m) - 1
 [1]  2  2  1 -1  3  3  3  3  3  3 -1

 The pair of 2's = a 2 x 2 block, 1 = a 1 x 1 matrix with value 1, -1
 = a 1 x 1 matrix with entry 0, a triplet of 3's = a 3 x 3 subblock,
 etc. You should be able to figure out the rows and columns for each
 submatrix from the indices of the vector above; the values provide an
 indication of matrix size as well as position.


 If we are in the stage of providing potentially useful but incomplete
 ideas, this would be my notion. Use the row and col functions with [ to
 locate non-zero elements in the diagonal and subdiagonal:

 Diagonal:  (My matrix was named `mm`)
  mm[row(mm)==col(mm)]
  [1] 1 1 1 0 1 1 1 1 1 1 0
 First subdiagonal:
  mm[row(mm)==col(mm)+1]

  [1] 0 0 0 0 0 0 0 0 0 0
 First superdiagonal:
  mm[row(mm)==col(mm)-1]
  [1] 1 0 0 0 1 1 0 1 1 0

 Perhaps a combination of the two? It seems as though the rowSums/colSums
 approach might be insensitive to whether triangular blocks were sub or super
 diagonal:

  rowSums(mm) + colSums(mm) - 1

  [1]  2  2  1 -1  3  3  3  3  3  3 -1
  mm[1,2] <- 0
  mm[2,1] <- 1
  rowSums(mm) + colSums(mm) - 1

  [1]  2  2  1 -1  3  3  3  3  3  3 -1

  HTH,
 Dennis



 On Tue, Apr 26, 2011 at 5:13 PM, Santosh santosh2...@gmail.com wrote:

 Dear Rxperts

 Below is a small vector of values of zeros and non-zeros... was
 wondering if
 there is an efficient way to get the block sizes of submatrices of a big
 matrix similar to the one shown below? diagonal elements can be zero
 too.
 Rows with only a diagonal element may be considered as a unit block
 size.

 c(1,0,0,0,0,0,0,0,0,0,0,
  1,1,0,0,0,0,0,0,0,0,0,
  0,0,1,0,0,0,0,0,0,0,0,
  0,0,0,0,0,0,0,0,0,0,0,
  0,0,0,0,1,0,0,0,0,0,0,
  0,0,0,0,1,1,0,0,0,0,0,
  0,0,0,0,1,1,1,0,0,0,0,
  0,0,0,0,0,0,0,1,0,0,0,
  0,0,0,0,0,0,0,1,1,0,0,
  0,0,0,0,0,0,0,1,1,1,0,
  0,0,0,0,0,0,0,0,0,0,0)

 Thanks much!
 Santosh



 David Winsemius, MD
 West Hartford, CT





Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 1:29 PM, Mike Miller wrote:


On Fri, 29 Apr 2011, Giovanni Petris wrote:

Well, but the original poster also refers to 0.2 and 0.8 as  
expected min and max, in which case we are back to a joke...


Well, he is a lot better with English than I am with Mandarin.  He  
seemed to like the truncated normal answers, so we'll let those be  
his answers.
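
For readers who have not seen the truncated normal answers referred to here, a minimal sketch of one standard approach (inverse-CDF sampling) follows. It is not necessarily what was posted earlier in the thread, and the function name `rtruncnorm_sketch` is made up for illustration.

```r
# Draw n values from a normal(mean, sd) restricted to [lower, upper]
# by sampling uniformly on the matching probability interval and
# mapping back through the normal quantile function.
rtruncnorm_sketch <- function(n, mean, sd, lower, upper) {
  p <- runif(n, pnorm(lower, mean, sd), pnorm(upper, mean, sd))
  qnorm(p, mean, sd)
}

set.seed(1)
x <- rtruncnorm_sketch(500, mean = 1, sd = 1, lower = 0.2, upper = 0.8)
range(x)
mean(x)
```

Note that `mean = 1` is only the parameter of the underlying normal; since every draw lies between 0.2 and 0.8, the sample mean necessarily falls inside that interval as well.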


It is possible to choose parameters for a normal distribution with  
500 observations such that the expected value of the maximum is .8  
and the expected value of the minimum is .2.  Obviously, the mean  
would be .5, not 1, but what would the variance then have to be to  
provide the correct expected max and min?  That's another legitimate  
question.


You would need to specify an N since the expected first and last order  
statistic would decrease/increase with increasing N.


--
David.



Mike



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org 
] On Behalf Of Mao Jianfeng

Sent: Thursday, April 28, 2011 12:02 PM
To: r-help@r-project.org
Subject: [R] how to generate a normal distribution with mean=1,  
min=0.2, max=0.8


Dear all,

This is a simple probability problem. I want to know how to  
generate a normal distribution with mean=1, min=0.2 and max=0.8.


I know how to generate a normal distribution with mean = 1 and sd  
= 1 and 500 data points:


rnorm(n=500, m=1, sd=1)

But I am confused about how to generate a normal distribution  
with an expected min and max. I look forward to your directions.


Thanks in advance.

Best,
Jian-Feng,


David Winsemius, MD
West Hartford, CT



[R] The bin/R file - hardcoded paths

2011-04-29 Thread Saptarshi Guha
Hello,

I notice that e.g. /home/sguha/lib64 is hard coded into the bin/R file.
I installed R with ./configure --prefix=$HOME ...

What I need to do is ship the entire R distribution to remote nodes
and run R there. The files are shipped to ephemeral directories,
so I don't know the path ahead of time.

Setting R_HOME doesn't change things either.

So I guess one can't run R on a system unless it's been installed there?

1. I can't install R on the compute nodes using ./configure 
2. All nodes do have the same architecture
3. I would like to stick to the 'shipping' approach.


Thanks
Saptarshi



Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread William Dunlap
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of David Winsemius
 Sent: Friday, April 29, 2011 10:36 AM
 To: Tal Galili
 Cc: r-help@r-project.org
 Subject: Re: [R] read.csv fails to read a CSV file from google docs
 
 
 On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
 
  Hello all,
  I wish to use read.csv to read a google doc spreadsheet.
 
  I try using the following code:
 
  data_url - 
  
 http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enke
 y=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid
 =0output=csv
  
  read.csv(data_url)
 
  Which results in the following error:
 
  Error in file(file, rt) : cannot open the connection

With S+ I get:
 S+> download.file("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv", destfile="e:/temp/splus")
 Problem in download.file("http://spreadsheets0.google.com/spreadsheet/pu..: Could not get url:
 unsupported protocol, libcurl was built with SSL disabled, https: not supported!
and with cygwin's wget I get
 E:\temp\jnk>wget "http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"
 --2011-04-29 11:00:10--  http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
 Resolving spreadsheets0.google.com... 74.125.224.73, 74.125.224.71, 74.125.224.64, ...
 Connecting to spreadsheets0.google.com|74.125.224.73|:80... connected.
 HTTP request sent, awaiting response... 302 Moved Temporarily
 Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv [following]
 --2011-04-29 11:00:11--  https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
 Connecting to spreadsheets0.google.com|74.125.224.73|:443... connected.
 ERROR: cannot verify spreadsheets0.google.com's certificate, issued by `/C=US/O=Google Inc/CN=Google Internet Authority':
   Unable to locally verify the issuer's authority.
 To connect to spreadsheets0.google.com insecurely, use `--no-check-certificate'.
 Unable to establish SSL connection.

so I suspect that the SSL/certificate business may also be the problem
when using R to get the document.  The R error message is not very
illuminating.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 

 
 
  I'm on windows 7.  And the code was tried on R 2.12 and 2.13
 
  I remember trying this a few months ago and it worked fine.
 
 I am always amused at such claims. Occasionally they are 
 correct, but  
 more often a crucial step has been omitted. In this case you 
 have at a  
 minimum embedded line-feeds in your URL string and have not  
 established a connection, so it could not possibly have succeeded as  
 presented.
 
 But now it's time to admit I do not know why it is not 
 succeeding when  
 I correct those flaws.
 
   closeAllConnections()
   data_url - 
 url(http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=
 enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=tru
 egid=0output=csv 
 )
   read.csv(data_url)
 Error in open.connection(file, rt) : cannot open the connection
 
   closeAllConnections()
   dd - read.csv(con -  
 url(http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=
 enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=tru
 egid=0output=csv 
 ))
 Error in open.connection(file, rt) : cannot open the connection
 
 
 So, I guess I'm not reading the help pages for `url` and 
 `read.csv` as  
 well I thought I was.
 
 
  Any suggestion what might be causing this or how to solve it?
 
 
 -- 
 David Winsemius, MD
 West Hartford, CT
 


Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Duncan Temple Lang

Thanks David for fixing the early issues.

The reason for the failure is that the response
from the Web server is to redirect the requester
to another page, specifically

 https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv

Note that this is https, not http, and the built-in URL reading facilities in R 
don't support https.


One way to see this is to look at the headers in your browser (e.g. Live 
HTTP Headers), or to use curl, or the RCurl package:

library(RCurl)
tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
  hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
  single = "true", gid = "0",
  output = "csv",
  .opts = list(followlocation = TRUE, verbose = TRUE))


The verbose option shows the entire dialog, and tt contains the
text of the CSV document.

 read.csv(textConnection(tt))

then yields the data frame.

  D.


On 4/29/11 10:36 AM, David Winsemius wrote:
 
 On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
 
 Hello all,
 I wish to use read.csv to read a google doc spreadsheet.

 I try using the following code:

 data_url - 
 http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv

 
 read.csv(data_url)

 Which results in the following error:

 Error in file(file, rt) : cannot open the connection


 I'm on windows 7.  And the code was tried on R 2.12 and 2.13

 I remember trying this a few months ago and it worked fine.
 
 I am always amused at such claims. Occasionally they are correct, but more 
 often a crucial step has been omitted. In
 this case you have at a minimum embedded line-feeds in your URL string and 
 have not established a connection, so it
 could not possibly have succeeded as presented.
 
 But now it's time to admit I do not know why it is not succeeding when I 
 correct those flaws.
 
 closeAllConnections()
 data_url -
 url(http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv;)
 
 read.csv(data_url)
 Error in open.connection(file, rt) : cannot open the connection
 
 closeAllConnections()
 dd - read.csv(con - 
 url(http://spreadsheets0.google.com/spreadsheet/pub?hl=enhl=enkey=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REEsingle=truegid=0output=csv;))
 
 Error in open.connection(file, rt) : cannot open the connection
 
 
 So, I guess I'm not reading the help pages for `url` and `read.csv` as well I 
 thought I was.
 
 
 Any suggestion what might be causing this or how to solve it?
 




Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Philipp Pagel
On Fri, Apr 29, 2011 at 06:19:24PM +0300, Tal Galili wrote:

 data_url <- "
 http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
 "
 read.csv(data_url)
 Error in file(file, "rt") : cannot open the connection

I get the same error (R 2.11.1, Debian LINUX) and don't have a
solution. But I did some tests and found the origin of the problem

I can download the file from google with wget but get some interesting
information in the process:


$ wget -v 'http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv'
--2011-04-29 20:07:40--  http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Resolving spreadsheets0.google.com... 209.85.148.139, 209.85.148.113, 209.85.148.138, ...
Connecting to spreadsheets0.google.com|209.85.148.139|:80... connected.
HTTP request sent, awaiting response... 302 Moved Temporarily
Location: https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv [following]
--2011-04-29 20:07:41--  https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
Connecting to spreadsheets0.google.com|209.85.148.139|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: “pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1”

[ <=>   ] 41  --.-K/s   in 0s

2011-04-29 20:07:42 (342 KB/s) - “pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv.1” saved [41]


The message that caught my attention was the http redirection: 302 Moved
Temporarily.

If you try again with the new url you get this:

> read.csv(url("https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&g"))
Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") : unsupported URL scheme

?url told me: "Note that ‘https://’ connections are not supported."
Case closed, problem unsolved...

Dirty workaround: use system() and wget or whatever command is available on
Windows for this.
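
A rough sketch of that workaround, using `download.file()` with `method = "wget"` rather than a raw `system()` call. It assumes a wget binary is installed and on the PATH, and that wget can verify the server certificate (the transcript from the Windows/cygwin machine earlier in the thread suggests this is not always the case). The helper name `read_csv_via_wget` is made up for illustration.

```r
# Hypothetical helper: fetch a URL with an external wget binary, then parse
# the saved file. Assumes wget is installed and on the PATH.
read_csv_via_wget <- function(u, ...) {
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  # method = "wget" hands the transfer to wget, which follows the redirect.
  status <- download.file(u, destfile = tmp, method = "wget", quiet = TRUE)
  if (status != 0) stop("download failed with status ", status)
  read.csv(tmp, ...)
}

# Usage (needs network access, so it is left commented out;
# data_url is assumed to hold the spreadsheet URL as a single-line string):
# dd <- read_csv_via_wget(data_url)
```

Passing the URL as an R string avoids shell quoting problems with the `&` characters that a hand-built `system()` command line would have to deal with.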

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/



[R] mlogit package, Error in X[omitlines, ] - NA : subscript out of bounds

2011-04-29 Thread Yong Wang
I am using the mlogit package and have run into a data problem for which I
can't find any clue in the R archives.

The code below shows the relevant steps, all the way to the error:

#---
mydata <- data.frame(dependent,x,y,z)

mydata$dependent <- as.factor(mydata$dependent)

mldata <- mlogit.data(mydata, varying=NULL, choice="dependent", shape="wide")

summary(mlogit.1 <- mlogit(dependent~1|x+y+z, data = mldata, reflevel=0))

Error in X[omitlines, ] <- NA : subscript out of bounds
#---

Could anybody kindly tell me how I can solve this problem?

Thank you

yong



[R] logistic regression with glm: cooks distance and dfbetas are different compared to SPSS output

2011-04-29 Thread Biedermann, Jürgen

Hi there,

I have the problem, that I'm not able to reproduce the SPSS residual 
statistics (dfbeta and cook's distance) with a simple binary logistic 
regression model obtained in R via the glm-function.


I tried the following:

fit <- glm(y ~ x1 + x2 + x3, data, family=binomial)

cooks.distance(fit)
dfbetas(fit)

When I compare the returned values with the values that I get in SPSS, 
they are different, although the same model is fitted (the 
coefficients are the same, etc.).


It seems that SPSS uses different formulas for cooks.distance 
and dfbetas than R does.


Unfortunately I couldn't find out what the difference in the 
calculation is, or how I could get R to calculate the same statistics 
that SPSS uses.

Or is this an unknown SPSS bug?
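
Without the SPSS documentation at hand, one thing that can be checked is exactly which definitions R uses, so they can be compared term by term with the SPSS formulas. The sketch below uses simulated data (not the poster's) to reproduce `cooks.distance()` for a glm by hand and to show that `dfbeta()` and `dfbetas()` are two different quantities. Whether SPSS reports the scaled or the unscaled version, or divides by the number of parameters, is left open here.

```r
# Simulated data purely for illustration.
set.seed(42)
dat <- data.frame(x1 = rnorm(100), x2 = rnorm(100), x3 = rnorm(100))
dat$y <- rbinom(100, 1, plogis(0.5 * dat$x1 - 0.3 * dat$x2))

fit <- glm(y ~ x1 + x2 + x3, data = dat, family = binomial)

# Cook's distance as R computes it for a glm: squared Pearson residual,
# leverage-adjusted, divided by dispersion (1 for binomial) times the rank p.
pr <- residuals(fit, type = "pearson")
h  <- hatvalues(fit)
p  <- fit$rank
cook_manual <- (pr / (1 - h))^2 * h / (1 * p)
all.equal(unname(cook_manual), unname(cooks.distance(fit)))

# dfbeta() gives raw coefficient changes; dfbetas() scales them by a
# standard error, so the two differ by construction.
head(dfbeta(fit))
head(dfbetas(fit))
```

If the other program's values match `cook_manual * p` or `dfbeta(fit)` instead, the discrepancy is a matter of scaling conventions rather than a bug in either program.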

Greetings
Jürgen



Re: [R] replace non numeric with NA

2011-04-29 Thread Nandini B

Thanks a lot Duncan, this is what I was looking for!!
Thank you,
Nandini




 Date: Fri, 29 Apr 2011 09:53:06 -0400
 From: murdoch.dun...@gmail.com
 To: nandini...@hotmail.com
 CC: r-help@r-project.org
 Subject: Re: [R] replace non numeric with NA
 
 On 29/04/2011 6:45 AM, Nandini B wrote:
Hello,
  I have a sample data frame which looks like this
 day  od   month
  1   1 0.12
  2   3 #VALUE! 1
  3   5 0.4 12
  4   7 0.8 10
  5  11   -  3
  6  14   s 7
  7  18  -- 12
  8  27  197
 
 
  Now i wish to filter all the non numeric values and replace it with NA. 
  The data frame is actually huge and the non numeric characters vary from 
  - to a string to absolutely anything!!!
  Can anyone please help ?
 
 You don't tell us the types of the columns, so I'll assume they are 
 factors.  If so, call
 
 as.numeric(as.character())
 
 on each of them to convert the number-like values to numbers, the others 
 to NA.  For example,
 
 df$day <- as.numeric(as.character(df$day))
 
 Duncan Murdoch
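
The same conversion can be applied to every column at once. The sketch below uses a small made-up data frame loosely modelled on the one in the question; the column contents are illustrative, not the poster's actual data.

```r
# Toy data frame loosely modelled on the one in the question.
df <- data.frame(day   = c("1", "3", "5", "7"),
                 od    = c("0.12", "#VALUE!", "0.4", "--"),
                 month = c("2", "1", "s", "10"),
                 stringsAsFactors = TRUE)

# Convert every column: number-like strings become numbers, the rest NA.
# suppressWarnings() hides the "NAs introduced by coercion" warning.
df[] <- lapply(df, function(col) suppressWarnings(as.numeric(as.character(col))))
df
```

Assigning to `df[]` keeps the object a data frame with its original column names while replacing the contents of each column.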
  


[R] Regression Summary for a List

2011-04-29 Thread Ryan J. McGuigan
Hi,

I am trying to run a regression on two matrices with 10 columns.  I have
been able to run the regression with the following code:

fit=list()
for(i in 1:10) {
fit[[i]]=lm(monret[,i]~janret[,i])
}

However, I can't get the regression to spit out more than the coefficients
(summary(fit) does not work).  I really need the full summary for each of
the 10 regressions, including the R-squared values.  I'm sure there's a
simple way to do this I just can't seem to figure it out.
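
One way this could be done, sketched with simulated stand-ins for `monret` and `janret` (the real matrices are not shown in the post): `summary()` works on a single `lm` object, so it has to be applied to each element of the list.

```r
# Simulated stand-ins for the poster's monret and janret matrices.
set.seed(1)
janret <- matrix(rnorm(60 * 10), ncol = 10)
monret <- 0.5 * janret + matrix(rnorm(60 * 10), ncol = 10)

fit <- lapply(1:10, function(i) lm(monret[, i] ~ janret[, i]))

# summary() works on one lm at a time, so apply it to each list element.
summaries <- lapply(fit, summary)
r2 <- sapply(summaries, function(s) s$r.squared)

summaries[[1]]   # full summary of the first regression
r2               # R-squared of all ten regressions
```

The same pattern works with the `for` loop from the question: once `fit` is filled, `lapply(fit, summary)` gives the ten summaries.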

Thanks.

-Ryan



[R] Trying to get RWeka/Snowball to work

2011-04-29 Thread Peter Holme
Hi!

I was trying to install RWeka to be able to use SnowballStemmer in a Mac OS
X 10.6.7 environment... but couldn't do it... I get error messages after:

> library(RWeka);
> install(Snowball);
> ## Test the supplied vocabulary for the default stemmer ('porter'):
> source <- readLines(system.file("words", "porter", "voc.txt",
+ package = "Snowball"))
> result <- SnowballStemmer(source)
Error in .jnew(name) :
  java.lang.InternalError: Can't start the AWT because Java was started on
the first thread.  Make sure StartOnFirstThread is not specified in your
application's Info.plist or on the command line
> target <- readLines(system.file("words", "porter", "output.txt",
+ package = "Snowball"))
> ## Any differences?
> any(result != target)
Error: object 'result' not found
Trying to add database driver (JDBC): RmiJdbc.RJDriver - Warning, not in
CLASSPATH?
Trying to add database driver (JDBC): jdbc.idbDriver - Warning, not in
CLASSPATH?
Trying to add database driver (JDBC): org.gjt.mm.mysql.Driver - Warning, not
in CLASSPATH?
Trying to add database driver (JDBC): com.mckoi.JDBCDriver - Warning, not in
CLASSPATH?
Trying to add database driver (JDBC): org.hsqldb.jdbcDriver - Warning, not
in CLASSPATH?

Well – after searching around, I decided to take the matter into my own
hands – not ideal, but it fits my small purpose for now... will possibly
expand it later..:
http://holme.se/stem/

:)

Peter
-- 
+47 920 42 782



[R] Problem with qualitative variables in anova

2011-04-29 Thread katerinaaa
Hi,
I am a newbie in R programming and I need some help.

I have two columns: the first has 1000 values Y/N/U and the other has f/m.
Like this:

q7  sex
=======
U   m
U   f
U   m
N   f

I want to do a one-way ANOVA, both parametric and non-parametric,
but I have some problems.

Code:

frameq7 <- data.frame(q7,sex)
frameq7

r <- aov(q7 ~ sex, data = frameq7)
summary(r)

I get:
Error in storage.mode(y) <- "double" :
invalid to change the storage mode of a factor
In addition: Warning message:
In model.response(mf, "numeric") :
using type="numeric" with a factor response will be ignored

Could you please help me to get it right?

And finally, how can I present this analysis? With a boxplot?
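
The error arises because `aov()` needs a numeric response, and `q7` is a factor. A common alternative for two categorical variables is a test of association on their cross-tabulation. The sketch below uses simulated answers in place of the poster's data, and the choice of a chi-squared test is an assumption about what the analysis is meant to show.

```r
# Simulated answers standing in for the poster's 1000 rows.
set.seed(7)
frameq7 <- data.frame(
  q7  = factor(sample(c("Y", "N", "U"), 1000, replace = TRUE)),
  sex = factor(sample(c("f", "m"), 1000, replace = TRUE))
)

# Cross-tabulate the two categorical variables and test for association.
tab <- table(frameq7$q7, frameq7$sex)
tab
res <- chisq.test(tab)
res$p.value
```

For presentation, a grouped bar chart or mosaic plot of the table (`barplot(tab, beside = TRUE)` or `mosaicplot(tab)`) is usually more informative than a boxplot, which expects a numeric variable.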

Thanks a lot 

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-qualitative-variables-in-anova-tp3483845p3483845.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Questions about lrm, validate, pentrace

2011-04-29 Thread khosoda

(11/04/29 22:09), Frank Harrell wrote:

Yes I would select that as the final model.


Thank you for your comment. I am able to be confident about my model now.

The difference you saw is caused by different treatment of penalization of
factor variables, related to the use of the sum squared differences between
the estimate at one category from the average over all categories.  I think
that as long as you code it one way consistently and pick the penalty using
that coding you are OK.  But if the coefficients of the non-factor variables
depend on how the binary predictor is coded, there is a bit more concern.


A lot of previous studies have demonstrated that poor outcome is more 
frequent in treat2 than in treat 1. So, I coded treat1 as 0, and treat2 
as 1 in the first mail. Then, I came back to the original coding of 
treat1 and treat2 in the newer mail. According to your answer, I guess I 
am OK. :-)


Prof Harrell, your book (Regression Modeling Strategies) and many kind 
comments helped me a lot. Thank you very much again.


--
KH



Frank


細田弘吉 wrote:


Thank you for your quick reply, Prof. Harrell.
According to your advice, I ran pentrace using a very wide range.

pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 100, by=0.5))
plot(pentrace.x6factor)

I attached this figure. Then,

pentrace.x6factor <- pentrace(x6factor.lrm, seq(0, 10, by=0.05))

It seems reasonable that the best penalty is 2.55.

x6factor.lrm.pen <- update(x6factor.lrm, penalty=2.55)
cbind(coef(x6factor.lrm), coef(x6factor.lrm.pen),
abs(coef(x6factor.lrm)-coef(x6factor.lrm.pen)))
                     [,1]        [,2]        [,3]
Intercept     -4.32434556 -3.86816460 0.456180958
stenosis      -0.01496757 -0.01091755 0.004050025
T1             3.04248257  2.42443034 0.618052225
T2            -0.75335619 -0.57194342 0.181412767
procedure     -1.20847252 -0.82589263 0.382579892
ClinicalScore  0.37623189  0.30524628 0.070985611

validate(x6factor.lrm, bw=F, B=200)
          index.orig training    test optimism index.corrected   n
Dxy           0.6324   0.6849  0.5955   0.0894          0.5430 200
R2            0.3668   0.4220  0.3231   0.0989          0.2679 200
Intercept     0.0000   0.0000 -0.1924   0.1924         -0.1924 200
Slope         1.0000   1.0000  0.7796   0.2204          0.7796 200
Emax          0.0000   0.0000  0.0915   0.0915          0.0915 200
D             0.2716   0.3229  0.2339   0.0890          0.1826 200
U            -0.0192  -0.0192  0.0243  -0.0436          0.0243 200
Q             0.2908   0.3422  0.2096   0.1325          0.1582 200
B             0.1272   0.1171  0.1357  -0.0186          0.1457 200
g             1.6328   1.9879  1.4940   0.4939          1.1389 200
gp            0.2367   0.2502  0.2216   0.0286          0.2080 200


validate(x6factor.lrm.pen, bw=F, B=200)
          index.orig training    test optimism index.corrected   n
Dxy           0.6375   0.6857  0.6024   0.0833          0.5542 200
R2            0.3145   0.3488  0.3267   0.0221          0.2924 200
Intercept     0.0000   0.0000  0.0882  -0.0882          0.0882 200
Slope         1.0000   1.0000  1.0923  -0.0923          1.0923 200
Emax          0.0000   0.0000  0.0340   0.0340          0.0340 200
D             0.2612   0.2571  0.2370   0.0201          0.2411 200
U            -0.0192  -0.0192 -0.0047  -0.0145         -0.0047 200
Q             0.2805   0.2763  0.2417   0.0346          0.2458 200
B             0.1292   0.1224  0.1355  -0.0132          0.1423 200
g             1.2704   1.3917  1.5019  -0.1102          1.3805 200
gp            0.2020   0.2091  0.2229  -0.0138          0.2158 200

In the penalized model (x6factor.lrm.pen), the apparent Dxy is 0.64, and
bias-corrected Dxy is 0.55. The maximum absolute error is estimated to
be 0.034, smaller than in the non-penalized model (0.0915 in x6factor.lrm). The
changes in slope and intercept are substantially reduced in the penalized
model. I think overfitting is improved at least to some extent. Should I
select this as a final model?

I have one more question. The procedure variable was defined as a 0/1
value in the previous mail. For some graphical reason, I redefined it as a
treat1/treat2 value. Then, the best penalty value changed from 3.05
to 2.55. I guess the change from numeric to factor caused this reduction
in penalty. Which setup should I select?

I appreciate your help in advance.

--
KH

(11/04/26 0:21), Frank Harrell wrote:

You've done a lot of good work on this.  Yes I would say you have moderate
overfitting with the first model.  The only thing that saved you from having
severe overfitting is that there seems to be a signal present [I am assuming
this model is truly pre-specified and was not developed at all by looking at
patterns of responses Y.]

The use of backwards stepdown demonstrated much worse overfitting.  This is
in line with what we know about the damage of stepwise selection methods
that do not incorporate shrinkage.  I would throw away the stepwise

[R] importing and filtering time series data

2011-04-29 Thread Joel Reymont
Folks,

I'm new to R and would like to use it to analyze web server performance data. 

I collect the data in this CSV format:

1304083104.41,Y,668.856249809
1304083104.41,Y,348.143193007

First column is a seconds.microseconds timestamp, rows with N instead of Y 
need to be skipped and the last column has the same format as the first column, 
except it's request duration (latency).

I would like to calculate average number of requests per second, mean latency, 
variance, 5 and 95 percentiles.

What is the best way to accomplish this, starting with importing of time series?
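
A minimal base-R sketch of one way to start, assuming the file has no header row and exactly the three columns described. The column names `ts`, `ok` and `latency` are made up, and the inline sample adds an invented N row and a third Y row to the two lines shown above.

```r
# Inline sample in the poster's format; the N row and the last Y row are made up.
csv_text <- "1304083104.41,Y,668.856249809
1304083104.41,Y,348.143193007
1304083105.10,N,12.5
1304083106.20,Y,101.0"

req <- read.csv(text = csv_text, header = FALSE,
                col.names = c("ts", "ok", "latency"),
                colClasses = c("numeric", "character", "numeric"))

ok <- req[req$ok == "Y", ]              # drop rows flagged N

span <- diff(range(ok$ts))              # observed time window in seconds
rate <- nrow(ok) / span                 # average requests per second
stats <- c(mean = mean(ok$latency),
           var  = var(ok$latency),
           quantile(ok$latency, c(0.05, 0.95)))
rate
stats
```

For a real log, `read.csv("file.csv", header = FALSE, ...)` with the same `col.names` and `colClasses` would replace the `text =` argument. Averaging over the whole observed window is only one definition of requests per second; counting per one-second bucket with `table(floor(ok$ts))` is another.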

Thanks, Joel

--
- for hire: mac osx device driver ninja, kernel extensions and usb drivers
-++---
http://wagerlabs.com | @wagerlabs | http://www.linkedin.com/in/joelreymont
-++---


Re: [R] fisher exact for 2x2 table

2011-04-29 Thread viostorm

After I shared comments from the forum yesterday with the biostatistician, he
indicated this:

Fisher's exact test is the non-parametric analog for the Chi-square 
test for 2x2 comparisons. A version (or extension) of the Fisher's Exact 
test, known as the Freeman-Halton test applies to comparisons for tables 
greater than 2x2. SAS can calculate both statistics using the following 
instructions.

  proc freq; tables a * b / fisher;

Do people here still stand by the position that Fisher's exact test can be
used for RxC contingency tables?  Sorry to bother you all so much; it is just
important for a paper I am writing and planning to submit soon. (I have a 4x2
table that does not meet the expected-frequency requirements for chi-squared.)

I guess people here have suggested that R implements the following, which
unfortunately are not easily available at my library, but the titles at
least indicate that the test is extended to RxC:

Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test
in r c contingency tables. Journal of the American Statistical Association
1983;78:427-34.
 
Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for Fisher's
exact test on unordered r x c contingency tables. ACM Transactions on
Mathematical Software 1986;12:154-61.

The only reason I ask again is he is exceptionally clear on this point.

Thanks again, 

-Rob



viostorm wrote:
 
 Thank you all very kindly for your help.
 
 -Rob
 
 
 Robert Schutt III, MD, MCS 
 Resident - Department of Internal Medicine
 University of Virginia, Charlottesville, Virginia
 



--
View this message in context: 
http://r.789695.n4.nabble.com/fisher-exact-for-2x2-table-tp3481979p3484009.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Convert filogenetic tree to binary matrix

2011-04-29 Thread vanderlei52
Hi Ben,

Thank you for your help.

I asked the same question on the r-sig-phylo mailing list. Liam Revell gave
the following solution:

temp <- prop.part(tree)
X <- matrix(0,nrow=length(tree$tip),ncol=length(temp),dimnames=list(tree$tip.label,tree$node.label))
for(i in 1:ncol(X)) X[temp[[i]],i] <- 1

Vanderlei


--
View this message in context: 
http://r.789695.n4.nabble.com/Convert-filogenetic-tree-to-binary-matrix-tp3478961p3484371.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Regression Summary for a List

2011-04-29 Thread Phil Spector

Ryan -
   summary expects an lm object, and fit is a list.  So
you need to use something like

   lapply(fit,summary)

to pass each list element to the summary function.
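
A short sketch of how this plays out, including pulling the R-squared values from each summary. The matrices below are simulated stand-ins for monret and janret, and the object names summaries and r2 are illustrative:

```r
set.seed(42)
# Simulated stand-ins for the poster's two 10-column matrices.
janret <- matrix(rnorm(200), ncol = 10)
monret <- 0.5 * janret + matrix(rnorm(200), ncol = 10)

# One regression per column, stored in a list as in the original loop.
fit <- lapply(1:10, function(i) lm(monret[, i] ~ janret[, i]))

# summary() is applied to each lm object in turn.
summaries <- lapply(fit, summary)

# Each summary.lm object carries its R-squared as a component.
r2 <- sapply(summaries, function(s) s$r.squared)
```

Printing summaries[[1]] shows the full table for the first regression; r2 holds one value per column.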

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu



On Fri, 29 Apr 2011, Ryan J. McGuigan wrote:


Hi,

I am trying to run a regression on two matrices with 10 columns.  I have
been able to run the regression with the following code:

fit=list()
for(i in 1:10) {
fit[[i]]=lm(monret[,i]~janret[,i])
}

However, I can't get the regression to spit out more than the coefficients
(summary(fit) does not work).  I really need the full summary for each of
the 10 regressions, including the R-squared values.  I'm sure there's a
simple way to do this I just can't seem to figure it out.

Thanks.

-Ryan



[R] Conditonal Rank

2011-04-29 Thread Doran, Harold
Suppose I have data such as

tmp <- data.frame(score = c(1,2,3,4, 4,3,2,1), trial = gl(2,4), Gender =
gl(2,2,8, labels=c('M', 'F')))

Now I would like to compute a rank on the variable score conditional on trial 
and gender. I could do

res <- with(tmp, tapply(score, list(Gender, trial), rank))
res[,1]
res[,2]

and then finagle a way to create a new variable in the dataframe tmp that has 
these ranks associated with the correct rows. But, perhaps there is a better 
way. Any suggestions?



Re: [R] using lme4 with three nested random effects

2011-04-29 Thread Benjamin Caldwell
Thierry,
The first suggestion worked. Thank you very much.
*Ben Caldwell*

University of California, Berkeley
137 Mulford Hall #3114
Berkeley, CA 94720
Office 223 Mulford Hall
(510)859-3358



On Fri, Apr 29, 2011 at 1:52 AM, ONKELINX, Thierry thierry.onkel...@inbo.be
 wrote:

 Dear Ben,

 Are site, transect and plot factors? And do they have unique id's?

 You could try this

 rws30.UL$site <- factor(rws30.UL$site)
 rws30.UL$transect <- interaction(rws30.UL$site, rws30.UL$transect, drop =
 TRUE)
 rws30.UL$plot <- interaction(rws30.UL$site, rws30.UL$transect,
 rws30.UL$plot, drop = TRUE)
 modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
 +(1|site/transect/plot),
  data=rws30.UL, family=gaussian, na.action=na.omit)

 Or

 modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
 +(1|site) + (1|transect) + (1|plot),
  data=rws30.UL, family=gaussian, na.action=na.omit)
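
 The reason for the interaction() calls can be seen on a toy data frame (the names d, site, transect and transect_id below are illustrative only, not the poster's data): if transects are numbered 1, 2, ... within every site, the raw labels do not identify a transect uniquely, whereas the interaction does.

```r
# Toy data: two sites, each with transects numbered 1 and 2.
d <- data.frame(site = c("A", "A", "B", "B"),
                transect = c(1, 2, 1, 2))

# The raw labels give only two levels, although there are four distinct transects.
n_raw <- nlevels(factor(d$transect))

# Combining site and transect yields one level per physical transect.
d$transect_id <- interaction(d$site, d$transect, drop = TRUE)
n_unique <- nlevels(d$transect_id)
```

 With unique ids of this kind, crossed-looking terms such as (1|site) + (1|transect) describe the same grouping as the nested specification.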

 Best regards,

 Thierry


 
 ir. Thierry Onkelinx
 Instituut voor natuur- en bosonderzoek
 team Biometrie & Kwaliteitszorg
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium

 Research Institute for Nature and Forest
 team Biometrics & Quality Assurance
 Gaverstraat 4
 9500 Geraardsbergen
 Belgium

 tel. + 32 54/436 185
 thierry.onkel...@inbo.be
 www.inbo.be

 To call in the statistician after the experiment is done may be no more
 than asking him to perform a post-mortem examination: he may be able to say
 what the experiment died of.
 ~ Sir Ronald Aylmer Fisher

 The plural of anecdote is not data.
 ~ Roger Brinner

 The combination of some data and an aching desire for an answer does not
 ensure that a reasonable answer can be extracted from a given body of data.
 ~ John Tukey


  -----Original Message-----
  From: r-help-boun...@r-project.org
  [mailto:r-help-boun...@r-project.org] On Behalf Of Benjamin Caldwell
  Sent: Friday 29 April 2011 0:37
  To: r-help
  Subject: [R] using lme4 with three nested random effects
 
  Hi all,
  I'm trying to fit models for data with three levels of nested random
  effects: site/transect/plot. For example,
 
  modelincrBS <- glmer(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num
  +(1|site/transect/plot),
  data=rws30.UL, family=gaussian, na.action=na.omit)
 
  but I get the following error:
 
  Error: length(f1) == length(f2) is not TRUE In addition:
  Warning messages:
  1: In plot:(transect:site) :
numerical expression has 92 elements: only the first used
  2: In plot:(transect:site) :
numerical expression has 92 elements: only the first used
 
  The formulation works for two nested effects (e.g. 1|site/transect)
 
  I can get it to run in lme
  modelincrBS <- lme(l.ru.ba.incr~shigo.av+pre.f.crwn.length+bark.thick.bh+Date+slope.pos.num+dens.T+dbh+leaf.area+can.pos.num,
  data=rws30.UL, random=(~1| site/transect/plot),na.action=na.omit)
 
  but I can't specify a distribution family in that package.
 
  Any help much appreciated.
 
  Ben Caldwell
 




Re: [R] Analysis and graphics by groups

2011-04-29 Thread Cristiano Yuji Sasada Sato
Hello,

This is my first post in this e-mail list and I hope it's enough to justify
calling for help. In case it's not, sorry.

I'm trying to do analysis and graphics using a factor as a criterion to split
the data, and then do the analysis/graphics for each subset of the data.

Right now what I'm trying to do is to fit and plot the following logistic
model, according to a third variable named Cerca:
dm_fit_T <- nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T)

I've found a function called gapply which seems to be what I need, but it
doesn't seem to work. This is the argument I've used:
gapply(perieph,FUN=nls(nDMTRBgm2~(K/(1+((K-nDMTRBgm2.T.1)/nDMTRBgm2.T.1)*exp(-r))),perieph,start=list(K=3,r=0.2),trace=T),groups=Cerca)

But I get this error message returned:
 Error in get(as.character(FUN), mode = "function", envir = envir) :
object 'FUN' of mode 'function' was not found

Can you help me get this non-linear regression by groups to work?

Also, once I manage to run the regression, I'd also need to fit a line to
the nDMTRBgm2~nDMTRBgm2.T.1 data using the same model above. I've used
plotfit to do that with one nlm data set. Is it possible to plot each group's
trend line and data with different colours/symbols in the same graphic?
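
The error arises because FUN must be a function, whereas the call above passes the result of evaluating nls(...). A minimal sketch of the by-group fit in base R, using split() and lapply() in place of gapply, on simulated data: the data frame contents, the true parameter values and the helpers logistic_step and fit_one are made up for illustration; only the model formula and the start values come from the post.

```r
set.seed(1)

# The model from the post, written as a function of the previous value x0.
logistic_step <- function(x0, K, r) K / (1 + ((K - x0) / x0) * exp(-r))

# Simulated stand-in for 'perieph' with two groups in 'Cerca'.
x0 <- seq(0.5, 2.5, length.out = 20)
perieph <- data.frame(Cerca = rep(c("a", "b"), each = 20),
                      nDMTRBgm2.T.1 = rep(x0, 2))
perieph$nDMTRBgm2 <- c(logistic_step(x0, K = 3, r = 0.2),
                       logistic_step(x0, K = 4, r = 0.3)) +
  rnorm(40, sd = 0.01)

# Fit the model to one subset; this is the kind of function FUN should be.
fit_one <- function(d) {
  nls(nDMTRBgm2 ~ K / (1 + ((K - nDMTRBgm2.T.1) / nDMTRBgm2.T.1) * exp(-r)),
      data = d, start = list(K = 3, r = 0.2))
}

# One nls fit per level of Cerca.
fits <- lapply(split(perieph, perieph$Cerca), fit_one)

# Matrix of estimates: rows K and r, one column per group.
est <- sapply(fits, coef)
```

For the plot, one option is to draw the points with col = as.integer(factor(perieph$Cerca)) and then add one curve per group from predict() on each element of fits.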

Thank you,
Cristiano

-- 
Cristiano Yuji Sasada Sato
Doutorando
Programa de Pós-Graduação em Ecologia e Evolução - IBRAG / UERJ
Laboratório de Ecologia de Rios e Córregos
Departamento de Ecologia - Universidade do Estado do Rio de Janeiro



[R] RCurl and postForm()

2011-04-29 Thread Elmore, Ryan
Hi everybody,

I think that I am missing something fundamental in how strings are passed from 
a postForm() call in R to the curl or libcurl functions underneath.  For 
example, I can do the following using curl from the command line:

$ curl -d "Archbishop Huxley" "http://www.datasciencetoolkit.org/text2people"
[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]

Trying the same thing, or what I *think* is the same thing (obviously not), in R
(Mac OS 10.6.7, R 2.13.0) produces:

> library(RCurl)
Loading required package: bitops
> api <- "http://www.datasciencetoolkit.org/text2people"
> postForm(api, a="Archbishop Huxley")
[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":44,\"end_index\":61,\"matched_string\":\"Archbishop Huxley\"},{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":88,\"end_index\":105,\"matched_string\":\"Archbishop Huxley\"}]"
attr(,"Content-Type")
                charset
"text/html"     "utf-8"

I can match the result given on the DSTK API's website by using system(), but
that doesn't seem like the R-like way of doing it.

> system("curl -d 'Archbishop Huxley' 'http://www.datasciencetoolkit.org/text2people'")
[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]

If you want to see some additional information related to this question, I 
posted on StackOverflow a few days ago:
http://stackoverflow.com/questions/5797688/post-request-using-rcurl

I am working on this R wrapper for the data science toolkit as a way of
illustrating how to make an R package for the Denver RUG, and ran into this
problem.  Any help with this problem will be greatly appreciated by the Denver
RUG!

Cheers,
Ryan



[R] using R in C#

2011-04-29 Thread Lodha, Akhil
Hi,

I've been able to use R in C# on my machine by following the steps at
http://www.codeproject.com/KB/cs/RtoCSharp.aspx.

This works locally, i.e. if R is running on my box. I was wondering if it's
possible to change it so that I can connect to another machine that is running
R (and has rscproxy installed).

This way a lot of people can use R in C# without having to first install it on
their boxes, as long as it's installed on another box that they can connect to.

Thanks,
Akhil



[R] Bigining with a Program of SVR

2011-04-29 Thread ypriverol
Hi:
  I'm starting research on Support Vector Regression. I want to obtain a
model to predict a property A from a set of properties B, C, D, ...  This
problem is very common, for example in QSAR models. I would like to know of
some examples and packages that could help me with this. I know about
caret and e1071, but I don't know whether these packages can work with
continuous variables.

Thanks in advance

--
View this message in context: 
http://r.789695.n4.nabble.com/Bigining-with-a-Program-of-SVR-tp3484476p3484476.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Using Java methods in R

2011-04-29 Thread Robert Baer

-- snip --

It clogs up my email, takes a long
time to delete, and is hard to be selective enough to not
delete some of my other important email.

-- snip --

If you don't care about contributing to the R listserve community, it's hard 
to imagine why that community should care about you.


Some people (not me) seem to use nabble [ http://www.nabble.com/ ] to
monitor the list.  See R under "what is cool".  Another option is to set
up rules in your email client to direct your mail to an appropriate
folder or, if you use gmail, I guess we would say to label your R listserve
email.


You can search the mail archives for a topic of interest with the R command

RSiteSearch().  To learn more type
?RSiteSearch

For fun I put "rJava rectangular arrays" into this search engine (having no
idea what that means) and one of the things that came out was:

http://finzi.psych.upenn.edu/R/library/rJava/html/jrectRef-class.html

Hopefully, this or one of the other things can be useful to you.

Finally for the third time, try joining/looking at:
stats-rosuda-devel:
  http://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel
or the archive:
 http://mailman.rz.uni-augsburg.de/pipermail/stats-rosuda-devel/

--
Robert W. Baer, Ph.D.
Professor of Physiology
Kirksville College of Osteopathic Medicine
A. T. Still University of Health Sciences
800 W. Jefferson St.
Kirksville, MO 63501
660-626-2322
FAX 660-626-2965



Re: [R] how to generate a normal distribution with mean=1, min=0.2, max=0.8

2011-04-29 Thread Mike Miller

On Fri, 29 Apr 2011, David Winsemius wrote:


On Apr 29, 2011, at 1:29 PM, Mike Miller wrote:


On Fri, 29 Apr 2011, Giovanni Petris wrote:

Well, but the original poster also refers to 0.2 and 0.8 as expected min 
and max, in which case we are back to a joke...


Well, he is a lot better with English than I am with Mandarin.  He seemed 
to like the truncated normal answers, so we'll let those be his answers.


It is possible to choose parameters for a normal distribution with 500 
observations such that the expected value of the maximum is .8 and the 
expected value of the minimum is .2.  Obviously, the mean would be .5, 
not 1, but what would the variance then have to be to provide the 
correct expected max and min?  That's another legitimate question.


You would need to specify an N since the expected first and last order 
statistic would decrease/increase with increasing N.


Right -- I chose N=500, as did the OP.  I think the order statistics for 
the normal are pretty complex, but it wouldn't be hard to use the density 
for order statistics for the uniform to compute the appropriate values for 
a standard normal, then rescale.


http://en.wikipedia.org/wiki/Order_statistic#The_order_statistics_of_the_uniform_distribution

You'd have to multiply the beta density times the inverse normal cdf and 
get the weighted average for a set of points.  It doesn't sound terribly 
difficult but I don't want to do it!  ;-)
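
A rough sketch of that calculation, assuming N = 500 as above. The maximum of N uniforms has a Beta(N, 1) density, so a weighted average of qnorm(u) with those weights over a fine grid approximates the expected maximum of N standard normals; by symmetry the expected minimum is its negative. The grid size M and the names e_max and sd_needed are choices made here for illustration:

```r
N <- 500                              # sample size used in the thread
M <- 1e6                              # number of grid points (an arbitrary choice)
u <- (seq_len(M) - 0.5) / M           # midpoints of a fine grid on (0, 1)

# Beta(N, 1) is the density of the largest of N uniforms; weighting the
# inverse normal cdf by it gives the expected maximum of N standard normals.
e_max <- weighted.mean(qnorm(u), dbeta(u, N, 1))

# Rescale: mean 0.5, and expected max 0.8 (hence expected min 0.2).
sd_needed <- (0.8 - 0.5) / e_max
```

Under these assumptions rnorm(500, mean = 0.5, sd = sd_needed) would have an expected minimum near 0.2 and an expected maximum near 0.8.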


Mike



Re: [R] Problem with qualitative variables in anova

2011-04-29 Thread David Winsemius

Are you working on the same homework problem as user494766?

http://stackoverflow.com/questions/5835605/1-way-anova-in-r-help

--  
David.


On Apr 29, 2011, at 10:43 AM, katerinaaa wrote:


Hi,
I am a newbie in R programming and I need some help.

I have two columns: the first has 1000 values Y/N/U and the other has
f/m.

Like this:

q7  sex
==
U   m
U   f
U   m
N   f

I want to do a one-way ANOVA, parametric and nonparametric,
but I have some problems.

Code:

frameq7 <- data.frame(q7,sex)
frameq7

r <- aov(q7 ~ sex, data = frameq7)
summary(r)

I get
Error in storage.mode(y) <- "double" :
invalid to change the storage mode of a factor
In addition: Warning message:
In model.response(mf, "numeric") :
using type="numeric" with a factor response will be ignored

Could you please help me get it right?

And finally, how can I present this analysis? With a boxplot?
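
The error message says that aov() needs a numeric response, while q7 here is a factor. With two categorical variables, one commonly used alternative is a test of association on the cross-tabulation. A sketch on simulated data (the values below are randomly generated, not the poster's, and the object names tab and res are illustrative):

```r
set.seed(123)
# Simulated stand-ins for the two columns described above.
q7  <- factor(sample(c("Y", "N", "U"), 1000, replace = TRUE),
              levels = c("Y", "N", "U"))
sex <- factor(sample(c("f", "m"), 1000, replace = TRUE),
              levels = c("f", "m"))

tab <- table(q7, sex)        # 3 x 2 contingency table of counts
res <- chisq.test(tab)       # chi-squared test of independence

# A grouped bar chart suits counts better than a boxplot:
# barplot(tab, beside = TRUE, legend.text = TRUE)
```

When expected counts are small, fisher.test(tab) can be used on the same table.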

Thanks a lot

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-qualitative-variables-in-anova-tp3483845p3483845.html
Sent from the R help mailing list archive at Nabble.com.



David Winsemius, MD
West Hartford, CT



Re: [R] Speed up code with for() loop

2011-04-29 Thread hck
Barth sent me some very good code and I modified it a bit. Have a look:

Error <- rnorm(1000, mean=0, sd=0.05)
estimate <- (log(1+0.10)+Error)

DCF_korrigiert <- (1/(exp(1/(exp(0.5*(-estimate)^2/(0.05^2))*sqrt(2*pi/(0.05^2
))*(1-pnorm(0,((-estimate)/(0.05^2)),sqrt(1/(0.05^2))-1))
DCF_verzerrt <- (1/(exp(estimate)-1))

S <- 1000   # total sample size
D <- 1  # number of subsamples
Subset <- 1  # number in each subsample
Select <- matrix(sample(S,D*Subset,replace=TRUE),nrow=Subset,ncol=D)

DCF_korrigiert_select <- matrix(DCF_korrigiert[Select],nrow=Subset,ncol=D)
Delta_ln <- (log(colMeans(DCF_korrigiert_select, na.rm=T)/(1/0.10)))



The only problem I discovered is that R cannot handle a vector with more than
2,147,483,647 elements, so the number of cells in the matrix is bounded by this
limit.  (R shows the maximum if you type .Machine$integer.max.)  And if you
want to save the workspace, the file for 10,000 times 10,000 comes to around
2 GB, compared to just 300 MB for the original.

So I cannot perform my previous bootstrap with 1,000,000 times 100,000.
Nevertheless, 10,000 times 10,000 seems to be sufficient; I have to say it's
amazing how fast the idea works.

Does anybody have a suggestion for how to make it work for the 1,000,000 times
100,000 bootstrap?
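
One way around the size limit, sketched here under the assumption that only a summary of each subsample (such as its mean) is needed, is to draw the subsamples in blocks so that the full index matrix never exists at once. The function name boot_means_chunked, its arguments and the toy data are hypothetical:

```r
# Bootstrap column means in blocks of 'chunk' subsamples at a time.
boot_means_chunked <- function(x, n_boot, n_draw, chunk = 100) {
  out <- numeric(n_boot)
  for (start in seq(1, n_boot, by = chunk)) {
    idx <- start:min(start + chunk - 1, n_boot)
    # Only n_draw * length(idx) values are held in memory per block.
    draws <- matrix(sample(x, n_draw * length(idx), replace = TRUE),
                    nrow = n_draw)
    out[idx] <- colMeans(draws)
  }
  out
}

set.seed(1)
x <- rnorm(1000)                       # toy data standing in for DCF_korrigiert
m <- boot_means_chunked(x, n_boot = 1000, n_draw = 500)
```

Memory use is then governed by n_draw * chunk rather than n_draw * n_boot, at the cost of a loop over blocks.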


--
View this message in context: 
http://r.789695.n4.nabble.com/Speed-up-code-with-for-loop-tp3481680p3484548.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] fisher exact for 2x2 table

2011-04-29 Thread Brian S Cade
Rob:   Fisher's exact test is conceptually possible for any r x c 
contingency table problem and uses the observed multinomial table 
probability as the test statistic.   Other tests for r x c contingency 
tables use a different test statistic (Chi-squared, likelihood ratio, 
Zelterman's).  It is possible that the probabilities for any of these 
procedures may differ slightly for the same table configuration even if 
the probabilities for each test are calculated by enumerating all possible 
permutations (hypergeometric) under the null hypothesis.   See Mielke and 
Berry 2007 (Permutation Methods:  A distance function approach) Chps 6 
and7.   Mielke has provided efficient Fortran algorithms for enumerating 
the exact probabilities for 2x2, 3x2, 4x2, 5x2, 6x2 ,3x3,and even 2x2x2 
tables for Fisher's exact and Chi-square statistics.   I don't remember 
whether Cyrus Meta's algorithms for Fisher's exact can do more.But the 
important point to keep in mind is that it is possible to use different 
statistics for evaluating the same null hypothesis for r x c tables 
(Fisher's exact uses one form, Chi-square uses another, etc.) and the 
probabilities can be computed by exact enumeration of all permutations 
(what people expect Fisher's exact to do but also possible for Chi-square 
statistic) or by some approximation (asymptotic distribution, Monte Carlo 
resampling).  The complete enumeration of test statistics under the null
becomes computationally intractable for large dimension r x c problems
whether using the observed table probability (like Fisher's exact) as a
test statistic or another statistic such as Chi-square.

So in short, yes you can use Fisher's exact on your 4 x 2 problem, and the 
result might differ from using a Chi-square statistic even if you compute 
the P-value for the Chi-square test by complete enumeration.   Note that 
the minimum expected cell size for the Chi-square test is related to 
whether the Chi-square distributional approximation (an asymptotic 
argument) for evaluating the Chi-square statistic will be reasonable and 
is irrelevant if  you calculate your probabilities by exact enumeration of 
all permutations.
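
A small sketch of both routes in R on a 4 x 2 table with sparse cells. The counts are invented for illustration; fisher.test() accepts r x c tables directly, and chisq.test() with simulate.p.value = TRUE evaluates the Chi-square statistic by Monte Carlo resampling rather than the asymptotic approximation:

```r
# Invented 4 x 2 table with small counts (not the poster's data).
tab <- matrix(c(3, 1,
                2, 5,
                1, 4,
                0, 6),
              nrow = 4, byrow = TRUE,
              dimnames = list(group = paste0("g", 1:4),
                              outcome = c("yes", "no")))

ft <- fisher.test(tab)                 # Fisher's exact test on the r x c table

set.seed(1)
ct <- chisq.test(tab, simulate.p.value = TRUE, B = 2000)  # Monte Carlo p-value

p_values <- c(fisher = ft$p.value, chisq_mc = ct$p.value)
```

The two p-values need not agree, since the tests order the possible tables by different statistics.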

Brian
 

Brian S. Cade, PhD

U. S. Geological Survey
Fort Collins Science Center
2150 Centre Ave., Bldg. C
Fort Collins, CO  80526-8818

email:  brian_c...@usgs.gov
tel:  970 226-9326



From: viostorm rob.sch...@gmail.com
To: r-help@r-project.org
Date: 04/29/2011 01:23 PM
Subject: Re: [R] fisher exact for 2x2 table
Sent by: r-help-boun...@r-project.org




After I shared comments from the forum yesterday with the biostatistician, he
indicated this:

Fisher's exact test is the non-parametric analog for the Chi-square 
test for 2x2 comparisons. A version (or extension) of the Fisher's Exact 
test, known as the Freeman-Halton test applies to comparisons for tables 
greater than 2x2. SAS can calculate both statistics using the following 
instructions.

  proc freq; tables a * b / fisher;

Do people here still stand by the position that Fisher's exact test can be used
for RxC contingency tables?  Sorry to bother you all so much; it is just
important for a paper I am writing and planning to submit soon.  (I have a 4x2
table, but it does not meet the expected-frequency requirements for chi-squared.)

I gather people here have suggested that R implements the following, which
unfortunately are not easily available at my library, but at least by
the titles they appear to extend the test to RxC:

Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test
in r x c contingency tables. Journal of the American Statistical Association
1983;78:427-34.
 
Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for Fisher's
exact test on unordered r x c contingency tables. ACM Transactions on
Mathematical Software 1986;12:154-61.

The only reason I ask again is he is exceptionally clear on this point.

Thanks again, 

-Rob



viostorm wrote:
 
 Thank you all very kindly for your help.
 
 -Rob
 
 
 Robert Schutt III, MD, MCS 
 Resident - Department of Internal Medicine
 University of Virginia, Charlottesville, Virginia
 



--
View this message in context: 
http://r.789695.n4.nabble.com/fisher-exact-for-2x2-table-tp3481979p3484009.html

Sent from the R help mailing list archive at Nabble.com.


[R] For loop and sqldf

2011-04-29 Thread mathijsdevaan
Hi list,

Can anyone tell me why the following does not work? Thanks a lot! Your help
is very much appreciated.

DF = data.frame(read.table(textConnection("B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))
list <- sort(unique(DF$C))
for (t in 1:length(list))
{
year = as.character(list[t])
data[year] <- sqldf('select * from DF where C = [year]')
}

I am trying to split up the data.frame into 5 new ones, one for every year. 
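
For the stated goal, a base R sketch that avoids both the loop and sqldf: split() returns a named list with one data frame per year. The data are the rows posted above; the name by_year is illustrative.

```r
DF <- read.table(text = "B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1", header = TRUE)

# One data frame per distinct value of C (the year), named by that value.
by_year <- split(DF, DF$C)

rows_1999 <- by_year[["1999"]]    # the subset for a single year
```

Individual years are then available as by_year[["1995"]], by_year[["1996"]], and so on.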


--
View this message in context: 
http://r.789695.n4.nabble.com/For-loop-and-sqldf-tp3484559p3484559.html
Sent from the R help mailing list archive at Nabble.com.



[R] --mem-vsize in R

2011-04-29 Thread kparamas
Hi,

I am calculating pairwise correlation coefficients for a matrix of 234 X 3.
I am getting the following error:
Error in cbind(as.vector(row(cl)), as.vector(col(cl)), as.vector(cl)) : 
  allocMatrix: too many elements specified
In addition: There were 50 or more warnings (use warnings() to see the first
50)

The function used is,
corGraphPearson = function(cData, COR) #COR is threshold 0.5,0.7, etc
{

cl = unname(cor(cData, use="pairwise.complete.obs", method="pearson"))

result = cbind(as.vector(row(cl)),as.vector(col(cl)),as.vector(cl))
result = result[result[,1] != result[,2],]

corm = result

# remove low cor pairs
corm =corm[abs(corm[,3]) >= COR, ]
# the network
net <- network(corm, directed = F)
}


I am running this in a cluster with 4 machines with 24 GB memory each.

How should I start R so that I make maximum use of the available memory?
Or how to overcome this issue?
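
Independently of how R is started, the allocation that fails is the three-column copy of every cell of the correlation matrix. A sketch of one way to avoid it, keeping only the pairs above the threshold and only one of (i, j) and (j, i), which is enough for an undirected network. The function name cor_edges and the toy data are made up for illustration:

```r
# Return a matrix with columns row, col, r for pairs with |r| >= COR.
cor_edges <- function(cData, COR) {
  cl <- unname(cor(cData, use = "pairwise.complete.obs", method = "pearson"))
  # upper.tri() drops the diagonal and the mirrored lower half.
  keep <- which(abs(cl) >= COR & upper.tri(cl), arr.ind = TRUE)
  cbind(keep, r = cl[keep])
}

set.seed(1)
x <- rnorm(50)
toy <- cbind(a = x, b = x + rnorm(50, sd = 0.1), c = rnorm(50))
edges <- cor_edges(toy, COR = 0.9)
```

The full correlation matrix is still computed once, so this helps only when that matrix itself fits in memory.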

--
View this message in context: 
http://r.789695.n4.nabble.com/mem-vsize-in-R-tp3484541p3484541.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Problem with qualitative variables in anova

2011-04-29 Thread katerinaaa
Yes, I also posted in the other forum because I didn't get an answer here.

Thanks for your reply

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-with-qualitative-variables-in-anova-tp3483845p3484599.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Conditonal Rank

2011-04-29 Thread Dennis Murphy
Hi:

Does this work?

library(plyr)
ddply(tmp, .(trial, Gender), transform, rankscore = rank(score))
  score trial Gender rankscore
1     1     1      M         1
2     2     1      M         2
3     3     1      F         1
4     4     1      F         2
5     4     2      M         2
6     3     2      M         1
7     2     2      F         2
8     1     2      F         1

Alternatively, you could get the 'wide form' with

aggregate(score ~ trial + Gender, data = tmp, FUN = rank)
  trial Gender score.1 score.2
1     1      M       1       2
2     2      M       2       1
3     1      F       1       2
4     2      F       2       1
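
For comparison, a base R sketch that adds the ranks directly as a new column with ave(), so that no extra package or reshaping is needed. The column name rankscore simply mirrors the one used above, and this variant is an illustration rather than part of the suggestions in this reply:

```r
tmp <- data.frame(score = c(1, 2, 3, 4, 4, 3, 2, 1),
                  trial = gl(2, 4),
                  Gender = gl(2, 2, 8, labels = c("M", "F")))

# ave() applies FUN within each trial/Gender cell and returns the results
# in the original row order, so they can be assigned straight back.
tmp$rankscore <- ave(tmp$score, tmp$trial, tmp$Gender, FUN = rank)
```

Note that FUN must be passed by name, since unnamed arguments after the first are treated as grouping variables.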

HTH,
Dennis


On Fri, Apr 29, 2011 at 12:26 PM, Doran, Harold hdo...@air.org wrote:
 Suppose I have data such as

 tmp <- data.frame(score = c(1,2,3,4, 4,3,2,1), trial = gl(2,4), Gender =
 gl(2,2,8, labels=c('M', 'F')))

 Now I would like to compute a rank on the variable score conditional on trial 
 and gender. I could do

 res <- with(tmp, tapply(score, list(Gender, trial), rank))
 res[,1]
 res[,2]

 and then finagle a way to create a new variable in the dataframe tmp that has 
 these ranks associated with the correct rows. But, perhaps there is a better 
 way. Any suggestions?





Re: [R] fisher exact for 2x2 table

2011-04-29 Thread Mike Miller

Rob--

Your biostatistician has not disagreed with the rest of us about anything 
except for his preferred name for the test.  He wants to call it the 
Freeman-Halton test, some people call it the Fisher-Freeman-Halton test, 
but most people call it the Fisher Exact test -- all are the same test. 
When he was adamant you could not do > 2x2, what he was being adamant
about was the name you should use when referring to the test for tables 
larger than 2x2.  Why he was doing that, I don't know, but I think it is 
silly -- he confused you and the rest of us.


He goes on to tell you that to get the Freeman-Halton test in SAS, you use
"tables a * b / fisher".  In other words, SAS calls the test "Fisher"
instead of calling it "Freeman-Halton".  R also calls it "Fisher" and not
"Freeman-Halton".  I'm like R and SAS and unlike your biostatistician, but
to each his own.


You say that he is exceptionally clear on this point, which may be true, 
but what is the point?  The point is that he prefers a different *name* 
for the test than the rest of us.  Everyone agrees on the math/stat.


Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota


On Fri, 29 Apr 2011, viostorm wrote:



After I shared comments from the forum yesterday with the biostatistician, he
indicated this:

Fisher's exact test is the non-parametric analog for the Chi-square
test for 2x2 comparisons. A version (or extension) of the Fisher's Exact
test, known as the Freeman-Halton test applies to comparisons for tables
greater than 2x2. SAS can calculate both statistics using the following
instructions.

 proc freq; tables a * b / fisher;

Do people here still stand by position fisher exact test can be used for RxC
contingency tables ?  Sorry to both you all so much it is just important for
a paper I am writing and planning to submit soon. ( I have a 4x2 table but
does not meet expected frequencies requirements for chi-squared.)

I guess people here have suggested R implements, the following, which
unfortunately are unavailable at least easily at my library but  at least by
the titles indicates it is extending it to RxC

Mehta CR, Patel NR. A network algorithm for performing Fisher's exact test
in r x c contingency tables. Journal of the American Statistical Association
1983;78:427-34.

Mehta CR, Patel NR. Algorithm 643: FEXACT: A FORTRAN subroutine for Fisher's
exact test on unordered r x c contingency tables. ACM Transactions on
Mathematical Software 1986;12:154-61.

The only reason I ask again is he is exceptionally clear on this point.

Thanks again,

-Rob




Re: [R] fisher exact for 2x2 table

2011-04-29 Thread Jeremy Miles
On 29 April 2011 08:43, viostorm rob.sch...@gmail.com wrote:

 After I shared comments from the forum yesterday with the biostatistician he
 indicated this:

 Fisher's exact test is the non-parametric analog for the Chi-square
 test for 2x2 comparisons. A version (or extension) of the Fisher's Exact
 test, known as the Freeman-Halton test applies to comparisons for tables
 greater than 2x2. SAS can calculate both statistics using the following
 instructions.

  proc freq; tables a * b / fisher;



SAS documentation says:

Fisher's exact test was extended to general R×C tables by Freeman and
Halton (1951), and this test is *also* known as the Freeman-Halton
test.

Emphasis mine.

Jeremy



-- 
Jeremy Miles
Psychology Research Methods Wiki: www.researchmethodsinpsychology.com



Re: [R] Kolmogorov-Smirnov test

2011-04-29 Thread Greg Snow
The general idea of the KS test (and others) can be applied to discrete data, 
but the implementation in R assumes continuous data (it does not have the 
adjustments needed to deal with ties).  The chi-square and other tests suffer 
from the same problems in your case.  In all cases the null hypothesis is that 
the data come from the stated distribution (Poisson in your case).  Failing to 
reject the null hypothesis does not prove that the data come from that 
distribution; it only shows that we cannot disprove it.  With large sample 
sizes, your data could come from a true distribution that for all practical 
purposes is equivalent to the Poisson but, due to slight rounding or other 
errors, has slightly different probabilities for some values (a difference 
that no one would reasonably care about), and these tests can still show a 
significant difference.

Usually it is better to just show that your data and the theoretical 
distribution are close enough to each other rather than depending on a formal 
test.  The plots and diagnostics in the vcd package are a good choice here.  
You could also use the KS test statistic (ignoring the p-value and warnings) 
as another measure, but plot the empirical and theoretical distributions to 
see what the value means and how close they are.
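
A minimal sketch of that last suggestion, using base R only. The data here 
are simulated stand-ins (I do not have vectorSentence), and the variable 
names are my own; the KS-type distance is used purely as a descriptive 
number, not as a test.

```r
set.seed(1)
x <- rpois(500, lambda = 3)      # stand-in for the real count data

lambda_hat <- mean(x)            # ML estimate of the Poisson mean
k    <- 0:max(x)                 # support points to compare on
emp  <- ecdf(x)(k)               # empirical CDF at each count
theo <- ppois(k, lambda_hat)     # fitted Poisson CDF at each count

# Largest vertical gap between the two step functions
D <- max(abs(emp - theo))
D

# Overlay the two CDFs to see what that gap looks like
plot(k, emp, type = "s", xlab = "count", ylab = "cumulative probability")
lines(k, theo, type = "s", lty = 2)
legend("bottomright", c("empirical", "fitted Poisson"), lty = 1:2, bty = "n")
```

Because both curves only jump at integer counts, comparing them on the 
integers is enough to find the largest gap.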

Another option is the vis.test function in TeachingDemos, it lets you plot data 
from the theoretical distribution and the actual data, then see if you can 
visually tell the difference.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of m.marcinmichal
 Sent: Thursday, April 28, 2011 3:54 PM
 To: r-help@r-project.org
 Subject: Re: [R] Kolmogorov-Smirnov test
 
 Hi,
 thanks for the response.
 
  The Kolmogorov-Smirnov test is designed for distributions on continuous
  variables, not discrete ones like the Poisson.  That is why you are
  getting some of your warnings.
 
 I read in "Fitting distributions with R" by Vito Ricci, page 19, that
 "... Kolmogorov-Smirnov test is used to decide if a sample comes from a
 population with a specific distribution. It can be applied both for
 discrete (count) data and continuous binned (even if some Authors do not
 agree on this point) and both for continuous variables", but on page 16 I
 read that "... while the Kolmogorov-Smirnov and Anderson-Darling tests are
 restricted to continuous distribution".  I was a little confused, but I
 tried this test on my discrete data.
 
 Generally, as a first step, I try to fit my data to a discrete or continuous
 distribution (task: find a distribution for empirical data). Question: can
 I approximate my discrete data by a continuous distribution? I know that
 sometimes the Poisson distribution can be approximated by the normal
 distribution. But what happens if I use another distribution, like the
 log-normal or gamma?
 
 I ran three more tests, all chi-square tests, but they return three
 different results. Suppose that we have the same data, i.e. vectorSentence.
 Test:
 1. One
 library(MASS)          # for fitdistr
 param <- fitdistr(vectorSentence, "poisson")
 chisq.test(table(vectorSentence), p = dpois(1:9, lambda=param[[1]][1]),
 rescale.p = TRUE)
 
 X-squared = 272.8958, df = 8, p-value < 2.2e-16
 
 2. Two
 library(vcd)
 gf <- goodfit(vectorSentence, type="poisson", method="MinChisq")
 summary(gf)
 
              X^2 df     P(> X^2)
 Pearson 404.3607  8 2.186332e-82
 
 3. Three
 library(fitdistrplus)  # for fitdist and gofstat
 fdistc <- fitdist(vectorSentence, "pois")
 g <- gofstat(fdistc, print.test = TRUE)
 
 Chi-squared statistic:  535.344
 Degree of freedom of the Chi-squared distribution:  8
 Chi-squared p-value:  1.824112e-110
 
 Question: which result is correct?
 
 I know that I can reject the null hypothesis: the data don't come from a
 Poisson distribution. But which result is correct?
 
 On the other hand, I am trying to solve another problem:
 1. Suppose that we have reference data (dr) from some process (pr), which
 is saved in vectorSentence.
 2. Suppose that we have two other data samples d1, d2 from two other
 processes p1, p2.
 3. We know that all data are discrete.
 
 Task:
 One: check whether data d1, d2 are equal to the reference data (dr). This is
 not a problem: I use a cdf, histogram, other measures etc., and the
 chi-square test. But can I use Kolmogorov-Smirnov to test the cumulative
 distribution function hypothesis, i.e. F(d1) = F(d), for my data?
 Two: find the distribution of dr, discrete or, if possible, continuous.
 
 Best
 
 Marcin M.
 
 
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Kolmogorov-Smirnov-test-tp3479506p3482349.html
 Sent from the R help mailing list archive at Nabble.com.
 


[R] strange fluctuations in system.time with kernapply

2011-04-29 Thread Alexander Senger

Hello expeRts,


here is something which strikes me as kind of odd and I would like to ask for 
some enlightenment:

First let's do this:

tkern <- kernel("modified.daniell", c(5,5))
test <- rep(1,1000000)
system.time(kernapply(test,tkern))
   user  system elapsed
  1.100   0.040   1.136

That was easy. Now this:

test <- rep(1,1100000)
system.time(kernapply(test,tkern))
   user  system elapsed
   1.40    0.02    1.43

Still fine. Now this:

test <- rep(1,1110000)
system.time(kernapply(test,tkern))
   user  system elapsed
  1.390   0.020   1.409

Ok, by now it seems boring. But wait:

test <- rep(1,1110300)
system.time(kernapply(test,tkern))
   user  system elapsed
 12.270   0.030  12.319

There is a sudden - and repeatable! - jump in the time needed to execute kernapply. At least from a 
naive point of view there should not be much difference between applying a kernel to a vector 
1110000 or 1110300 entries long. But maybe there is some limit here?


So I tried this:

test <- rep(1,1110400)
system.time(kernapply(test,tkern))
   user  system elapsed
   1.96    0.01    1.97

which doesn't fit into the pattern. But the best thing is still to come. When I 
try this

test <- rep(1,1110308)
system.time(kernapply(test,tkern))

then the computer starts to run and keeps running for more than 15 minutes, at which point I normally 
kill the process. As noted above, this behaviour is repeatable and occurs every time I issue these commands.


I really would like to know if there is some magic to the number 1110308 I'm 
not aware of.
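
One possible line of investigation, offered as an assumption and not as 
something established in this message: if kernapply() does its smoothing 
of a plain vector through fft(), then the run time would depend on how the 
vector length factorises, since fft() is only fast for highly composite 
lengths. The sketch below factorises the three lengths quoted above; 
prime_factors() is a hypothetical helper written for this illustration.

```r
# Prime factorisation by trial division (fine for numbers of this size)
prime_factors <- function(n) {
  f <- numeric(0)
  d <- 2
  while (d * d <= n) {
    while (n %% d == 0) {   # divide out each factor as often as it occurs
      f <- c(f, d)
      n <- n / d
    }
    d <- d + 1
  }
  if (n > 1) f <- c(f, n)   # whatever is left over is itself prime
  f
}

ns <- c(1110300, 1110400, 1110308)
largest <- sapply(ns, function(n) max(prime_factors(n)))
data.frame(n = ns, largest_prime_factor = largest)

# Smallest length >= n whose only prime factors are 2, 3 and 5
stats::nextn(1110308)
```

If the largest prime factor tracks the timings, then choosing a vector 
length with only small factors (see ?nextn) would be one thing to try.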


Last but not least, here is my

sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=de_DE.utf8       LC_NUMERIC=C
 [3] LC_TIME=de_DE.utf8        LC_COLLATE=de_DE.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=de_DE.utf8
 [7] LC_PAPER=de_DE.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=de_DE.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.10.1


Thank you,

Alex

--
Dipl.-Phys. Alexander Senger        Tel   : +49 30 2093 4941
Humboldt-Universitaet zu Berlin     Fax   : +49 30 2093 4718
AG Quantenoptik und Metrologie
Hausvogteiplatz 5-7                 Email : sen...@physik.hu-berlin.de
10117 Berlin, Germany



Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Tal Galili
Hello Duncan,
Thank you for having a look at this.

I tried the code you provided but it failed in the getForm stage.  running
this:

tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
             hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
             single = "true", gid = "0",
             output = "csv",
             .opts = list(followlocation = TRUE, verbose = TRUE))

Resulted in the following error:

Error in curlPerform(url = url, headerfunction = header$update, curl = curl,  :
  SSL certificate problem, verify that the CA cert is OK. Details:
error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed


Did I miss some step?





Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Fri, Apr 29, 2011 at 9:18 PM, Duncan Temple Lang dun...@wald.ucdavis.edu
 wrote:


 Thanks David for fixing the early issues.

 The reason for the failure is that the response
 from the Web server is to redirect the requester
 to another page, specifically

 https://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv

 Note that this is https, not http, and the built-in URL reading facilities
 in R don't support https.

 One way to see this is to look at the headers in your browser (e.g.
 Live HTTP Headers), or to use curl, or the RCurl package:

 tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
              hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
              single = "true", gid = "0",
              output = "csv",
              .opts = list(followlocation = TRUE, verbose = TRUE))

 The verbose option shows the entire dialog, and tt contains the
 text of the CSV document.

  read.csv(textConnection(tt))

 then yields the data frame.

  D.


 On 4/29/11 10:36 AM, David Winsemius wrote:
 
  On Apr 29, 2011, at 11:19 AM, Tal Galili wrote:
 
  Hello all,
  I wish to use read.csv to read a google doc spreadsheet.
 
  I try using the following code:
 
  data_url <- "
  http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv
  "
  read.csv(data_url)
 
  Which results in the following error:
 
  Error in file(file, "rt") : cannot open the connection
 
  I'm on windows 7.  And the code was tried on R 2.12 and 2.13
 
  I remember trying this a few months ago and it worked fine.
 
  I am always amused at such claims. Occasionally they are correct, but
  more often a crucial step has been omitted. In this case you have at a
  minimum embedded line-feeds in your URL string and have not established
  a connection, so it could not possibly have succeeded as presented.
 
  But now it's time to admit I do not know why it is not succeeding when I
  correct those flaws.
 
  closeAllConnections()
  data_url <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv")
  read.csv(data_url)
  Error in open.connection(file, "rt") : cannot open the connection
 
  closeAllConnections()
  dd <- read.csv(con <- url("http://spreadsheets0.google.com/spreadsheet/pub?hl=en&hl=en&key=0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE&single=true&gid=0&output=csv"))
  Error in open.connection(file, "rt") : cannot open the connection
 
  So, I guess I'm not reading the help pages for `url` and `read.csv` as
  well as I thought I was.
 
  Any suggestion what might be causing this or how to solve it?
 
 



Re: [R] setting options only inside functions

2011-04-29 Thread William Dunlap
 -Original Message-
 From: r-help-boun...@r-project.org 
 [mailto:r-help-boun...@r-project.org] On Behalf Of 
 luke-tier...@uiowa.edu
 Sent: Friday, April 29, 2011 9:35 AM
 To: Jonathan Daily
 Cc: r-help@r-project.org; Hadley Wickham; Barry Rowlingson
 Subject: Re: [R] setting options only inside functions
 
 The Python solution does not extend, at least not cleanly, to things
 like dev on / dev off or to Hadley's locale example.  In any case, if I
 am reading the Python source correctly on how they handle user
 interrupts, this solution has the same non-robustness to user interrupts
 that Bill's initial solution had.
 
 As a basis I believe what we need is a mechanism that handles a
 setup, an action, and a cleanup, with setup and cleanup occurring with
 interrupts disabled and the action with interrupts enabled. Scheme's
 dynamic wind is similar, though I don't believe the Scheme standard
 addresses interrupts and we don't need to worry about continuations,
 but some of the issues are similar.  Probably we would want two
 flavors, one in which the action has to be a function that takes as a
 single argument the result produced by the setup code, and one in
 which the action can be an argument expression that is then evaluated
 at the appropriate place by lazy evaluation.
 
 This can be done at the R level except for the controlling of
 interrupts (and possibly other asynchronous stuff) -- that would need a
 new pair of primitives (suspendInterrupts/enableInterrupts or something
 like that).  There is something in the Haskell literature on this that
 I looked at a while back -- probably time to have another look.

Luke,

  A similar problem is that if optionList contains an illegal
option then setting options(optionList) will commit changes
to .Options as it works its way down the optionList until it
hits the illegal option, when it throws an error.  Then the
following on.exit is never called (it wouldn't have the output
of options(optionList) to work on if it were called) and the
initial settings in optionList stick around forever.  E.g.,

  > withOptions <- function(optionList, expr) {
  +     oldOpt <- options(optionList)
  +     on.exit(options(oldOpt))
  +     expr
  + }
  > getOption("height")
  NULL
  > getOption("width")
  [1] 80
  > withOptions(list(height=10, width=-2), 666)
  Error in options(optionList) :
    invalid 'width' parameter, allowed 10...10000
  > getOption("height")
  [1] 10
  > getOption("width")
  [1] 80

I haven't checked to see if par() works in the same way - it
does in S+.

An ignoreInterrupts(expr) function would not help in that case.
Making options() (and par()) atomic operations would help, but that
may be a lot of work.  options() might also warn but not change
.Options if there were an attempt to set an illegal option.
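
In the meantime, one user-level workaround for the partial-commit problem 
(a sketch only; it does nothing about the interrupt window Luke describes) 
is to record the old values with getOption() and register the on.exit 
handler before calling options() at all:

```r
withOptions <- function(optionList, expr) {
  nms <- names(optionList)
  # Capture current values first; options that are not set come back as
  # NULL, and options(name = NULL) removes an option again on restore.
  oldOpt <- setNames(lapply(nms, getOption), nms)
  on.exit(options(oldOpt))   # registered before anything is changed
  options(optionList)        # may fail part-way; on.exit still runs
  expr                       # evaluated lazily, under the new options
}

w0 <- getOption("width")
res <- try(withOptions(list(height = 10, width = -2), 666), silent = TRUE)
getOption("height")   # no longer left behind by the failed call
getOption("width")    # unchanged
```

The design choice is simply that the restore list no longer depends on the 
return value of a call that may fail half-way through.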

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com 
 
 
 
 On Thu, 28 Apr 2011, Jonathan Daily wrote:
 
  I would also love to see this implemented in R, as my current solution
  to the issue of doing tons of open/close, dev/dev.off, etc. is to use
  snippets in my IDE, and in the end I feel like it is a hack job. A
  pythonic "with" function would also solve most of the situations where
  I have had to use awkward try or tryCatch calls. I would be willing to
  help with this project, even if it is just testing.
 
  On Wed, Apr 27, 2011 at 5:43 PM, Barry Rowlingson
  b.rowling...@lancaster.ac.uk wrote:
  but it's a little clumsy, because
 
  with_connection(file("myfile.txt"), {do stuff...})
 
  isn't very useful because you have no way to reference the connection
  that you're using. Ruby's blocks have arguments which would require
  big changes to R's syntax.  One option would be to use pronouns:
 
   Looking very much like python 'with' statements:
 
  http://effbot.org/zone/python-with-statement.htm
 
   Implemented via the 'with' statement which can operate on anything
  that has an __enter__ and an __exit__ method. Very neat.
 
  Barry
 
 
 
 
 
 
 
 -- 
 Luke Tierney
 Statistics and Actuarial Science
 Ralph E. Wareham Professor of Mathematical Sciences
 University of Iowa                  Phone:  319-335-3386
 Department of Statistics and        Fax:    319-335-3017
    Actuarial Science
 241 Schaeffer Hall                  email:  l...@stat.uiowa.edu
 Iowa City, IA 52242                 WWW:    http://www.stat.uiowa.edu
 



[R] Use nparcomp function from nparcomp library to run post hoc

2011-04-29 Thread Jun Shen
Dear list,

I tried to use nparcomp to run some post hoc non-parametric comparisons
and got an error.

Error in uniroot(pfct, interval = interval) :
  f() values at end points not of opposite sign

 Appreciate any comments.

the command line:

nparcomp(Ulceration~Group,data=test,type='Dunnett',control='Non-treated')


Jun
===
data as follows

structure(list(Group = c("Duoderm", "Duoderm", "Duoderm", "Duoderm",
"Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm",
"Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm",
"Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm",
"Duoderm", "Duoderm", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
"Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
"Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
"Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
"Fibrase", "Fibrase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
"Kollagenase", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
"Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
"Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
"Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
"Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
"Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle",
"Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle",
"Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle",
"Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle", "Vehicle"
), Ulceration = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5,
5, 5, 5), Inflamation = c(3, 4, 3, 2, 3, 3, 4, 4, 2, 2, 3, 3,
3, 3, 3, 3, 3, 4, 3, 3, 3, 4, 3, 3, 2, 3, 3, 4, 3, 3, 2, 4, 4,
4, 4, 4, 4, 4, 3, 3, 4, 3, 5, 3, 3, 4, 4, 3, 3, 2, 4, 2, 3, 3,
4, 3, 4, 3, 3, 4, 3, 4, 2, 3, 3, 4, 2, 3, 4, 3, 2, 3, 3, 3, 2,
3, 2, 2, 2, 2, 4, 3, 2, 3, 3, 4, 3, 3, 4, 3, 4, 2, 4, 3, 4, 2,
4, 3, 4, 3, 2, 2, 2, 2, 3, 2, 3, 2, 4, 3, 2, 4, 4, 4, 2, 2, 3,
3, 2, 4, 3, 2, 3, 2, 2, 2, 4, 2, 3, 2, 3, 2, 3, 3, 3, 4, 3, 3,
4, 4, 2, 3, 2, 3), Fibroplasia = c(4, 4, 4, 4, 4, 3, 4, 4, 4,
3, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 2, 4, 4, 3,
2, 4, 4, 4, 4, 4, 4, 3, 3, 3, 4, 3, 3, 3, 4, 3, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 3, 3, 3, 4, 4, 3, 4, 4, 4, 3, 4,
4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4,
4, 4, 3, 4, 4, 3, 4, 3, 2, 3, 3, 4, 3, 3, 4, 4, 3, 3, 3, 4, 3,
3, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4, 3, 3, 4, 4, 4, 4, 4, 3, 4,
3, 4, 4, 4, 4, 4, 4, 4, 4), Fibrosis.and.Adexnal.Atrophy = c(4,
4, 4, 3, 4, 4, 4, 4, 4, 3, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
4, 4, 3, 4, 3, 4, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 3, 3, 4, 3, 4,
4, 4, 3, 4, 3, 3, 3, 3, 3, 4, 4, 4, 3, 4, 4, 4, 4, 4, 4, 3, 3,
4, 4, 3, 3, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4,
4, 4, 3, 4, 4, 4, 3, 4, 3, 4, 3, 4, 4, 3, 4, 3, 2, 3, 3, 4, 4,
3, 4, 4, 3, 3, 3, 4, 3, 3, 4, 4, 4, 4, 4, 4, 4, 3, 4, 4, 4, 4,
3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4), Inflammation = c(2,
2, 2, 1, 1, 1, 2, 3, 1, 2, 1, 1, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1,
1, 1, 1, 1, 1, 1, 1, 2, NA, 1, 1, 1, 2, 2, 2, 1, 2, 1, 1, 2,
1, 1, 2, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1,
1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, NA, 2, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 2, 1,
2, 1, 2, 2, 1, 1, 2, 1, 2, 2, 1, 1, 2, 1, 1, 1, 2, 1, 2, 1, 2,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 2, 2), Fibroplasia.1 =
c(4,
4, 4, 4, 4, 3, 4, 4, 4, 3, 4, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3,
4, 4, 4, 4, 3, 3, 4, 3, NA, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3,
3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 4, 4, 4, 3, 4, 3, 4, 4, 4, 4, 3,
3, 4, 4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4, 4, 4, 4, 4, 4, 4, 4,
4, 3, 3, 3, NA, 4, 4, 4, 4, 3, 4, 3, 3, 3, 3, 3, 4, 2, 4, 3,
4, 4, 3, 4, 4, 2, 3, 2, 3, 3, 3, 3, 4, 4, 4, 4, 3, 3, 3, 3, 3,
4, 3, 3, 4, 4, 4, 4, 3, 3, 4, 3, 3, 4, 4, 4, 4, 3, 4, 4), Fibrosis = c(3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, NA, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 3,
3, 3, 3, 3, NA, 3, 3, 3, 3, 3, 2, 2, 3, 3, 3, 3, 3, 2, 3, 3,
3, 

Re: [R] For loop and sqldf

2011-04-29 Thread Dennis Murphy
Hi:

Try

split(DF, DF$C)

Does that work?
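
As a sketch of what split() gives you here, using the data from your 
message (rebuilt with read.table(text = ...), which is my substitution for 
the textConnection call; by_year is just a name I picked):

```r
DF <- read.table(text = "B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1", header = TRUE)

# One data frame per distinct value of C, collected in a named list
by_year <- split(DF, DF$C)

names(by_year)       # the years, as character strings
by_year[["1999"]]    # the rows for a single year
```

Keeping the pieces in one list is usually easier to work with than five 
separately named data frames, e.g. lapply(by_year, nrow).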

Dennis

On Fri, Apr 29, 2011 at 1:27 PM, mathijsdevaan mathijsdev...@gmail.com wrote:
 Hi list,

 Can anyone tell my why the following does not work? Thanks a lot! Your help
 is very much appreciated.

 DF = data.frame(read.table(textConnection("    B  C  D  E  F  G
 8025  1995  0  4  1  2
 8025  1997  1  1  3  4
 8026  1995  0  7  0  0
 8026  1996  1  2  3  0
 8026  1997  1  2  3  1
 8026  1998  6  0  0  4
 8026  1999  3  7  0  3
 8027  1997  1  2  3  9
 8027  1998  1  2  3  1
 8027  1999  6  0  0  2
 8028  1999  3  7  0  0
 8029  1995  0  2  3  3
 8029  1998  1  2  3  2
 8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))
 list <- sort(unique(DF$C))
 for (t in 1:length(list))
        {
        year = as.character(list[t])
        data[year] <- sqldf('select * from DF where C = [year]')
        }

 I am trying to split up the data.frame into 5 new ones, one for every year.


 --
 View this message in context: 
 http://r.789695.n4.nabble.com/For-loop-and-sqldf-tp3484559p3484559.html
 Sent from the R help mailing list archive at Nabble.com.



Re: [R] read.csv fails to read a CSV file from google docs

2011-04-29 Thread Duncan Temple Lang
Hi Tal

You can add

  ssl.verifypeer = FALSE

in the .opts list so that the certificate is simply accepted.

Alternatively, you can tell libcurl where to find the certification
authority file containing signatures. This can be done via the cainfo
option, e.g.

   cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"),

Often such a collection of certificates is installed with the ssl library.
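
Putting the pieces together, a sketch of the whole call might look like 
the following. read_published_csv() is a hypothetical wrapper name used 
only for this illustration, the URL and form fields are the ones from your 
message, and the function is defined but not called here because it needs 
network access.

```r
# Options handed to libcurl: follow the http -> https redirect and verify
# the server certificate against the CA bundle shipped with RCurl.
curl_opts <- list(
  followlocation = TRUE,
  cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")
)

# Hypothetical helper: fetch a published spreadsheet as CSV text and
# parse it into a data frame.
read_published_csv <- function(key, gid = "0", opts = curl_opts) {
  txt <- RCurl::getForm("http://spreadsheets0.google.com/spreadsheet/pub",
                        hl = "en", key = key, single = "true", gid = gid,
                        output = "csv", .opts = opts)
  read.csv(textConnection(txt))
}
```

Usage would be read_published_csv("0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE"). 
Using cainfo keeps certificate checking on, whereas ssl.verifypeer = FALSE 
turns it off.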

  D.

On 4/29/11 2:42 PM, Tal Galili wrote:
 Hello Duncan,
 Thank you for having a look at this.
 
 I tried the code you provided but it failed in the getForm stage.  Running 
 this:
 
 tt = getForm("http://spreadsheets0.google.com/spreadsheet/pub",
              hl = "en", key = "0AgMhDTVek_sDdGI2YzY2R1ZESDlmZS1VYUxvblQ0REE",
              single = "true", gid = "0",
              output = "csv",
              .opts = list(followlocation = TRUE, verbose = TRUE))
 
 Resulted in the following error:
 
 Error in curlPerform(url = url, headerfunction = header$update, curl = curl,  : 
   SSL certificate problem, verify that the CA cert is OK. Details:
 error:14090086:SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed
 
 
 Did I miss some step?
 
 
 
 
 

Re: [R] RCurl and postForm()

2011-04-29 Thread Duncan Temple Lang

Hi Ryan

 postForm() is using a different style (or specifically Content-Type) of 
submitting the form than the curl -d command.
Switching to style = 'POST' uses the same type, but at a quick guess, the 
parameter name 'a' is causing confusion
and the result is the empty JSON array - [].

A quick workaround is to use curlPerform() directly rather than postForm()

 r = dynCurlReader()
 curlPerform(postfields = 'Archbishop Huxley',
             url = 'http://www.datasciencetoolkit.org/text2people',
             verbose = TRUE, post = 1L, writefunction = r$update)
 r$value()

This yields

[1] "[{\"gender\":\"u\",\"first_name\":\"\",\"title\":\"archbishop\",\"surnames\":\"Huxley\",\"start_index\":0,\"end_index\":17,\"matched_string\":\"Archbishop Huxley\"}]"

and you can use fromJSON() to transform it into data in R.

  D.

On 4/29/11 12:14 PM, Elmore, Ryan wrote:
 Hi everybody,
 
 I think that I am missing something fundamental in how strings are passed 
 from a postForm() call in R to the curl or libcurl functions underneath.  For 
 example, I can do the following using curl from the command line:
 
 $ curl -d Archbishop Huxley http://www.datasciencetoolkit.org/text2people;
 [{gender:u,first_name:,title:archbishop,surnames:Huxley,start_index:0,end_index:17,matched_string:Archbishop
  Huxley}]
 
 Trying the same thing, or what I *think* is the same thing (obvious not) in R 
 (Mac OS 10.6.7, R 2.13.0) produces:
 
 library(RCurl)
 Loading required package: bitops
 api - http://www.datasciencetoolkit.org/text2people;
 postForm(api, a=Archbishop Huxley)
 [1] 
 [{\gender\:\u\,\first_name\:\\,\title\:\archbishop\,\surnames\:\Huxley\,\start_index\:44,\end_index\:61,\matched_string\:\Archbishop
  
 Huxley\},{\gender\:\u\,\first_name\:\\,\title\:\archbishop\,\surnames\:\Huxley\,\start_index\:88,\end_index\:105,\matched_string\:\Archbishop
  Huxley\}]
 attr(,Content-Type)
 charset
 text/html utf-8
 
 I can match the result given on the DSTK API's website by using system(), but 
 that doesn't seem like the R-like way of doing it.
 
 system("curl -d 'Archbishop Huxley' 'http://www.datasciencetoolkit.org/text2people'")
 158   141  141   141
 0[{"gender":"u","first_name":"","title":"archbishop","surnames":"Huxley","start_index":0,"end_index":17,"matched_string":"Archbishop Huxley"}]17599 72 --:--:-- --:--:-- --:--:--   670
 
 If you want to see some additional information related to this question, I 
 posted on StackOverflow a few days ago:
 http://stackoverflow.com/questions/5797688/post-request-using-rcurl
 
 I am working on this R wrapper for the Data Science Toolkit as a way of 
 illustrating how to make an R package for the Denver RUG, and ran into this 
 problem.  Any help with this problem will be greatly appreciated by the Denver 
 RUG!
 
 Cheers,
 Ryan
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



Re: [R] Use nparcomp function from nparcomp library to run post hoc

2011-04-29 Thread Dennis Murphy
Hi:

Is this the function nparcomp() in the nparcomp package or the one
from the mutoss package? When using functions from packages, it is
useful to indicate the package name. I'm assuming you're using the
nparcomp package, because your code worked for me when that package
was loaded:

 library(nparcomp)
Loading required package: multcomp
Loading required package: mvtnorm
Loading required package: survival
Loading required package: splines
 nparcomp(Ulceration~Group,data=test,type='Dunnett',control='Non-treated')

  Nonparametric Multiple Comparison Procedure based on relative
contrast effects , Type of Contrast : Dunnett
 NOTE:
 *---Weight Matrix--*
 - Weight matrix for choosen contrast based on all-pairs comparisons

 *---Analysis of relative effects---*
 - Simultaneous Confidence Intervals for relative effects p(i,j)
  with confidence level 0.95
 - Method = Multivariate Delta-Method (Logit)
 - p-Values for  H_0: p(i,j)=1/2

 *Interpretation*
 p(a,b) > 1/2 : b tends to be larger than a
 *--Mult.Distribution---*
 - Equicoordinate Quantile
 - Global p-Value
 *--*
$weight.matrix

<snipped for brevity - all zeros>

$Data.Info
       Sample Size
1     Duoderm   24
2     Fibrase   24
3 Kollagenase   24
4 Non-treated   24
5    Stimulen   24
6     Vehicle   24

$Analysis.of.relative.effects
                  Comparison rel.effect confidence.interval t.value
1     p(Non-treated,Duoderm)        0.5   [ 0.499 ; 0.501 ]       0
2     p(Non-treated,Fibrase)        0.5   [ 0.499 ; 0.501 ]       0
3 p(Non-treated,Kollagenase)        0.5   [ 0.499 ; 0.501 ]       0
4    p(Non-treated,Stimulen)        0.5   [ 0.499 ; 0.501 ]       0
5     p(Non-treated,Vehicle)        0.5   [ 0.499 ; 0.501 ]       0
  p.value.adjusted p.value.unadjusted
1                1                  1
2                1                  1
3                1                  1
4                1                  1
5                1                  1

$Mult.Distribution
  Quantile p.Value.global
1 2.568766              1

$Correlation
[1] NA

A graphic also appears indicating zero effect, which is what one would
expect since Ulceration = 5 for every observation in the data frame.
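
A quick way to spot this kind of degenerate response before running the comparison is sketched below. The data frame here is a hypothetical stand-in built from the group names and sizes in $Data.Info with Ulceration fixed at 5; it is not the poster's actual data, and the names n_distinct and all_constant are illustrative.

```r
# Hypothetical stand-in: six groups of 24 observations, response always 5
test <- data.frame(
  Group = rep(c("Duoderm", "Fibrase", "Kollagenase",
                "Non-treated", "Stimulen", "Vehicle"), each = 24),
  Ulceration = 5
)

# Number of distinct response values within each group
n_distinct <- tapply(test$Ulceration, test$Group,
                     function(x) length(unique(x)))
n_distinct

# TRUE when the response takes a single value across the whole data set
all_constant <- length(unique(test$Ulceration)) == 1
all_constant
```

When every group shows a single distinct value, any rank-based comparison can only report relative effects of 0.5.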

 sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] nparcomp_1.0-1  multcomp_1.2-5  survival_2.36-9 mvtnorm_0.9-999
[5] sos_1.3-0       brew_1.0-6      plyr_1.5.2

loaded via a namespace (and not attached):
[1] tcltk_2.13.0 tools_2.13.0

Check your version of R and the nparcomp package against this. If you
have an older version of R or nparcomp, perhaps an upgrade is
sufficient to fix the problem.
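
One way to make that comparison, sketched with base R only. The package list is taken from the sessionInfo() output above; the names pkgs, r_ok and versions are illustrative, and packages that are not installed are reported as NA rather than raising an error.

```r
# Is the running R at least the version used above?
r_ok <- getRversion() >= "2.13.0"
r_ok

# Installed versions of the packages involved (NA if a package is missing)
pkgs <- c("nparcomp", "multcomp", "mvtnorm")
versions <- sapply(pkgs, function(p) {
  if (requireNamespace(p, quietly = TRUE)) {
    as.character(packageVersion(p))
  } else {
    NA_character_
  }
})
versions
```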

HTH,
Dennis

On Fri, Apr 29, 2011 at 2:49 PM, Jun Shen jun.shen...@gmail.com wrote:
 Dear list,

 I tried to use nparcomp to run some post hoc non-parametric comparisons
 and got an error:

 Error in uniroot(pfct, interval = interval) :
  f() values at end points not of opposite sign

  Appreciate any comments.

 the command line:

nparcomp(Ulceration~Group,data=test,type='Dunnett',control='Non-treated')


 Jun
 ===
 data as follows

 structure(list(Group = c("Duoderm", "Duoderm", "Duoderm", "Duoderm",
 "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm",
 "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm",
 "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm", "Duoderm",
 "Duoderm", "Duoderm", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
 "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
 "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
 "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase", "Fibrase",
 "Fibrase", "Fibrase", "Kollagenase", "Kollagenase", "Kollagenase",
 "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
 "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
 "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
 "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase", "Kollagenase",
 "Kollagenase", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
 "Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
 "Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
 "Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
 "Non-treated", "Non-treated", "Non-treated", "Non-treated", "Non-treated",
 "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
 "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen",
 "Stimulen", "Stimulen", "Stimulen", "Stimulen", "Stimulen", 

Re: [R] Use nparcomp function from nparcomp library to run post hoc

2011-04-29 Thread Jun Shen
Hi, Dennis,

Thanks for the reply. I upgraded to R 2.13.0, but when I then tried to
load the package with library(nparcomp), I got an error:

Error: package 'mvtnorm' is not installed for 'arch=i386'

What does that mean? Thanks.

Jun

On Fri, Apr 29, 2011 at 5:49 PM, Dennis Murphy djmu...@gmail.com wrote:


Re: [R] For loop and sqldf

2011-04-29 Thread David Winsemius


On Apr 29, 2011, at 4:27 PM, mathijsdevaan wrote:


Hi list,

Can anyone tell me why the following does not work? Thanks a lot!
Your help is very much appreciated.

DF = data.frame(read.table(textConnection("B  C  D  E  F  G
8025  1995  0  4  1  2
8025  1997  1  1  3  4
8026  1995  0  7  0  0
8026  1996  1  2  3  0
8026  1997  1  2  3  1
8026  1998  6  0  0  4
8026  1999  3  7  0  3
8027  1997  1  2  3  9
8027  1998  1  2  3  1
8027  1999  6  0  0  2
8028  1999  3  7  0  0
8029  1995  0  2  3  3
8029  1998  1  2  3  2
8029  1999  6  0  0  1"),head=TRUE,stringsAsFactors=FALSE))


list <- sort(unique(DF$C)); require(sqldf); data <- list() # added inits


for (t in 1:length(list))
{
year = as.character(list[t])
data[year] <- sqldf('select * from DF where C = [year]')


# I see you have already gotten a workable answer, but thought you
# might want to see if this would work:


data[year] <- sqldf(paste('select * from DF where C = ', year, sep=""))

# Two changes ... let `year` get evaluated and don't put `year` in
# brackets.

}



 data
$`1995`
[1] 8025 8026 8029

$`1996`
[1] 8026

$`1997`
[1] 8025 8026 8027

$`1998`
[1] 8026 8027 8029

$`1999`
[1] 8026 8027 8028 8029
I am trying to split up the data.frame into 5 new ones, one for every year.
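
A minimal sketch of that goal, using only base R so that it runs without sqldf. It builds one query string per year with paste(), which is the step that lets `year` be evaluated, and then shows split() as an alternative way to get one data frame per year. The names years, queries and by_year are illustrative, and split() is a substitute technique, not what was posted in this thread.

```r
DF <- read.table(text = "B C D E F G
8025 1995 0 4 1 2
8025 1997 1 1 3 4
8026 1995 0 7 0 0
8026 1996 1 2 3 0
8026 1997 1 2 3 1
8026 1998 6 0 0 4
8026 1999 3 7 0 3
8027 1997 1 2 3 9
8027 1998 1 2 3 1
8027 1999 6 0 0 2
8028 1999 3 7 0 0
8029 1995 0 2 3 3
8029 1998 1 2 3 2
8029 1999 6 0 0 1", header = TRUE)

years <- sort(unique(DF$C))

# One SQL string per year; each could be handed to sqldf() in the loop
queries <- paste("select * from DF where C =", years)
names(queries) <- years
queries[["1995"]]

# Base R alternative: a named list with one data frame per year
by_year <- split(DF, DF$C)
names(by_year)
by_year[["1996"]]
```

In the loop version, storing each result with data[[year]] rather than data[year] keeps the whole data frame. Single-bracket assignment of a data frame into a list keeps only its first column, which matches the output above, where each element holds just the B values.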





--

David Winsemius, MD
West Hartford, CT
