[R] [R-pkgs] caret version 4.06 released

2009-01-26 Thread Max Kuhn
Version 4.06 of the caret package was sent to CRAN.

caret can be used to tune the parameters of predictive models using
resampling, estimate variable importance and visualize the results.
There are also various modeling and helper functions that can be
useful for training models. caret has wrappers to over 50 different
models for classification and regression. See the package vignettes or
the paper at

  http://www.jstatsoft.org/v28/i05

for more details.

Significant internal changes were made to how the models are fit in
train(). Now, the function used to compute the models is passed in as
a parameter (defaulting to lapply). In this way, users can use their
own parallel processing software without new versions of caret.
Examples using MPI and NWS are given in ?train.
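
The design described above can be illustrated with a small, self-contained sketch. This is not caret's internal code: `fit_over_grid`, its `compute` argument, `fit_one` and `my_apply` are hypothetical names chosen for illustration. The point is only that any function with the calling convention of lapply can be swapped in, so a parallel back end needs no change to the fitting code.

```r
# Hypothetical sketch (not caret's internals): the function that iterates
# over the tuning values is itself a parameter and defaults to lapply.
fit_over_grid <- function(grid, fit_one, compute = lapply, ...) {
  compute(grid, fit_one, ...)
}

# Toy "model": a polynomial fit on the built-in cars data,
# summarised by its residual standard error.
fit_one <- function(degree) {
  fit <- lm(dist ~ poly(speed, degree), data = cars)
  c(degree = degree, sigma = summary(fit)$sigma)
}

# Sequential run with the default compute function.
sequential <- fit_over_grid(1:3, fit_one)

# Any lapply-compatible function can be substituted. This stand-in simply
# delegates to lapply; a wrapper around a parallel back end would go here.
my_apply <- function(X, FUN, ...) lapply(X, FUN, ...)
custom <- fit_over_grid(1:3, fit_one, compute = my_apply)

identical(sequential, custom)
```

Functions such as parallel::mclapply take the same leading arguments (X, FUN, ...) and could be passed in the same way, assuming the platform supports them.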

The package now contains a function (splsda) that extends the spls
function to classification (in the same manner as caret's plsda
function extends plsr).

Also, a bug was fixed where the MSE (instead of the RMSE) was reported for
random forest OOB resampling.

There are more examples in ?train.

Changes to confusionMatrix, sensitivity, specificity and the
predictive value functions:

 - each was made more generic with default and table methods
 - confusionMatrix extractor functions for matrices and tables were added
 - the pos/neg predicted value computations were changed to
incorporate prevalence
 - prevalence was added as an option to several functions
 - detection rate and prevalence statistics were added to confusionMatrix
 - the examples were expanded in the help files
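
The prevalence-adjusted predictive values mentioned in the list above follow from Bayes' rule. A minimal sketch of that arithmetic in base R; the function names `ppv` and `npv` and the example numbers are illustrative only, and this is not a copy of caret's own code.

```r
# Positive predictive value from sensitivity, specificity and prevalence.
ppv <- function(sens, spec, prev) {
  (sens * prev) / (sens * prev + (1 - spec) * (1 - prev))
}

# Negative predictive value from the same three quantities.
npv <- function(sens, spec, prev) {
  (spec * (1 - prev)) / ((1 - sens) * prev + spec * (1 - prev))
}

# The same test looks very different at 50% and at 5% prevalence.
ppv(sens = 0.9, spec = 0.8, prev = 0.50)
ppv(sens = 0.9, spec = 0.8, prev = 0.05)
npv(sens = 0.9, spec = 0.8, prev = 0.05)
```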

This version of caret will break compatibility with caretLSF and
caretNWS. However, these packages are no longer needed (see above)
and will be deprecated. They will work with versions of caret <= 3.51
but will not be developed going forward. They can still be
found at

  https://r-forge.r-project.org/projects/caret/

Send questions, comments etc to max.k...@pfizer.com.

Max

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] can't load rJava in R 2.8.1 on Windows XP

2009-01-26 Thread Dieter Menne
Duncan Murdoch murdoch at stats.uwo.ca writes:

 I don't know what's going wrong on your system.  I added a browser() 
 call to the .onLoad function in R/windows/FirstLib.R on my system, and I 
 see it successfully gets JAVA_HOME from the registry.  It gets a number 
 of other files, then adds these paths to my PATH variable.  I've used 
 strsplit() to separate them for viewing.
 
 [14] C:\\Program Files\\Java\\jre1.6.0_07\\bin\\client
 [15] C:\\Program Files\\Java\\jre1.6.0_07/bin
 [16] C:\\Program Files\\Java\\jre1.6.0_07/bin/client
 [17] C:\\Program Files\\Java\\jre1.6.0_07/jre/bin/client
 
 I believe LoadLibrary needs paths to be specified with backslashes, so 
 you might be able to fix things on your system by changing the file.path 
 calls in that function to use fsep="\\" instead of the default "/".

Thanks for your help. 

I think I tracked it down. It has nothing to do with rJava, but
rather with Sys.getenv(). Looks like this function truncates around 1024
characters, and my path is very long due to Visual Studio + Delphi
+ SQL Server.

See the printout below. Note that the last entry should read \\Delphi,
and that more entries are coming in my system path.
This also explains why only some people have the problem.
No workaround found yet. I keep this message here for other people who 
have the problem, but possibly this is more for R-devel to be continued.

Dieter

> p = Sys.getenv("PATH")
> nchar(p)
PATH 
1019 
> strsplit(p, ";")$PATH[-(1:27)]
[1] "C:\\Program Files\\Microsoft SQL Server\\100\\Tools\\Binn\\VSShell\\Common7\\IDE\\"
[2] "C:\\Program Files\\MiKTeX 2.7\\miktex\\bin"
[3] "C:\\Users\\Dieter\\Documents\\Delp
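
The check above can be repeated on any system with a few lines of base R. This is only a diagnostic sketch: the variable names are made up here, and it does not work around the truncation described in this message.

```r
# Read PATH and report its total length in characters.
path_value <- Sys.getenv("PATH")
path_length <- nchar(path_value)

# .Platform$path.sep is ";" on Windows and ":" on Unix-alikes.
path_entries <- strsplit(path_value, .Platform$path.sep, fixed = TRUE)[[1]]

path_length
length(path_entries)
tail(path_entries, 3)  # a cut-off final entry hints at truncation
```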




[R] How to analyse and model 2 time series, when one series needs to be differenced?

2009-01-26 Thread Andreas Klein
Hello.

How can I analyse the cross-correlation between two time series with ccf if 
one of the series needs to be differenced to make it stationary?
The two series then differ in length, so ccf may not produce the 
correct cross-correlation.

Another problem:
How can I model the two time series as a VARI process with the dse package? 
That is, how do I handle the fact that one series has to be differenced and the 
other does not?

I hope you can give me some hints.


Regards,
Andreas.
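
One way to handle the length mismatch raised above is to keep both series as ts objects and align them on their common time window before calling ccf. A minimal sketch with simulated data; the series `x` and `y`, their start date and their frequency are assumptions made for illustration, and the dse part of the question is not covered.

```r
set.seed(42)

# x is nonstationary (a random walk), y is already stationary.
x <- ts(cumsum(rnorm(120)), start = c(2000, 1), frequency = 12)
y <- ts(rnorm(120), start = c(2000, 1), frequency = 12)

# Differencing drops the first observation, so dx starts one month later.
dx <- diff(x)

# Keep only the time points that both series share.
aligned <- ts.intersect(dx, y)
nrow(aligned)

# Cross-correlation of the aligned series.
cc <- ccf(aligned[, "dx"], aligned[, "y"], plot = FALSE)
cc
```

When both inputs carry correct time attributes, ccf() appears to perform this intersection itself, so the explicit ts.intersect() step mainly makes the alignment visible.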






[R] HMISC package: wtd.table()

2009-01-26 Thread Norbert NEUWIRTH
Hi useRs & developeRs,

I got stuck within a function of the Hmisc package. Sounds easy, hope it is:

I got 2 items (FamTyp.kurz, HGEW) of same length, no missings.

> length(FamTyp.kurz); summary(FamTyp.kurz)
[1] 14883
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  10.00   20.00   21.00   21.66   23.00   31.00 
> length(HGEW); summary(HGEW)
[1] 14883
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  104.5   409.6   489.4   549.8   623.3  3880.0 

Now I simply want to compute a table of unweighted and weighted values. But, 
...  the weights do not seem to be accepted 

 
> print("unweighted"); print(table(FamTyp.kurz))
[1] "unweighted"
FamTyp.kurz
  10   11   20   21   22   23   30   31 
1755  683 3322 1683 2428 1440 1748 1824 

> print("weighted"); print(wtd.table(FamTyp.kurz,weigths=HGEW,normwt=FALSE,na.rm=TRUE))
[1] "weighted"
Error in wtd.table(FamTyp.kurz, weigths = HGEW, normwt = FALSE, na.rm = TRUE) : 
  unused arguments (weigths = c(495.55949, 495.55949, 678.16378, 678.16378,  
.
  
  
any ideas ???  

thanx in advance,
Norbert



[R] heatmap with levelplot

2009-01-26 Thread Antje

Hi there,

I'd like to create a heatmap from my matrix with
a) a defined color range (let's say from yellow to red)
b) striking colors above and below a certain threshold (above = green, 
below = blue)


Example matrix (there should be a few outliers generated...) + simple levelplot 
without outliers marked:


library(lattice)
my.mat <- matrix(rnorm(800), nrow = 40)
threshold <- c(-1, 1) # should be used for the extreme colors
colorFun <- colorRampPalette(c("yellow", "red"))
levelplot(my.mat, col.regions = colorFun(50))


I don't know how to handle the extreme values...

Can anybody help?

Antje
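
One possible approach to the question above, offered as a sketch rather than as an answer from the thread: give levelplot() explicit break points through its `at` argument, with one wide bin below the lower threshold, one wide bin above the upper threshold, and exactly one color per bin. The names `n_mid`, `breaks` and `cols` are illustrative.

```r
library(lattice)

set.seed(1)
my.mat <- matrix(rnorm(800), nrow = 40)
threshold <- c(-1, 1)
n_mid <- 50  # number of yellow-to-red bins between the thresholds

# Outer edges that are guaranteed to enclose all data values.
lower_edge <- min(min(my.mat), threshold[1]) - 0.01
upper_edge <- max(max(my.mat), threshold[2]) + 0.01

# One bin below the lower threshold, n_mid bins in between, one bin above.
breaks <- c(lower_edge,
            seq(threshold[1], threshold[2], length.out = n_mid + 1),
            upper_edge)

# Exactly one color per bin: blue, the yellow-to-red ramp, then green.
cols <- c("blue", colorRampPalette(c("yellow", "red"))(n_mid), "green")

p <- levelplot(my.mat, at = breaks, col.regions = cols)
print(p)
```

Keeping length(cols) equal to length(breaks) - 1 matters here, because lattice otherwise picks colors from the vector by interpolation and the two extreme colors would no longer line up with the thresholds.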



Re: [R] HMISC package: wtd.table()

2009-01-26 Thread Norbert NEUWIRTH
oops, that really was easy:

(wtd.table(FamTyp.kurz,HGEW,normwt=FALSE,na.rm=TRUE))  instead of 
(wtd.table(FamTyp.kurz,weigths=HGEW,normwt=FALSE,na.rm=TRUE))

sorry for that question ...
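
The error in the original call came from the misspelled argument name `weigths`; the documented name in Hmisc is `weights`, and passing the vector by position avoids the name altogether. For comparison, a weighted frequency table can also be sketched in base R with xtabs(). The toy vectors `group` and `wt` below are made up.

```r
# Toy data: a grouping variable and one weight per observation.
group <- c(10, 10, 11, 20, 20, 20)
wt    <- c(1.5, 2.0, 0.5, 1.0, 1.0, 3.0)

# Unweighted counts.
table(group)

# Weighted counts: the sum of weights within each group.
weighted <- xtabs(wt ~ group)
weighted

# With Hmisc installed, the corresponding named call would be
# Hmisc::wtd.table(group, weights = wt, normwt = FALSE, na.rm = TRUE)
```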



 



-- 
** 
Mag. Norbert Neuwirth

Österreichisches Institut für Familienforschung (ÖIF) - Universität Wien
Austrian Institute for Family Studies - University of Vienna
 
http://www.oif.ac.at 
 
e-mail:norbert.neuwi...@oif.ac.at
tel:  +43-1-4277-489-11
fax: +43-1-4277-9-489
address:  A-1010 Wien, Grillparzerstraße 7/9



[R] Help with clustering

2009-01-26 Thread mauede
I am going to try out a tentative clustering of some feature vectors.
The ranges of values spanned by the three items making up each feature vector 
are quite different:

Item-1 goes roughly from 70 to 525 (integer numbers only)
Item-2 lies between 0 and 1 (real numbers)
Item-3 goes from 1 to 10 (integer numbers only)

In order to spread out Item-2 even further I might try to replace Item-2 with 
Log10(Item-2).

My concern is that, regardless of the distance measure used, the item with the 
highest order of magnitude may carry the highest weight when the similarity 
matrix is calculated, thereby fading out the influence of the 
items with smaller variation on the resulting clusters.
Should I normalize all feature vector elements to 1 before generating 
the similarity matrix?

Thank you so much.
Maura 









Re: [R] infer haplotypes phasing trios tdthap

2009-01-26 Thread David Clayton
tdthap wasn't intended to solve that problem, and it has been removed 
from my own web site since I no longer consider it important enough to 
support.


DC





-Original Message-
From: Tiago R Magalhães [mailto:tiag...@gmail.com] 
Sent: 22 January 2009 11:10

To: r-help@R-project.org
Subject: infer haplotypes phasing trios tdthap

Dear R mailing list,

I have a dataset with genotypes from trios and I would like to infer 
haplotypes for each mother, father and child. The package that I could 
find that can do this is tdthap.


But when the mother is homozygous (e.g., 2/2) the haplotype is called as 
not possible to infer (0); I would prefer for it to call the genotype 
(2). From what I understand it is doing what I would like for the father 
(example below).


Can anyone provide me with some information about this tdthap behaviour? 
And is there any other package that would do this? (Searched for it, 
couldn't find it)


Thank you very much,

Tiago Magalhães



example (ped file with pedigrees)
9 100 102 101 1 2 1 1 2 1 2 2 1 2
9 101 0 0 2 1 1 1 2 1 2 2 2 2
9 102 0 0 1 1 2 1 2 1 2 2 1 1


data out: hap.transmit(example)

ped   id   father   mother
  9  100      102      101

f.tr.1  f.tr.2  f.tr.3  f.tr.4
     1       0       2       1

m.tr.1  m.tr.2  m.tr.3  m.tr.4
     0       0       0       0





[R] Meaning of Inner Product (%*%) Between Slot and Vector

2009-01-26 Thread Gundala Viswanath
Dear all,

I have the following object and vector:

 print(alpha)
Slot ra:
 [1] 0.994704478 0.002647761 0.000882587 0.000882587 0.000882587 0.989459074
 [7] 0.005270463 0.002635231 0.002635231 0.994717023 0.005282977 1.0
[13] 1.0

Slot ja:
 [1] 1 5 2 3 4 2 1 3 4 3 3 4 5

Slot ia:
[1]  1  6 10 12 13 14

Slot dimension:
[1] 5 5

 print(p)
[1] 0.4 0.2 0.2 0.2 0.2

Now what I don't understand is, after performing inner product
it gives this:

 print(alpha %*% p)

  [,1]
[1,] 0.3989409
[2,] 0.2010541
[3,] 0.200
[4,] 0.200
[5,] 0.200


My questions are:
1. How does %*% work in the above example?
2. Is there a more understandable (naive) way to implement such
   product in this context?

- Gundala Viswanath
Jakarta - Indonesia
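
The slots ra, ja and ia look like a compressed sparse row (CSR) layout, as used for example by the matrix.csr class of the SparseM package. That reading is an assumption, since the message does not say how alpha was created. Under it, ra holds the nonzero values row by row, ja holds their column indices, ia[i] is the position in ra where row i starts, and %*% is the ordinary matrix-vector product. A naive version, with the hypothetical function name `csr_times_vector`, could look like this:

```r
# Values copied from the printed object above.
ra <- c(0.994704478, 0.002647761, 0.000882587, 0.000882587, 0.000882587,
        0.989459074, 0.005270463, 0.002635231, 0.002635231,
        0.994717023, 0.005282977, 1.0, 1.0)
ja <- c(1, 5, 2, 3, 4, 2, 1, 3, 4, 3, 3, 4, 5)
ia <- c(1, 6, 10, 12, 13, 14)
p  <- c(0.4, 0.2, 0.2, 0.2, 0.2)

# Naive CSR matrix-vector product: for each row, multiply its stored
# values by the matching elements of x and add them up.
csr_times_vector <- function(ra, ja, ia, x) {
  n_rows <- length(ia) - 1
  y <- numeric(n_rows)
  for (i in seq_len(n_rows)) {
    n_in_row <- ia[i + 1] - ia[i]
    idx <- ia[i] - 1 + seq_len(n_in_row)  # positions of row i in ra and ja
    y[i] <- sum(ra[idx] * x[ja[idx]])
  }
  y
}

result <- csr_times_vector(ra, ja, ia, p)
round(result, 7)
```

Row 1, for instance, uses entries 1 to 5 of ra with columns 1, 5, 2, 3, 4, giving 0.994704478 * 0.4 plus the four small values times 0.2, which is about 0.3989409 and matches the first element of the printed result.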



[R] how to modify an R built-in function?

2009-01-26 Thread diego Diego
Hello R experts!
Last week I ran into a lot of problems trying to fit an ARIMA model to a
time series. Internally, the arima function calls optim to estimate the
model parameters. So far so good, but my data cause a problem with optim's
default method, BFGS. The error output looks like this:

Error in optim(init[mask], armafn, method = "BFGS", hessian = TRUE, control
= optim.control,  :
  non-finite finite-difference value [7]

I've searched through the R forums for an answer, and the only thing that
looks like it might help is a suggestion to modify R's arima function so
that the optimization method passed to optim can be selected.
The post is available here:

http://finzi.psych.upenn.edu/R/Rhelp02a/archive/138255.html

The problem is that I'm not familiar with the procedure that the author
suggests, i.e., I don't know how to modify the R function through an R script.

Any help will be very appreciated!!!


regards!!!


Diego.
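
In current versions of R, stats::arima accepts an optim.method argument, so the optimizer can be changed without editing the function. Whether that argument existed in the R version used in this thread is not something the thread confirms, so treat the following as a sketch for a recent R installation. The built-in lh series stands in for the poster's data.

```r
# Default optimizer (BFGS).
fit_default <- arima(lh, order = c(1, 0, 0))

# Same model, but with Nelder-Mead passed through to optim().
fit_nm <- arima(lh, order = c(1, 0, 0), optim.method = "Nelder-Mead")

# The two sets of estimates should be close on well-behaved data.
coef(fit_default)
coef(fit_nm)
```

Other things that are sometimes tried for a "non-finite finite-difference value" error are method = "CSS" or rescaling the series, but whether they help depends on the data.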



Re: [R] Help with clustering

2009-01-26 Thread Christian Hennig
Generally, how to scale different variables when aggregating them in a 
dissimilarity measure depends strongly on the subject matter, on the 
aim of the clustering, and on your concept of a cluster. This cannot be answered 
properly on a mailing list such as this one.


A standard transformation before computing dissimilarities is to 
scale all variables to variance 1 by dividing them by their standard deviations. 
This gives all variables the same weight in a well-defined sense 
(which may be somewhat affected by 
outliers, heavy tails and skewness; note, however, that normalising to the same 
range shares the same problems, more severely).


Regards,
Christian
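
The standard transformation described above can be sketched in a few lines of base R. The simulated `features` matrix only mimics the three ranges given in the question; the clustering method shown (hclust) is an arbitrary choice for illustration.

```r
set.seed(1)

# Simulated stand-in for the three items of the feature vector.
features <- cbind(
  item1 = sample(70:525, 30, replace = TRUE),
  item2 = runif(30),
  item3 = sample(1:10, 30, replace = TRUE)
)

# Divide each column by its standard deviation (no centering needed
# for distances, since centering does not change differences).
scaled <- scale(features, center = FALSE, scale = apply(features, 2, sd))

# Every column now has standard deviation 1.
apply(scaled, 2, sd)

# Dissimilarities and a clustering on the rescaled data.
d <- dist(scaled)
clusters <- cutree(hclust(d), k = 3)
table(clusters)
```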




*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche



Re: [R] XML package help

2009-01-26 Thread Skewes,Aaron
Thanks! Works like a charm.

-Aaron


From: Duncan Temple Lang [dun...@wald.ucdavis.edu]
Sent: Friday, January 23, 2009 6:48 PM
To: Skewes,Aaron
Cc: r-help@r-project.org
Subject: Re: [R] XML package help

Skewes,Aaron wrote:
 Please consider this:

 <Manifest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <!-- eName   : name of the element.
   eValue   : value of the element. -->

 <OutputFilePath>./XYZ</OutputFilePath>
 <FilesList>
 <File>
 <FileTypeId>10</FileTypeId>
 <FilePath>./XYZ/</FilePath>
 <PatientCharacteristics eName="one" eValue="1"/>
 <PatientCharacteristics eName="two" eValue="2"/>
 <PatientCharacteristics eName="three" eValue="3"/>
 </File>
 </FilesList>
 </Manifest>

 I am attempting to use the XML package and xpathSApply() to extract, say, the 
 eValue attribute for eName=='one' for all File nodes that have 
 FileTypeId==10.  I tried the following, among several things:



  getNodeSet(doc,
"//File[FileTypeId/text()='10']/PatientCharacteristics[@eName='one']/@eValue")


should do it.
You need to compare the text() of the FileTypeId node.
And the / after PatientCharacteristics and before the [] will cause
trouble.


HTH,

   D.
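
A self-contained version of the suggestion above, assuming the XML package is installed. The manifest string reproduces the example from the question; selecting the nodes and then reading the attribute with xmlGetAttr() is one variation on the /@eValue form, chosen here because it returns a plain character vector.

```r
library(XML)

manifest <- '<Manifest xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <OutputFilePath>./XYZ</OutputFilePath>
  <FilesList>
    <File>
      <FileTypeId>10</FileTypeId>
      <FilePath>./XYZ/</FilePath>
      <PatientCharacteristics eName="one" eValue="1"/>
      <PatientCharacteristics eName="two" eValue="2"/>
      <PatientCharacteristics eName="three" eValue="3"/>
    </File>
  </FilesList>
</Manifest>'

# Parse the string into an internal document.
doc <- xmlParse(manifest, asText = TRUE)

# Select PatientCharacteristics nodes with eName "one" inside File nodes
# whose FileTypeId text is "10", then read their eValue attribute.
e_values <- xpathSApply(
  doc,
  "//File[FileTypeId/text()='10']/PatientCharacteristics[@eName='one']",
  xmlGetAttr, "eValue"
)
e_values
```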


 doc <- xmlInternalTreeParse(Manifest)
 Root = xmlRoot(doc)
 xpathSApply(Root, 
 "//File[FileTypeId=10]/PatientCharacteristics/[@eName='one']", xmlAttrs)

 and it does not work.

 Might somebody help me with the syntax here?

 Thanks a lot!!
 Aaron




[R] Mode (statistics) in R?

2009-01-26 Thread Jason Rupert
Hopefully this is a pretty simple question:
 
Is there a function in R that calculates the mode of a sample?   That is, I 
would like to be able to determine the value that occurs the most frequently in 
a data set. 
 
I tried the default R mode function, but it appears to provide a storage type 
or something else.  
 
I tried the RSeek and some R documentation that I downloaded, but nothing seems 
to mention calculating the mode. 
 
Thanks again.
 
 


  


[R] glm StepAIC with all interactions and update to remove a term vs. glm specifying all but a few terms and stepAIC

2009-01-26 Thread Robert Michael Inman
Problem:
I am working through a model selection process for the first time and want to make
sure that I have used glm, stepAIC, and update correctly.  Something is
strange because I get different results from: 

1) a glm of 12 predictor variables followed by a stepAIC where all
interactions are considered and then an update to remove one specific
interaction.

vs.

2) entering all the terms individually in a glm (except the one that I
removed with update and 4 others like it, which did not make it into the final
model anyway), and then running stepAIC.  

Question:
Why do these processes not yield the same model?   



Here are all the details if helpful:
I start with 12 potential predictor variables: 7 primary terms and 5
additional ones of the form I(primary_term^2).  I run a glm for these 12 and then
do stepAIC (BIC actually) in both directions.  The scope argument is
scope=list(upper=~.^2,lower=NULL).  This means there are 78 predictor terms
considered: the 12 primary terms and 66 pairwise interactions
[12 + 66 = 78 = n(n+1)/2 for n = 12].  I see this
with trace=T also.  Here is the code used:

glm1 <- glm(formula = PRESENCE == 1 ~ SNOW + I(SNOW^2) + POP_DEN + ROAD_DE
+ ADJELEV + I(ADJELEV^2) + TRI + I(TRI^2) + EDGE + I(EDGE^2) + TREECOV +
I(TREECOV^2), family = binomial, data = wolv)
summary(glm1)
library(MASS)
stepglm2 <- stepAIC(glm1, scope = list(upper = ~.^2, lower = NULL),
trace = TRUE, k = log(4828), direction = "both")
summary(stepglm2)
extractAIC(stepglm2, k = log(4828))

This results in a 15 term model with a BIC of 3758.659

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)    
(Intercept)           -4.983e+01  9.263e+00  -5.379 7.50e-08 ***
SNOW                   6.085e-02  8.641e-03   7.041 1.90e-12 ***
ROAD_DE               -5.637e-01  1.192e-01  -4.730 2.24e-06 ***
ADJELEV                2.880e-02  7.457e-03   3.863 0.000112 ***
I(ADJELEV^2)          -4.038e-06  1.487e-06  -2.715 0.006618 ** 
TRI                    5.675e-02  1.081e-02   5.248 1.54e-07 ***
I(TRI^2)              -1.713e-03  4.243e-04  -4.036 5.43e-05 ***
EDGE                   6.418e-03  1.697e-03   3.782 0.000156 ***
TREECOV                1.680e-01  2.929e-02   5.735 9.76e-09 ***
SNOW:ADJELEV          -4.313e-05  6.935e-06  -6.219 5.00e-10 ***
ADJELEV:TREECOV       -6.628e-05  1.161e-05  -5.711 1.13e-08 ***
SNOW:I(ADJELEV^2)      7.437e-09  1.384e-09   5.373 7.74e-08 ***
TRI:I(TRI^2)           1.321e-06  3.419e-07   3.863 0.000112 ***
I(ADJELEV^2):I(TRI^2) -2.127e-10  5.745e-11  -3.702 0.000214 ***
ADJELEV:I(TRI^2)       1.029e-06  3.004e-07   3.424 0.000617 ***
SNOW:TRI               1.057e-05  3.372e-06   3.135 0.001721 ** 



The final model included the TRI:I(TRI^2) term, which is effectively a
cubic function.  I removed it because cubic terms were not considered for
all variables.  I used update to remove TRI:I(TRI^2).  Code:

stepglm3 <- update(stepglm2, ~ . - TRI:I(TRI^2), trace = TRUE)
summary(stepglm3)
extractAIC(stepglm3,k=log(4828))

This results in a 14 term model with a BIC of 3770.172.  The BIC is a little
higher, which is expected because the cubic term improved the fit and is no longer in the model.

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)    
(Intercept)           -5.329e+01  9.267e+00  -5.750 8.92e-09 ***
SNOW                   6.241e-02  8.695e-03   7.178 7.06e-13 ***
ROAD_DE               -5.756e-01  1.184e-01  -4.863 1.16e-06 ***
ADJELEV                3.233e-02  7.452e-03   4.338 1.44e-05 ***
I(ADJELEV^2)          -4.724e-06  1.487e-06  -3.177 0.001489 ** 
TRI                    1.834e-02  5.402e-03   3.395 0.000687 ***
I(TRI^2)              -1.122e-03  3.920e-04  -2.863 0.004190 ** 
EDGE                   6.344e-03  1.690e-03   3.754 0.000174 ***
TREECOV                1.745e-01  2.923e-02   5.969 2.39e-09 ***
SNOW:ADJELEV          -4.444e-05  6.984e-06  -6.363 1.98e-10 ***
ADJELEV:TREECOV       -6.885e-05  1.160e-05  -5.937 2.90e-09 ***
SNOW:I(ADJELEV^2)      7.681e-09  1.395e-09   5.506 3.67e-08 ***
I(ADJELEV^2):I(TRI^2) -1.839e-10  5.692e-11  -3.232 0.001231 ** 
ADJELEV:I(TRI^2)       8.860e-07  2.974e-07   2.979 0.002892 ** 
SNOW:TRI               1.219e-05  3.260e-06   3.740 0.000184 ***

This all seems to be as it should.  I then decided to try to confirm this
result by running a glm without any of the 5 potential cubic terms (note:
TRI:I(TRI^2) was the only one that made it into the final model, but there
were 5 potential ones).  After entering the 73 potential terms (12 primary
variables plus 66 minus 5 = 61 interactions), glm followed by stepAIC
produces a completely different final model.  It has 8 variables that were
not in the model chosen with the scope statement and manual removal of
TRI:I(TRI^2), and it is missing 7 variables that were in that
model.  It has 8 variables that appear in both.  Code and
Result:

glmalt1b <- glm(formula = PRESENCE == 1 ~
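
The 73 candidate terms described above (12 main terms plus 61 pairwise interactions) can be generated rather than typed by hand. A sketch using the variable names from the posted formula; `main_terms`, `interactions` and `upper_scope` are illustrative names, and because the `wolv` data are not available here the stepAIC call is shown only as a comment. It does not explain why the two procedures disagree.

```r
primary <- c("SNOW", "POP_DEN", "ROAD_DE", "ADJELEV", "TRI", "EDGE", "TREECOV")
squared <- c("SNOW", "ADJELEV", "TRI", "EDGE", "TREECOV")

# The 12 main terms: 7 primary variables and 5 squared terms.
main_terms <- c(primary, sprintf("I(%s^2)", squared))

# All 66 unordered pairs of main terms.
term_pairs <- combn(main_terms, 2)

# A pair x : I(x^2) is effectively a cubic term; there are 5 of them.
is_cubic <- sprintf("I(%s^2)", term_pairs[1, ]) == term_pairs[2, ]

# The remaining 61 interactions, written as "a:b".
interactions <- apply(term_pairs[, !is_cubic], 2, paste, collapse = ":")

# One-sided formula holding all 73 candidate terms.
upper_scope <- reformulate(c(main_terms, interactions))
length(attr(terms(upper_scope), "term.labels"))

# Possible use with the objects from the post (not run here):
# stepAIC(glm1, scope = list(upper = upper_scope, lower = ~1),
#         k = log(4828), direction = "both")
```

Supplying the restricted set as the upper scope keeps the starting model identical to the first procedure, which may make the two searches easier to compare than starting from a model that already contains all 73 terms.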

Re: [R] how to modify an R built-in function?

2009-01-26 Thread stephen sefick
Type the name of the function in an R session and its source code will
be printed. Have fun.
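
Once the source is visible, a private copy can be changed without touching the installed function. A minimal sketch of that mechanism; `my_arima` is a made-up name, and the example assumes a recent R in which arima has an optim.method argument, which the thread itself does not confirm for the version in use.

```r
# Typing the name prints the source; args() shows just the signature.
args(stats::arima)

# Make a private copy. It keeps the original's environment, so the
# internal helpers that arima relies on are still found.
my_arima <- stats::arima

# Change a default in the copy only.
formals(my_arima)$optim.method <- "Nelder-Mead"

# The copy is used like the original; lh is a built-in example series.
fit <- my_arima(lh, order = c(1, 0, 0))
fit

# For larger changes, fix(my_arima) opens the copy in an editor.
```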





-- 
Stephen Sefick

Let's not spend our time and resources thinking about things that are
so little or so large that all they really do for us is puff us up and
make us feel like gods.  We are mammals, and have not exhausted the
annoying little problems of being mammals.

-K. Mullis



Re: [R] Mode (statistics) in R?

2009-01-26 Thread Carlos J. Gil Bellosta
Hello,

You can try ?table. 

Best regards,

Carlos J. Gil Bellosta
http://www.datanaytics.com



Re: [R] Mode (statistics) in R?

2009-01-26 Thread Mike Lawrence
Here's a rather convoluted way of finding the mode (or, at least, the
first mode):

x = round(rnorm(100,sd=5))
my_mode = as.numeric(names(table(x))[which.max(table(x))])
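
If ties matter, a small helper can return every most frequent value instead of only the first one. A sketch; the function name `all_modes` is made up, and the approach (counting matches against the unique values) keeps the type of the input.

```r
all_modes <- function(x) {
  ux <- unique(x)                   # distinct values, in order of appearance
  counts <- tabulate(match(x, ux))  # how often each distinct value occurs
  ux[counts == max(counts)]         # all values that reach the top count
}

all_modes(c(1, 2, 2, 3, 3))    # two modes: 2 and 3
all_modes(c("a", "b", "a"))    # works for character data as well
```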









-- 
Mike Lawrence
Graduate Student
Department of Psychology
Dalhousie University
www.thatmike.com

Looking to arrange a meeting? Check my public calendar:
http://www.thatmike.com/mikes-public-calendar

~ Certainty is folly... I think. ~



Re: [R] Mode (statistics) in R?

2009-01-26 Thread Jason Rupert
Thanks. 
 
I ended up breaking it up into two steps:
 
table_data <- table(data)
subset(table_data, table_data == max(table_data))
 
Thanks again.




Re: [R] how to modify an R built-in function?

2009-01-26 Thread Gabor Grothendieck
Unless there is some real reason you need an arima model perhaps you
could just try an ar model instead.  ?ar
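
For reference, a minimal call looks like this. The built-in lh series stands in for the poster's data, and order.max = 4 is an arbitrary choice for illustration.

```r
# Fit an autoregressive model; the order is chosen by AIC up to order.max.
fit_ar <- ar(lh, order.max = 4)

fit_ar$order  # selected order
fit_ar$ar     # estimated AR coefficients

# Forecast three steps ahead.
predict(fit_ar, n.ahead = 3)$pred
```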

On Mon, Jan 26, 2009 at 7:36 AM, diego Diego dhab...@gmail.com wrote:
 Hello R experts!
  Last week I ran into a lot of problems trying to fit an ARIMA model to a
 time series. The problem is that the arima function internally calls
 optim to estimate the model parameters. So far so good...
 but my data presents a problem with the default BFGS method of the optim
 function. The error output looks like this:

 Error in optim(init[mask], armafn, method = "BFGS", hessian = TRUE, control
 = optim.control,  :
  non-finite finite-difference value [7]

 I've searched through the R forums for an answer, and the only thing that
 looks like it might help is a suggestion to modify the R arima function in a
 way that allows selecting the optimization method for the optim function.
 The post is available here:

 http://finzi.psych.upenn.edu/R/Rhelp02a/archive/138255.html

 The problem is that I'm not familiar with the procedure that the author
 suggests, i.e., I don't know how to modify the R function through an R script.

 Any help will be very appreciated!!!


 regards!!!


 Diego.



Re: [R] Mode (statistics) in R?

2009-01-26 Thread Marc Schwartz
on 01/26/2009 07:28 AM Jason Rupert wrote:
 Hopefully this is a pretty simple question:

 Is there a function in R that calculates the mode of a sample? That is, I
 would like to be able to determine the value that occurs the most frequently
 in a data set.

 I tried the default R mode function, but it appears to provide a storage
 type or something else.

 I tried the RSeek and some R documentation that I downloaded, but nothing
 seems to mention calculating the mode.

 Thanks again.

It depends upon the type of data you are dealing with.

If it is discrete, you can use table() to calculate frequencies and then
take the max:

set.seed(1)

tl <- table(sample(letters, 100, replace = TRUE))

> tl

a b c d e f g h i j k l m n o p q r s t u v w x y z
2 3 3 3 2 4 6 1 6 5 6 4 7 2 2 2 5 4 5 3 8 4 5 4 3 1

> tl[which.max(tl)]
u
8


Alternatively, if the data is continuous, then you will need to look at
some form of density estimation. There have been various discussions
over the years on how to go about doing this, but a simplistic approach
would be:

  set.seed(1)

  x <- rnorm(100)

  dx <- density(x)

  > dx$x[which.max(dx$y)]
  [1] 0.3294585


  # Review plot
  plot(dx)
  abline(v = dx$x[which.max(dx$y)])


See ?table, ?which.max and ?density
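
One caveat with the discrete approach: which.max() returns only the first
maximum, so tied modes are silently dropped. A small sketch of a helper that
keeps all ties (the name stat_mode is made up for illustration and is not a
base R function):

```r
# Return every value that attains the highest frequency, as character
# labels taken from the names of the frequency table.
stat_mode <- function(x) {
  counts <- table(x)
  names(counts)[counts == max(counts)]
}

x <- c("a", "b", "b", "c", "c")
stat_mode(x)   # "b" and "c" are tied, so both are returned
```

The result is a character vector because table() stores the values as names;
wrap it in as.numeric() when the data are numeric.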

HTH,

Marc Schwartz



Re: [R] text vector clustering

2009-01-26 Thread San Miguel Martín , Eduardo
Dear srinivas,
 
You can try using trigrams, a special case of N-grams, often used in Natural 
Language Processing.
 
 I am interested in grouping/clustering these names by how similar they are
letter to letter.  Is there any text clustering algorithm in R
which can group names of similar type into segments of exactly matching,
90% matching, 80% matching, etc.?
 
As an example:
 
# suppose we have a list of locations
# (here it is a matrix; the second column is only used to create the sample
# and is not otherwise relevant)
 
# locations with errors
Poblacion_dist = matrix(
 c("MADRIZ", 0.3,
   "BARÇELONA", 0.25,
   "BILAO", 0.135,
   "SEVILA", 0.1,
   "VALENÇIA", 0.1,
   "CORUNA", 0.025,
   "ALACANTE", 0.025,
   "VALLADOLI", 0.025,
   "SANTIAGO", 0.01,
   "SAN SEBASTIAN", 0.01,
   "CADIZ", 0.01,
   "ZARAGOZA", 0.01), 
 ncol = 2, byrow = TRUE)
 
# True locations
Poblacion = matrix(
 c("MADRID", 0.3,
   "BARCELONA", 0.25,
   "BILBAO", 0.135,
   "SEVILLA", 0.1,
   "VALENCIA", 0.1,
   "CORUÑA", 0.025,
   "ALICANTE", 0.025,
   "VALLADOLID", 0.025,
   "SANTIAGO", 0.01,
   "SAN_SEBASTIAN", 0.01,
   "CADIZ", 0.01,
   "ZARAGOZA", 0.01), 
 ncol = 2, byrow = TRUE) 
 
muestrear = function(que, cuantas_veces){
   sample(que[,1], prob = as.numeric(que[,2]), cuantas_veces)
   }
 
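# draw 10 pairs: one true location and one misspelled location per pair
Provincias = replicate(10, c(muestrear(Poblacion, 1),
                             muestrear(Poblacion_dist, 1)))
 
# flatten the 2 x 10 matrix: now we have a vector with 20 locations
Provincias = Provincias[1:length(Provincias)]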
 
# next we need to process each location as a set of trigrams
word2trigram = function(word){
   # start and end positions of every 3-character window in the word
   trigramatrix = matrix(c(seq(1, nchar(word)-2), seq(1, nchar(word)-2)+2),
                         ncol = 2, byrow = FALSE)
   trigram = c()
   for (i in 1:nrow(trigramatrix)) {
      trigram = append(trigram, substr(word, trigramatrix[i,1], trigramatrix[i,2]))
   }
   return(trigram)
}
Prov2trigram = lapply(Provincias, word2trigram)
 
# every trigram in the sample
Trigrams = levels(factor(unlist(Prov2trigram)))
 
# count how many times each trigram appears in each location
ocrrnc.mtrx = matrix(rep(0, length(Trigrams) * length(Prov2trigram)),
                     ncol = length(Prov2trigram))
for (i in 1:ncol(ocrrnc.mtrx)) {
  ocrrnc.mtrx[,i] = as.integer(table(append(Prov2trigram[[i]], Trigrams))-1)
}
 
# calculate the cosine similarity (often used in NLP)
matrizCos = function(X){
  X = t(X)                       # one row per location
  nterm = nrow(X)
  modulo = c()                   # vector norms
  cosen = matrix(rep(0, nterm*nterm), ncol = nterm)
  for (i in 1:nterm){
    Vec = X[i,]
    modulo[i] = sqrt(Vec %*% Vec)
    cosen[,i] = (X %*% Vec)      # dot products with location i
  }
  # divide each dot product by both norms
  cosen = (cosen/modulo)/matrix(rep(modulo, nterm), ncol = nterm, byrow = TRUE)
  cosen[is.nan(cosen)] <- 0
  return(cosen)
}
rslt.dst.mat = matrizCos(ocrrnc.mtrx)
 
# and get the clusters
attr(rslt.dst.mat, "dimnames") <- list(Provincias, Provincias)
plot(hclust(as.dist(1 - rslt.dst.mat), method = 'median'))
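
For checking a single pair of names by hand, the same idea can be sketched
more compactly. The helper names word_trigrams and trigram_cosine below are
made up for this illustration; they are not part of the code above or of any
package.

```r
# All overlapping 3-character substrings of a word (empty if too short).
word_trigrams <- function(word) {
  n <- nchar(word)
  if (n < 3) return(character(0))
  substring(word, 1:(n - 2), 3:n)
}

# Cosine similarity between the trigram count vectors of two words.
trigram_cosine <- function(a, b) {
  ta <- table(word_trigrams(a))
  tb <- table(word_trigrams(b))
  shared <- intersect(names(ta), names(tb))
  dot <- sum(as.numeric(ta[shared]) * as.numeric(tb[shared]))
  denom <- sqrt(sum(ta^2)) * sqrt(sum(tb^2))
  if (denom == 0) 0 else dot / denom
}

word_trigrams("MADRID")             # "MAD" "ADR" "DRI" "RID"
trigram_cosine("MADRID", "MADRIZ")  # 3 shared trigrams out of 4: 0.75
```

A similarity of 1 means identical trigram profiles, so a cutoff such as 0.8
or 0.9 gives a rough reading of the "80% matching" and "90% matching" groups
asked about.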
 
I hope this helps,
Eduardo San Miguel Martin



Re: [R] heatmap with levelplot

2009-01-26 Thread Antje
I played a little bit around and got the following solution which works for 
now, though it seems to be too complicated to me.

If anybody else know another solution - please let me know!!!


library(lattice)
my.mat <- matrix(rnorm(800), nrow = 40)

colorFun <- colorRampPalette(c("yellow","red"))

b <- boxplot(my.mat, plot = FALSE)
thr <- c(b$stats[1],b$stats[5])
col.bins <- 100
step <- abs(thr[2] - thr[1])/50

limit <- ifelse(min(my.mat) < thr[1] - step, min(my.mat) - step, min(my.mat))
lp <- rev(seq(thr[1] - step, limit - step, -step))
mp <- seq(thr[1], thr[2], step)
limit <- ifelse(max(my.mat) > thr[2] + step, max(my.mat) + step, max(my.mat))
up <- seq(thr[2] + step, limit + step, step)

my.at <- c(lp,mp,up)

my.col.regions <- c(rep("green", length(lp)), colorFun(length(mp)), rep("blue", 
length(up)) )


levelplot(my.mat, at = my.at, col.regions = my.col.regions)
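
A possibly simpler variant, sketched under the assumption that fixed
thresholds (here -1 and 1, as in the original question) are acceptable
instead of boxplot whiskers: use one wide bin below and one above the
thresholds, so each extreme colour is needed only once. The names n.bins and
inner are made up for this sketch.

```r
library(lattice)

set.seed(42)
my.mat <- matrix(rnorm(800), nrow = 40)
threshold <- c(-1, 1)
colorFun <- colorRampPalette(c("yellow", "red"))

# 50 equal bins between the thresholds, plus one catch-all bin on each side.
n.bins <- 50
inner <- seq(threshold[1], threshold[2], length.out = n.bins + 1)
my.at <- c(min(my.mat, threshold[1]) - 1, inner,
           max(my.mat, threshold[2]) + 1)

# One colour per bin: blue below, yellow-to-red in between, green above.
my.col.regions <- c("blue", colorFun(n.bins), "green")

p <- levelplot(my.mat, at = my.at, col.regions = my.col.regions)
print(p)
```

The colour key will show the two outer bins as wider than the inner ones,
since the break points are no longer evenly spaced.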






Antje wrote:

Hi there,

I'd like to create a heatmap from my matrix with
a) a defined color range (lets say from yellow to red)
b) using striking colors above and below a certain threshold (above = 
green, below = blue)


Example matrix (there should be a few outliers generated...) + simple 
levelplot without outliers marked:


library(lattice)
my.mat <- matrix(rnorm(800), nrow = 40)
threshold <- c(-1,1) # should be used for the extreme colors
colorFun <- colorRampPalette(c("yellow","red"))
levelplot(my.mat, col.regions = colorFun(50))


I don't know how to handle the extreme values...

Can anybody help?

Antje



Re: [R] Mode (statistics) in R?

2009-01-26 Thread patricia garcía gonzález

Hello, 
 
I think this will work:
 
names( sort( -table( x ) ) )[1]
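
A short illustration with made-up data. Note that the result is a character
string, because it is taken from the names of the table, so numeric data
need converting back:

```r
x <- c(3, 1, 2, 3, 2, 3)

# Negating the counts makes the most frequent value sort first.
mode_chr <- names(sort(-table(x)))[1]

# Equivalent with an explicit decreasing sort, converted back to numeric.
mode_num <- as.numeric(names(sort(table(x), decreasing = TRUE))[1])

mode_chr   # "3"
mode_num   # 3
```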
 
Regards
 
Patricia García
 
 From: c...@datanalytics.com
 To: jasonkrup...@yahoo.com
 Date: Mon, 26 Jan 2009 18:34:00 +0500
 CC: r-help@r-project.org
 Subject: Re: [R] Mode (statistics) in R?

 Hello,

 You can try ?table.

 Best regards,

 Carlos J. Gil Bellosta
 http://www.datanaytics.com

 On Mon, 2009-01-26 at 05:28 -0800, Jason Rupert wrote:
  Hopefully this is a pretty simple question:
  Is there a function in R that calculates the mode of a sample? That is, I
  would like to be able to determine the value that occurs the most
  frequently in a data set.
  I tried the default R mode function, but it appears to provide a storage
  type or something else.
  I tried the RSeek and some R documentation that I downloaded, but nothing
  seems to mention calculating the mode.
  Thanks again.



Re: [R] Plotting graph for Missing values

2009-01-26 Thread bartjoosen

 I added patientinformation1 variable and then I gave the command for
 tapply but its giving me the following error:

 Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
   arguments must have same length



It seems you added patientinformation1, but still use pat1 in the tapply
call.

Bart
-- 
View this message in context: 
http://www.nabble.com/Plotting-graph-for-Missing-values-tp21659322p21666790.html
Sent from the R help mailing list archive at Nabble.com.



[R] Getting data from a PDF-file into R

2009-01-26 Thread joe1985

Hello

I have around 200 PDF documents containing data I want organized in R as a
data frame. The PDF documents look like this:

  http://www.nabble.com/file/p21667074/PRRS-billede%2Bmed%2Bfarver.jpeg 

or like this;

http://www.nabble.com/file/p21667074/PRRS-billede%2Bmed%2Bfarver%2B2.jpeg 

So I want to pull out the data in the coloured boxes so that it becomes
organized like this (just in R instead of Excel):


http://www.nabble.com/file/p21667074/PRRS-billede%2Bexcel.jpeg 

So the 0s and 1s work as follows: when "PRRS-neg" occurs, it is represented
by a 0 in the columns PRRS-VAC and PRRS-DK on that particular date. Likewise,
"PRRS-pos VAC" or "Vac" is represented by a 1 in the column PRRS-VAC, and
"PRRS-pos DK" or "DK" by a 1 in the column PRRS-DK. Also, with "sanVAC" there
should be a 1 in the column VACsan, and with "sanDK" there should be a 1 in
the column DKsan. The first date for each CHR-nr should be either the
earliest date in the red box (as in the first picture), or the date with the
word "før" before it (as in the second picture). All 200 PDF documents look
like the ones in the pictures, each representing a different CHR-nr.


Hope you can help me
-- 
View this message in context: 
http://www.nabble.com/Getting-data-from-a-PDF-file-into-R-tp21667074p21667074.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-26 Thread Tony Breyal
Hi, I ran your getURL example and had the same problem with
downloading the file.

## R Start..
> library(RCurl)
> toString(getURL("http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2"))
[1] ""
## R end.

However, it is interesting that if you manually save the page to
your desktop, getURL works fine on it:

## R Start..
> library(RCurl)
> toString(getURL('file:PFO-SBS001//Redirected//tonyb//Desktop//webpage.html'))
[1] "\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n<!DOCTYPE HTML PUBLIC \"-//W3C//DTD
HTML 4.01 Transitional//EN\" \"http://www.w3.org/TR/html4/loose.dtd\">
\n<html>\n<head>\n\
[etc...]
## R end.


Very strange indeed. I use RCurl for web crawling every now and again,
so I would be interested in knowing why this happens too :-)

Tony Breyal



On 26 Jan, 13:58, clair.crossup...@googlemail.com
clair.crossup...@googlemail.com wrote:
 Dear R-help,

 There seems to be a web page I am unable to download using RCurl. I
 don't understand why it won't download:

  > library(RCurl)
  > my.url <- "http://www.nytimes.com/2009/01/07/technology/business-computing/07pro..."
  > getURL(my.url)

 [1] ""

 Other web pages are ok to download but this is the first time I have
 been unable to download a web page using the very nice RCurl package.
 While i can download the webpage using the RDCOMClient, i would like
 to understand why it doesn't work as above please?

  > library(RDCOMClient)
  > my.url <- "http://www.nytimes.com/2009/01/07/technology/business-computing/07pro..."
  > ie <- COMCreate("InternetExplorer.Application")
  > txt <- list()
  > ie$Navigate(my.url)
 NULL
  > while(ie[["Busy"]]) Sys.sleep(1)
  > txt[[my.url]] <- ie[["document"]][["body"]][["innerText"]]
  > txt

 $`http://www.nytimes.com/2009/01/07/technology/business-computing/
 07program.html?_r=2`
 [1] "Skip to article Try Electronic Edition Log ..."

 Many thanks for your time,
 C.C

 Windows Vista, running with administrator privileges. sessionInfo()

 R version 2.8.1 (2008-12-22)
 i386-pc-mingw32

 locale:
 LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.
 1252;LC_MONETARY=English_United Kingdom.
 1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

 attached base packages:
 [1] stats     graphics  grDevices utils     datasets  methods
 base

 other attached packages:
 [1] RDCOMClient_0.92-0 RCurl_0.94-0

 loaded via a namespace (and not attached):
 [1] tools_2.8.1



Re: [R] Plotting graph for Missing values

2009-01-26 Thread Petr PIKAL
Hi Jim

r-help-boun...@r-project.org wrote on 26.01.2009 15:44:32:

 From your original posting:
 
  I tried the code which you provided.
  In place of dos in command pat1 <- rbinom(length(dos), 1, .5)  # generate
  some data
  I added patientinformation1 variable and then I gave the command for
  tapply but it's giving me the following error:
 
  Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
    arguments must have same length
 
 I would say that pat1 and dos were not of the same length.  Check
 your code and objects to verify this; that is what the error message
 is saying.  You said you added the patientinformation1 variable, but
 it does not seem to appear in the error message.

You are really patient. I presume Shreyasee does not know much about data 
structures and function use in R. It would probably help a lot if s/he 
looked into some basic documents like R intro.

If I understand correctly, what was done is

pat1 <- rbinom(length(patientinformation1), 1, .5)

which does not make much sense, as it just creates artificial data as well, 
and most probably there is a dos version in memory which was constructed 
during testing of your code and which has length 335. This could result in 
the mentioned error

  Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
    arguments must have same length

Then note

  ds <- read.csv(file="D:/Shreyasee laptop data/ASC Dataset/Subset of the ASC Dataset.csv", header=TRUE)
  attach(ds)
  str(dos)


If str(ds) is issued, it could reveal what kind of data s/he has. 
Also, format(dos, ...) would not work, as dos is a factor, not a Date.

  str(dos)
 
  I am getting the following message:
 
   Factor w/ 12 levels "-00-00","6-Aug",..: 6 6 6 6 6 6 6 6 6 6 ...

If it was

> aggregate(ds[,-1], list(format(ds$dos, "%Y%m")), function(x) sum(x==0))
   Group.1 pat1 pat2
1   200605   12   16
2   200606   20   18
3   200607   12   13
4   200608   18   15
5   200609   18   11
6   200610   17   15
7   200611   19   17
8   200612   14   15
9   200701   14   18
10  200702   13   13
11  200703   16   19

could do the trick if the patientinformation variables had the same structure 
as you anticipate, which is not true

  for(i in 1:length(dos))
  for(j in 1:length(patientinformation1))
  if(dos[i]=="May-06" && patientinformation1[j]=="")
  a <- j+1

Well, if Shreyasee manages to redefine dos to Date mode (which will not be 
straightforward if dos has an awkward structure), then something like

aggregate(ds[,-1], list(format(ds$dos, "%Y%m")), function(x) sum(x==""))

could do the trick.
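
As a rough sketch of that last step, assuming (purely for illustration) that
dos holds cleanly formatted "yyyy-mm-dd" strings read in as a factor and that
missing entries are empty strings; the real file evidently uses other labels
such as "6-Aug", so the format string would need adapting:

```r
# Hypothetical stand-in for the real data set.
ds <- data.frame(
  dos = factor(c("2006-05-03", "2006-05-17", "2006-06-02", "2006-06-20")),
  patientinformation1 = c("", "a", "", ""),
  patientinformation2 = c("b", "", "c", ""),
  stringsAsFactors = FALSE
)

# A factor must go through as.character() before conversion to Date.
ds$dos <- as.Date(as.character(ds$dos), format = "%Y-%m-%d")

# Count empty entries per month for every patientinformation column.
missing_by_month <- aggregate(ds[, -1],
                              list(month = format(ds$dos, "%Y%m")),
                              function(x) sum(x == ""))
missing_by_month
```

Dividing the counts by the number of rows per month would give the
percentages asked for in the original question.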

Regards
Petr

 
 On Sun, Jan 25, 2009 at 11:48 PM, Shreyasee 
shreyasee.prad...@gmail.com wrote:
  Hi Jim,
 
  I run the following code
 
  ds <- read.csv(file="D:/Shreyasee laptop data/ASC Dataset/Subset of the ASC Dataset.csv", header=TRUE)
  attach(ds)
  str(dos)
 
  I am getting the following message:
 
   Factor w/ 12 levels "-00-00","6-Aug",..: 6 6 6 6 6 6 6 6 6 6 ...
 
  Thanks,
  Shreyasee
 
 
 
  On Mon, Jan 26, 2009 at 12:20 PM, jim holtman jholt...@gmail.com 
wrote:
 
  do:
 
  str(dos)
  str(patientinformation1)
 
  They must be the same length for the command to work: must be a one 
to
  one match of the data.
 
  On Sun, Jan 25, 2009 at 10:23 PM, Shreyasee 
shreyasee.prad...@gmail.com
  wrote:
   Hi Jim,
  
   I tried the code which you provided.
   In place of dos in command pat1 <- rbinom(length(dos), 1, .5)  # generate
   some data
   I added patientinformation1 variable and then I gave the command for
   tapply but it's giving me the following error:
  
   Error in tapply(pat1, format(dos, "%Y%m"), function(x) sum(x == 0)) :
     arguments must have same length
  
  
   Thanks,
   Shreyasee
  
  
  
   On Mon, Jan 26, 2009 at 10:50 AM, jim holtman jholt...@gmail.com
   wrote:
  
   YOu can save the output of the tapply and then replicate it for 
each
   of the variables.  The data can be used to plot the graphs.
  
   On Sun, Jan 25, 2009 at 9:38 PM, Shreyasee
   shreyasee.prad...@gmail.com
   wrote:
Hi Jim,
   
I need to calculate the missing values in variable
patientinformation1
for
the period of May 2006 to March 2007 and then plot the graph of 
the
percentage of the missing values over these months.
This has to be done for each variable.
The code which you have provided, calculates the missing values 
for
the
months variable, am I right?
I need to calculate for all the variables for each month.
   
Thanks,
Shreyasee
   
   
On Mon, Jan 26, 2009 at 10:29 AM, jim holtman 
jholt...@gmail.com
wrote:
   
Here is an example of how you might approach it:
   
 > dos <- seq(as.Date('2006-05-01'), as.Date('2007-03-31'), by='1 day')
 > pat1 <- rbinom(length(dos), 1, .5)  # generate some data
 > # partition by month and then list out the number of zero values (missing)
 > tapply(pat1, format(dos, "%Y%m"), function(x) sum(x==0))
200605 200606 200607 200608 200609 200610 200611 200612 200701 200702

[R] Help with sas.get

2009-01-26 Thread Sebastien Bihorel

Dear R-users,

I am seeking advice on the sas.get function from the Hmisc package. I 
have tried to import some of our SAS files using the syntax presented in 
the function's help example, but the import always failed. The 
function does not seem to recognize our SAS files and complains about 
the lack of format library files (I am not SAS proficient, but I guess 
that is what formats.sas7bcat is, isn't it?).
Currently, my working directory contains several .sas7bdat files but no 
.sas7bcat files. Is the existence of format files assumed by the function?


I would really appreciate if a user experienced with this function could 
provide some guidance.


Thank you

Working environment: R 2.8.1 is installed on linux machines with the 
most recent version of the Hmisc package; SAS 9 runs on a Solaris based 
system.



### Code
mypath <- "/home/sbihorel/my_documents/Testing_env/SAS_dataset_R_import"
mydf <- sas.get(library=mypath,member="test")


### Error message
Error in sas.get(library = mypath, member = "test") :
  SAS output files not found
In addition: Warning message:
In sas.get(library = mypath, member = "test") :

/home/sbihorel/my_documents/Testing_env/SAS_dataset_R_import/formats.sc? 
or formats.sas7bcat  not found. Formatting ignored.


Execution halted
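
One thing that may be worth checking first: as far as I understand it,
sas.get works by running SAS itself, so a SAS executable has to be reachable
from the machine where R runs (here R is on Linux and SAS on Solaris). The
helper below is a hypothetical diagnostic written for this message, not part
of Hmisc; the command name "sas" is an assumption.

```r
# Report whether the pieces sas.get would need can be found locally.
check_sas_get_inputs <- function(library, member, sas_cmd = "sas") {
  list(
    dataset_found = file.exists(file.path(library, paste0(member, ".sas7bdat"))),
    formats_found = file.exists(file.path(library, "formats.sas7bcat")),
    sas_on_path   = unname(nzchar(Sys.which(sas_cmd)))
  )
}

# Example with a temporary directory standing in for the real library path.
demo_dir <- file.path(tempdir(), "sas_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, "test.sas7bdat"))
str(check_sas_get_inputs(demo_dir, "test"))
```

If sas_on_path comes back FALSE, the missing format catalog is probably only
a warning, and the absence of a local SAS would be the more likely reason for
"SAS output files not found".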

--
Sebastien Bihorel



Re: [R] Getting data from a PDF-file into R

2009-01-26 Thread Peter Dalgaard
joe1985 wrote:
 Hello
 
 I have around 200 PDF-documents, containing data i want organized in R as a
 dataframe. The PDF-documents look like this;
 
   http://www.nabble.com/file/p21667074/PRRS-billede%2Bmed%2Bfarver.jpeg 
 
 or like this;
 
 http://www.nabble.com/file/p21667074/PRRS-billede%2Bmed%2Bfarver%2B2.jpeg 
 
 So i want to pull out the data in coloured boxes it become organized like
 this (just in R instead of excel);
 
 
 http://www.nabble.com/file/p21667074/PRRS-billede%2Bexcel.jpeg 
 
 So the 0'es and 1'es represent when either PRRS-neg occurs presented by a
 0 in the colums PRRS-VAC and PRRS-DK on a particular date. And the same with
 PRRS-pos VAC or Vac presented by a 1 in the colum PRRS-VAC, and
 PRRS-pos DK  or DK presented by a 1 in the colum PRRS-DK. And also with
 sanVAC there should be a 1 in the colum VACsan, and with sanDK there
 should be a 1 in the colum DKsan. The first date for each CHR-nr should
 either be the earliest date ne the red box (as in the first picture), or the
 date with word før before the date (as in the second picture). All the 200
 PDF-documents looks like the ones in the pictures, each reprenting a
 different CHR-nr
 
 
 Hope you can help me

Not on the basis of .jpeg files, I think. We'd need some indication of
what the PDF looks like inside.  There's a tool called pdftotext, which
might do something for you, IF you can figure out reliably where your
data begin and end.
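
A sketch of how pdftotext could be driven from R, assuming the program is
installed and on the PATH. The function names pdftotext_args and pdf_to_lines
are made up for this example; -layout is the pdftotext option that tries to
preserve the physical layout of the page.

```r
# Build the command-line arguments for pdftotext.
pdftotext_args <- function(pdf,
                           txt = sub("\\.pdf$", ".txt", pdf, ignore.case = TRUE)) {
  c("-layout", shQuote(pdf), shQuote(txt))
}

# Convert one PDF into a character vector of text lines.
pdf_to_lines <- function(pdf) {
  txt <- tempfile(fileext = ".txt")
  status <- system2("pdftotext", pdftotext_args(pdf, txt))
  if (status != 0) stop("pdftotext failed for ", pdf)
  readLines(txt, warn = FALSE)
}

pdftotext_args("report.pdf")
```

With 200 files, something like lapply(list.files(pattern = "\\.pdf$"),
pdf_to_lines) would give one vector of lines per document to inspect.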

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



Re: [R] Getting data from a PDF-file into R

2009-01-26 Thread hadley wickham
On Mon, Jan 26, 2009 at 9:40 AM, Peter Dalgaard
p.dalga...@biostat.ku.dk wrote:
 Not on the basis of .jpeg files, I think. We'd need some indication of
 what the PDF looks like inside.  There's a tool called pdftotext, which
 might do something for you, IF you can figure out reliably where your
 data begin and end.

An alternative is to outsource the problem.  You can get very
reasonable data entry quotes from sites like http://www.elance.com/,
and depending on how much you value your time this might end up being
a much cheaper option than figuring out how to do it programmatically
(but not as intellectually satisfying).

Hadley

-- 
http://had.co.nz/



[R] PCALG Package

2009-01-26 Thread Tibert, Brock
Hi all,

Can anyone help me set up this package so I can use it?  I am getting errors 
with the Rgraphviz package and have tried a number of ways to get this to work. 
 Any help will be greatly appreciated!  I am fairly new to R but have been 
actively trying to get into using it as my main analysis software.

Thanks,

Brock




Re: [R] list.files changed in 2.7.0

2009-01-26 Thread davidr
Hmm. I get exactly the same files and directories with C: and C:/,
except for the double slashes now.
Previously the two calls to list.files gave exactly the same results.
My current directory (getwd()) is not C:. I'm puzzled by your output.

-- David

-----Original Message-----
From: henrik.bengts...@gmail.com [mailto:henrik.bengts...@gmail.com] On
Behalf Of Henrik Bengtsson
Sent: Friday, January 23, 2009 8:36 PM
To: David Reiner dav...@rhotrading.com
Cc: r-help@r-project.org
Subject: Re: [R] list.files changed in 2.7.0

And I'm not sure that list.files("C:", full.names=TRUE) returns
correct pathnames, because it lists the files in the current directory
(of C:), not the root of C:. There is a difference between C: and C:/,
and you should get:

list.files("C:", full.names=TRUE)
[1] "C:aFile.txt"
[2] "C:anotherFile.txt"

list.files("C:/", full.names=TRUE)
[1] "C:/Documents and Settings"
[2] "C:/Program Files"

Now we get:

list.files("C:", full.names=TRUE)
[1] "C:/aFile.txt"
[2] "C:/anotherFile.txt"

list.files("C:/", full.names=TRUE)
[1] "C://Documents and Settings"
[2] "C://Program Files"

This causes

pathnames <- list.files("C:", full.names=TRUE);
file.exists(pathnames);

to return all FALSE (not expected), whereas,

pathnames <- list.files("C:");
file.exists(pathnames);

returns all TRUE (expected).

So, that extra slash seems to be the cause.
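
Until the behaviour is settled, one possible workaround for the grep use case
is to normalise the returned names. A sketch follows; the helper name
collapse_slashes is made up, and it only tidies strings, so it does not
address the C: versus C:/ difference.

```r
# Collapse runs of "/" that follow another character into a single "/".
# A leading "//" (as in a UNC path) is left alone.
collapse_slashes <- function(paths) {
  gsub("([^/])/+", "\\1/", paths)
}

collapse_slashes(c("C://Program Files", "C:/aFile.txt", "//server//share"))
```

Applied as collapse_slashes(list.files(dir, full.names = TRUE)), the result
can be matched against names built with a single slash.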

My $.02

/Henrik

On Fri, Jan 23, 2009 at 2:42 PM,  dav...@rhotrading.com wrote:
 I just noticed a change in the behavior of list.files from 2.6.1pat to
 2.7.0
 (I noticed it in 2.8.1 and traced back.)

 Previously, if the directory ended with a slash, and full.names=TRUE,
 the names
 returned had a single slash at the end of the directory,
 but now there are two. I noticed since I was getting a list of certain
 files and
 then grepping in the list for a full name formed with a single slash.
 (The double slash would be OK if I were opening the file since the OS
 treats double
 slash in a path the same as a single slash.)

 I searched through the release notes, etc., and couldn't find this
 announced.

 Try
 list.files("C:", full.names=TRUE)
 list.files("C:/", full.names=TRUE)

 Is there any chance that this could be put back to the single slash
 behavior?

 (This was on Windows XP.)

 Thanks,
 David L. Reiner



Re: [R] Getting data from a PDF-file into R

2009-01-26 Thread Henrique Dallazuanna
You can convert the PDF to text, then manipulate the output to read only the
data.

Linux has the pdftotext command; on other systems you can download the xpdf
zip, which contains that command.
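
A sketch of the "manipulate the output" step. The text layout assumed here (a
dd-mm-yyyy date followed by a status label on the same line) is a guess based
on the description in the question, not on the actual PDFs, so the pattern
will need adjusting once the real pdftotext output has been inspected.

```r
# Hypothetical lines as they might come out of pdftotext.
lines <- c("CHR-nr 12345",
           "12-03-2007 PRRS-neg",
           "05-11-2007 PRRS-pos VAC",
           "some footer text")

# Keep only lines that start with a date, capturing date and status.
pattern <- paste0("^[[:space:]]*([0-9]{2}-[0-9]{2}-[0-9]{4})[[:space:]]+",
                  "(.*[^[:space:]])[[:space:]]*$")
hits <- regmatches(lines, regexec(pattern, lines))
hits <- hits[lengths(hits) == 3]

records <- data.frame(
  date   = as.Date(vapply(hits, `[`, character(1), 2), format = "%d-%m-%Y"),
  status = vapply(hits, `[`, character(1), 3),
  stringsAsFactors = FALSE
)

# Example indicator column: 1 when the status says "pos VAC", otherwise 0.
records$PRRS_VAC <- as.integer(grepl("pos VAC", records$status, fixed = TRUE))
records
```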

Best






-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



[R] RExcel foreground and background server

2009-01-26 Thread Irina Ursachi
Dear all,

I have a question regarding background and foreground server in RExcel:
Can somebody explain the main difference between them? As far as I
understood from the RExcel webpage, for both of them one  needs rights
for access to Windows registries. The only difference that I can see so
far is that for the installation of the background server, one needs the
R(D)COM package, whereas for the foreground server installation, the
rcom package is required.

Furthermore, does anyone know how Excel processes and R processes
communicate? Is it possible for more than one Excel process to
communicate with more than one R process?

Thank you in advance!

Best regards,
Irina Ursachi.



Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-26 Thread Duncan Temple Lang



clair.crossup...@googlemail.com wrote:

Dear R-help,

There seems to be a web page I am unable to download using RCurl. I
don't understand why it won't download:


library(RCurl)
my.url <- "http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2"
getURL(my.url)

[1] ""




 I like the irony that RCurl seems to have difficulties downloading an
article about R.  Good thing it is just a matter of additional arguments
to getURL() or it would be bad news.


The followlocation parameter defaults to FALSE, so

  getURL(my.url, followlocation = TRUE)

gets what you want.

The way I found this is to run

 getURL(my.url, verbose = TRUE)

and take a look at the information being sent from R
and received by R from the server.

This gives

* About to connect() to www.nytimes.com port 80 (#0)
*   Trying 199.239.136.200... * connected
* Connected to www.nytimes.com (199.239.136.200) port 80 (#0)
> GET /2009/01/07/technology/business-computing/07program.html?_r=2 HTTP/1.1
> Host: www.nytimes.com
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Server: Sun-ONE-Web-Server/6.1
< Date: Mon, 26 Jan 2009 16:10:51 GMT
< Content-length: 0
< Content-type: text/html
< Location: http://www.nytimes.com/glogin?URI=http://www.nytimes.com/2009/01/07/technology/business-computing/07program.htmlOQ=_rQ3D3op=42fceb38q2fq5duarq5d3-z8q26--q24jq5djccq7bq5dcmq5dc1q5dq24...@-f-q2anq5dry8h@a88q3dz-dbyq...@q2aq5dc1bq26-q2aq26q5bddfq24df



And the 301 is the critical thing here.

 D.



Other web pages are ok to download but this is the first time I have
been unable to download a web page using the very nice RCurl package.
While I can download the web page using RDCOMClient, I would like
to understand why it doesn't work as above, please.





library(RDCOMClient)
my.url <- "http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2"
ie <- COMCreate("InternetExplorer.Application")
txt <- list()
ie$Navigate(my.url)

NULL

while(ie[["Busy"]]) Sys.sleep(1)
txt[[my.url]] <- ie[["document"]][["body"]][["innerText"]]
txt

$`http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2`
[1] "Skip to article Try Electronic Edition Log ..."


Many thanks for your time,
C.C

Windows Vista, running with administrator privileges.

sessionInfo()

R version 2.8.1 (2008-12-22)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] RDCOMClient_0.92-0 RCurl_0.94-0

loaded via a namespace (and not attached):
[1] tools_2.8.1



[R] ANOVA with subsampling question

2009-01-26 Thread Wade Wall
Hi all,

I am trying to analyze an experiment I ran, but I am not sure how to code it in R.

I have germinated seeds in petri dishes at 3 different temperatures
(call them low, med, and high) and 2 different light levels (light and
dark).  For each seed I have recorded time to germination (not
counting those that didn't germinate, because I will analyze them in a
separate ANOVA).  Each temperature/light treatment has 5 petri dishes
with 10 seeds per dish, for a total of 30 dishes and 300 seeds.  The
replicate is the petri dish, but I want to treat the seeds as subsamples
in the error term.

Any help would be appreciated.  I have looked at code for mixed
models, but want to make sure that I am on the right track.
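
For a design like this, one possible setup is sketched below using base R
only. The data are simulated, because the real measurements are not shown,
and the column names (temp, light, dish, time) are assumptions made for
illustration. The idea is to declare dish as the error stratum, so that the
treatments are tested against dish-to-dish variation and the 10 seeds per
dish count as subsamples rather than replicates.

```r
# Simulated stand-in for the germination data: 3 temperatures x 2 light
# levels x 5 dishes per combination x 10 seeds per dish = 300 seeds.
set.seed(42)
seeds <- expand.grid(seed  = 1:10,
                     rep   = 1:5,
                     light = c("dark", "light"),
                     temp  = c("low", "med", "high"))

# Give every petri dish its own label (30 dishes in total).
seeds$dish <- interaction(seeds$temp, seeds$light, seeds$rep, drop = TRUE)

# Made-up response: dish-level noise plus seed-level noise.
dish_effect <- rnorm(nlevels(seeds$dish), sd = 1)
seeds$time  <- 10 + dish_effect[as.integer(seeds$dish)] +
  rnorm(nrow(seeds), sd = 2)

# Treatment effects are assessed in the "dish" stratum; the seeds within
# a dish end up in the "Within" stratum as subsamples.
fit <- aov(time ~ temp * light + Error(dish), data = seeds)
summary(fit)
```

A mixed model with a random intercept per dish (for example lme() from
nlme with random = ~ 1 | dish) is the analogous formulation and may be
the better choice if the real data turn out to be unbalanced.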

Thanks a lot,

Wade



Re: [R] HMISC package: wtd.table()

2009-01-26 Thread Frank E Harrell Jr

Norbert NEUWIRTH wrote:

oops, that really was easy:

(wtd.table(FamTyp.kurz,HGEW,normwt=FALSE,na.rm=TRUE))  instead of 
(wtd.table(FamTyp.kurz,weigths=HGEW,normwt=FALSE,na.rm=TRUE))


That is one solution.  The other is to spell 'weights' correctly :-)
Frank



sorry for that question ...

On 26.01.2009 at 10:30, Norbert NEUWIRTH
norbert.s.neuwi...@univie.ac.at wrote:


Hi useRs & developeRs,

I got stuck within a function of the Hmisc package. Sounds easy, hope it is:

I got 2 items (FamTyp.kurz, HGEW) of same length, no missings.


length(FamTyp.kurz);summary(FamTyp.kurz)

[1] 14883
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  10.00   20.00   21.00   21.66   23.00   31.00

length(HGEW);summary(HGEW)

[1] 14883
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  104.5   409.6   489.4   549.8   623.3  3880.0

Now I simply want to compute a table of unweighted and weighted values. But,
...  the weights do not seem to be accepted



  print("unweighted");print(table(FamTyp.kurz))

[1] "unweighted"
FamTyp.kurz
  10   11   20   21   22   23   30   31
1755  683 3322 1683 2428 1440 1748 1824


  print("weighted");print(wtd.table(FamTyp.kurz,weigths=HGEW,normwt=FALSE,na.rm=TRUE))

[1] "weighted"
Error in wtd.table(FamTyp.kurz, weigths = HGEW, normwt = FALSE, na.rm = TRUE) :
  unused arguments (weigths = c(495.55949, 495.55949, 678.16378, 678.16378,
.


any ideas ???

thanx in advance,
Norbert










--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] commercially supported version of R for 64 -bit Windows?

2009-01-26 Thread David M Smith
That's correct: REvolution Computing (whom I work for) is in the process of
porting R and packages to 64-bit Windows.  The development process has been
underway for several months and is near completion.  There will be a beta
test in February and we expect to release in March.
When the beta program is launched it will be announced at
http://blog.revolution-computing.com , but if anyone is interested in
getting involved sooner please let me know.

# David Smith

-- 
David M Smith da...@revolution-computing.com
Director of Community, REvolution Computing www.revolution-computing.com
Tel: +1 (206) 577-4778 x3203 (Seattle, USA)

On Sun, Jan 25, 2009 at 10:47 AM, Dirk Eddelbuettel e...@debian.org wrote:


 On 25 January 2009 at 04:39, new ruser wrote:
 | Can anyone please refer me to all firms that offer and/or are developing a
 | commercially supported version of R for 64 -bit Windows? - Thanks

 Try contacting Revolution-Computing.com --- to the best of my knowledge they
 expect to have such a product forthcoming in 2009.

 64bit versions have of course been available on Linux / Unix for over a
 decade so you could use that now.  Works great for me on Debian and Ubuntu.

 Dirk

 --
 Three out of two people have difficulties with fractions.



[R] reshape problem: id and variable names not being recognized

2009-01-26 Thread MW Frost
Hi everyone. Long time listener, first-time caller here.

I have a data set that's been melted with the excellent reshape package, but
I can't seem to cast it the way I need to.

Here's the melted data's structure:

 str(mdat)
'data.frame':   6978 obs. of  4 variables:
$ VehType : Factor w/ 2 levels "Car","Truck": 1 1 2 1 1 2 1 1 1 1 ...
$ Year    : Factor w/ 6 levels "2003","2004",..: 5 1 5 6 6 2 2 3 2 5 ...
$ variable: Factor w/ 1 level "mpg": 1 1 1 1 1 1 1 1 1 1 ...
$ value   : num  22.4 21.5 22.6 22.4 25 ...

For the purpose of testing, I have stripped out all the variables except for
mpg.
Casting it without specifying any ids or variables works fine:

 cast(mdat,,mean)
   VehType Year      mpg
1      Car 2003 22.03623
2      Car 2004 21.94160
3      Car 2005 21.77286
4      Car 2006 21.49105
5      Car 2007 21.38180
6      Car 2008 21.56873
7    Truck 2003 16.91461
8    Truck 2004 16.88771
9    Truck 2005 17.19801
10   Truck 2006 17.48225
11   Truck 2007 17.40694
12   Truck 2008 17.74042

I should then be able to make a crosstab of the means by writing a formula,
right? It fails, though:

 cast(mdat, VehType ~ Year | mpg, mean)
Error: Casting formula contains variables not found in molten data: mpg

When I make the same table by using 'variable' instead of the name of my
variable, it works:

 cast(mdat, VehType ~ Year | variable, mean)
$mpg
  VehType     2003     2004     2005     2006     2007     2008
1     Car 22.03623 21.94160 21.77286 21.49105 21.38180 21.56873
2   Truck 16.91461 16.88771 17.19801 17.48225 17.40694 17.74042

Why can't it find the mpg variable when I call it explicitly?

Thanks,
Matt Frost



Re: [R] RCurl unable to download a particular web page -- what is so special about this web page?

2009-01-26 Thread Jeffrey Horner

Duncan Temple Lang wrote:



clair.crossup...@googlemail.com wrote:

Dear R-help,

There seems to be a web page I am unable to download using RCurl. I
don't understand why it won't download:


library(RCurl)
my.url <- "http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2"
getURL(my.url)

[1] ""




 I like the irony that RCurl seems to have difficulties downloading an
article about R.  Good thing it is just a matter of additional arguments
to getURL() or it would be bad news.

Don't forget the irony that https is supported in url() and
download.file() on Windows but not UNIX...


http://tolstoy.newcastle.edu.au/R/e2/devel/07/01/1634.html

Jeff



The followlocation parameter defaults to FALSE, so

  getURL(my.url, followlocation = TRUE)

gets what you want.

The way I found this  is

 getURL(my.url, verbose = TRUE)

and take a look at the information being sent from R
and received by R from the server.

This gives

* About to connect() to www.nytimes.com port 80 (#0)
*   Trying 199.239.136.200... * connected
* Connected to www.nytimes.com (199.239.136.200) port 80 (#0)
 GET /2009/01/07/technology/business-computing/07program.html?_r=2 
HTTP/1.1

Host: www.nytimes.com
Accept: */*

 HTTP/1.1 301 Moved Permanently
 Server: Sun-ONE-Web-Server/6.1
 Date: Mon, 26 Jan 2009 16:10:51 GMT
 Content-length: 0
 Content-type: text/html
 Location: 
http://www.nytimes.com/glogin?URI=http://www.nytimes.com/2009/01/07/technology/business-computing/07program.htmlOQ=_rQ3D3op=42fceb38q2fq5duarq5d3-z8q26--q24jq5djccq7bq5dcmq5dc1q5dq24...@-f-q2anq5dry8h@a88q3dz-dbyq...@q2aq5dc1bq26-q2aq26q5bddfq24df 




And the 301 is the critical thing here.

 D.



Other web pages are ok to download but this is the first time I have
been unable to download a web page using the very nice RCurl package.
While i can download the webpage using the RDCOMClient, i would like
to understand why it doesn't work as above please?





library(RDCOMClient)
my.url - 
http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html?_r=2; 


ie - COMCreate(InternetExplorer.Application)
txt - list()
ie$Navigate(my.url)

NULL

while(ie[[Busy]]) Sys.sleep(1)
txt[[my.url]] - ie[[document]][[body]][[innerText]]
txt

$`http://www.nytimes.com/2009/01/07/technology/business-computing/
07program.html?_r=2`
[1] Skip to article Try Electronic Edition Log ...


Many thanks for your time,
C.C

Windows Vista, running with administrator privileges.

sessionInfo()

R version 2.8.1 (2008-12-22)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.
1252;LC_MONETARY=English_United Kingdom.
1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods
base

other attached packages:
[1] RDCOMClient_0.92-0 RCurl_0.94-0

loaded via a namespace (and not attached):
[1] tools_2.8.1



Re: [R] reshape problem: id and variable names not being recognized

2009-01-26 Thread jim holtman
Look at your 'str(mdat)' and you will see that there is no variable
called 'mpg'; it is one of the levels of the 'variable' column.
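
To make the point concrete, here is a small sketch in base R only, on a
made-up molten data frame (the numbers are invented and much smaller than
the real mdat). Because 'mpg' is a value in the 'variable' column and not
a column of its own, one filters on 'variable' and aggregates 'value'.

```r
# Toy molten data in the same shape as str(mdat): the measured quantity
# is named in the 'variable' column and its number sits in 'value'.
mdat <- data.frame(
  VehType  = c("Car", "Car", "Truck", "Truck", "Car", "Truck"),
  Year     = c("2003", "2004", "2003", "2004", "2003", "2003"),
  variable = "mpg",
  value    = c(22, 21, 17, 16, 24, 19)
)

names(mdat)             # there is no column called "mpg" ...
unique(mdat$variable)   # ... it is a value of 'variable'

# Cross-tabulate the mean of 'value' by vehicle type and year,
# restricted to the rows where variable is "mpg".
mpg_rows <- mdat[mdat$variable == "mpg", ]
xt <- with(mpg_rows,
           tapply(value, list(VehType = VehType, Year = Year), mean))
xt
```

With the reshape package itself, the formula that already worked in the
original post, VehType ~ Year | variable, expresses the same thing.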

On Mon, Jan 26, 2009 at 11:38 AM, MW Frost mwfr...@gmail.com wrote:
 Hi everyone. Long time listener, first-time caller here.

 I have a data set that's been melted with the excellent reshape package, but
 I can't seem to cast it the way I need to.

 Here's the melted data's structure:

 str(mdat)
 'data.frame':   6978 obs. of  4 variables:
 $ VehType : Factor w/ 2 levels Car,Truck: 1 1 2 1 1 2 1 1 1 1 ...
 $ Year: Factor w/ 6 levels 2003,2004,..: 5 1 5 6 6 2 2 3 2 5 ...
 $ variable: Factor w/ 1 level mpg: 1 1 1 1 1 1 1 1 1 1 ...
 $ value   : num  22.4 21.5 22.6 22.4 25 ...

 For the purpose of testing, I have stripped out all the variables except for
 mpg.
 Casting it without specifying any ids or variables works fine:

 cast(mdat,,mean)
  VehType Year  mpg
 1  Car 2003 22.03623
 2  Car 2004 21.94160
 3  Car 2005 21.77286
 4  Car 2006 21.49105
 5  Car 2007 21.38180
 6  Car 2008 21.56873
 7Truck 2003 16.91461
 8Truck 2004 16.88771
 9Truck 2005 17.19801
 10   Truck 2006 17.48225
 11   Truck 2007 17.40694
 12   Truck 2008 17.74042

 I should then be able to make a crosstab of the means by writing a formula,
 right? It fails, though:

 cast(mdat, VehType ~ Year | mpg, mean)
 Error: Casting formula contains variables not found in molten data: mpg

 When I make the same table by using variable instead of the name of my
 variable, it works:

 cast(mdat, VehType ~ Year | variable, mean)
 $mpg
  VehType 2003 2004 2005 2006 2007 2008
 1 Car 22.03623 21.94160 21.77286 21.49105 21.38180 21.56873
 2   Truck 16.91461 16.88771 17.19801 17.48225 17.40694 17.74042

 Why can't it find the mpg variable when I call it explicitly?

 Thanks,
 Matt Frost





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] Tinn-R

2009-01-26 Thread Tibert, Brock
Hi Everyone,

I was hoping someone could help me with the settings for Tinn-R.  I see in the
screen shots that it has syntax help, or something similar (tips on
functions, etc.).  I cannot seem to turn this on in the program, and I
am wondering if I have to set up a few options.  I quickly read through the
help and could not figure it out.

Many thanks!

- Brock

P.S.  It appears that Tinn-R is widely used, but would you recommend something
different?  I am new to R and programming, but have learned (somewhat) using
VBA editors, and I have grown to love the intelligent typing that goes along with
them.



[R] Goodness of fit for gamma distributions

2009-01-26 Thread Dan31415

I'm looking for goodness-of-fit tests for gamma distributions with large
sample sizes. I have a matrix with around 10,000 data values in it and I have
fitted a gamma distribution over a histogram of the data.

The problem is testing how well that distribution fits. Chi-squared seems to
be used more for discrete distributions, and with Kolmogorov-Smirnov large
sample sizes seem to make it hard to evaluate the D statistic. Also, I haven't
found a Q-Q plot for gamma, although I think this might be an appropriate
test.

In summary:
- Is there a gamma goodness-of-fit test that doesn't depend on the sample
size?
- Is there a way of using qqplot for gamma distributions, and if so, how would you
calculate it from a matrix of data values?
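
On the Q-Q plot question, base R's qqplot() takes any two samples, so one
option is to compare the data with gamma quantiles computed from the fitted
parameters. The sketch below is only an illustration: the data are
simulated, and the method-of-moments estimates are an assumption standing
in for whatever shape and rate were actually fitted.

```r
set.seed(1)
# Stand-in for the real data: a matrix of about 10,000 values.
m <- matrix(rgamma(10000, shape = 2, rate = 0.5), nrow = 100)
x <- as.vector(m)   # flatten the matrix into a plain numeric vector

# Method-of-moments estimates (assumed here; any fitted values will do).
shape_hat <- mean(x)^2 / var(x)
rate_hat  <- mean(x) / var(x)

# Theoretical gamma quantiles at the usual plotting positions.
theo <- qgamma(ppoints(length(x)), shape = shape_hat, rate = rate_hat)

qqplot(theo, x,
       xlab = "Theoretical gamma quantiles",
       ylab = "Sample quantiles")
abline(0, 1)   # points close to this line suggest a reasonable fit
```

Unlike a formal test, the plot does not become more "significant" as the
sample grows; it simply shows where, if anywhere, the fit departs from the
data.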

regards,
Dann
-- 
View this message in context: 
http://www.nabble.com/Goodness-of-fit-for-gamma-distributions-tp21668711p21668711.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] HMISC package: wtd.table()

2009-01-26 Thread Dieter Menne
Frank E Harrell Jr f.harrell at vanderbilt.edu writes:

  (wtd.table(FamTyp.kurz,weigths=HGEW,normwt=FALSE,na.rm=TRUE))
 
 That is one solution.  The other is to spell 'weights' correctly 

Have pity on us German speakers. It was such a pain to learn
"th" that we cannot resist applying it whenever pothible.

Dieter



[R] error managment

2009-01-26 Thread diego Diego
Hello R experts!
 I'm running a for loop in which an arima model is fitted at every step.
The problem is that some series produce numerical problems in optim. My question
is whether there is a way of telling R to jump to the next series whenever optim
hits a critical error, instead of stopping the calculations. Or, better
yet, to run another arima fit with a different optimization
algorithm.



Thanks!!!



Re: [R] error managment

2009-01-26 Thread Gabor Grothendieck
?try
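
A minimal sketch of that idea, using tryCatch() (documented alongside
try()). The helper name fit_series, the model order, and the simulated
series are all assumptions for illustration. The first attempt uses
arima()'s defaults; if it stops with an error, a second attempt switches
the estimation method and the optimiser; if that fails too, the series is
skipped by returning NULL.

```r
# Hypothetical helper: fit one series, falling back on error.
fit_series <- function(x, order = c(1, 0, 0)) {
  tryCatch(
    arima(x, order = order),   # default fit
    error = function(e) {
      # First attempt failed: retry with another method and optimiser.
      tryCatch(
        arima(x, order = order, method = "CSS",
              optim.method = "Nelder-Mead"),
        error = function(e2) NULL   # give up on this series
      )
    }
  )
}

set.seed(1)
series <- list(
  ok1 = arima.sim(list(ar = 0.5), n = 200),
  bad = c("not", "numeric"),    # deliberately broken input to force an error
  ok2 = arima.sim(list(ar = -0.3), n = 200)
)

# The loop carries on past the failing series instead of stopping.
fits <- lapply(series, fit_series)
sapply(fits, is.null)   # TRUE marks series where both attempts failed
```

Returning NULL keeps the results aligned with the input list, so the
failed series can be identified afterwards.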

On Mon, Jan 26, 2009 at 1:05 PM, diego Diego dhab...@gmail.com wrote:
 Hello R experts!
  I'm running a FOR loop in which at every step an arima model is generated.
 The problem is some series produces numeric problems with optim. My question
 is if there is a way of telling to R that at every critical error of optim
 jumps to the next series instead of stopping the calculations. Or better
 yet, tell it to run another arima fit but with a different optmization
 algorithm.



 Thanks!!!



[R] problem with table()

2009-01-26 Thread Dominik Hattrup
Hey everyone,

I am looking for the easiest way to get this table

# Table2
Year / 2000 / 2002 / 2004
Julia / 3 / 4 / 1
Peter / 1 / 2 / 4
...   /   ...   /   ...   /   ...

out of this one? 

# Table1
name   /   year   /   cases
Julia   /   2000   /   1
Julia   /   2000   /   2  
Julia   /   2002   /   4
Peter   /   2000   /   1
Julia   /   2004   /   1
Peter   /   2004   /   2
Peter   /   2002   /   2
Peter   /   2004   /   2
...   /   ...   /   ...

Code for table1:
name <- c('Julia','Julia','Julia','Peter','Julia','Peter','Peter','Peter')
year <- c(2000,2000,2002,2000,2004,2004,2002,2004)
cases <- c(1,2,4,1,1,2,2,2)
table1 <- data.frame(name,year,cases)

Thanks! Dominik



Re: [R] problem with table()

2009-01-26 Thread Marc Schwartz
on 01/26/2009 12:23 PM Dominik Hattrup wrote:
 Hey everyone,
 
 I am looking for the easiest way to get this table
 
 # Table2
 Year / 2000 / 2002 / 2004
 Julia / 3 / 4 / 1
 Peter / 1 / 2 / 4
 ...   /   ...   /   ...   /   ...
 
 out of this one? 
 
 # Table1
 name   /   year   /   cases
 Julia   /   2000   /   1
 Julia   /   2000   /   2  
 Julia   /   2002   /   4
 Peter   /   2000   /   1
 Julia   /   2004   /   1
 Peter   /   2004   /   2
 Peter   /   2002   /   2
 Peter   /   2004   /   2
 ...   /   ...   /   ...
 
 Code for table1:
 name <- c('Julia','Julia','Julia','Peter','Julia','Peter','Peter','Peter')
 year <- c(2000,2000,2002,2000,2004,2004,2002,2004)
 cases <- c(1,2,4,1,1,2,2,2)
 table1 <- data.frame(name,year,cases)
 
 Thanks! Dominik

table() generates frequencies from individual values, not from already
tabulated data.

In this case, you can use xtabs():

 xtabs(cases ~ name + year, data = table1)
       year
name    2000 2002 2004
  Julia    3    4    1
  Peter    1    2    4


See ?xtabs

An alternative would be to use tapply():

 with(table1, tapply(cases, list(name = name, year = year), sum))
       year
name    2000 2002 2004
  Julia    3    4    1
  Peter    1    2    4


See ?tapply

HTH,

Marc Schwartz



[R] Large regular expressions

2009-01-26 Thread Stavros Macrakis
Given a vector of reference strings Ref and a vector of test strings
Test, I would like to find elements of Test which do not contain
elements of Ref as \b-delimited substrings.

This can be done straightforwardly for length(Ref) < 6000 or so (R
2.8.1 Windows) by constructing a pattern like "\b(a|b|c)\b", but not for
larger Refs (see below).  The easy workaround for this is to split Ref
into smaller subsets and test each subset separately.  Is there a
better solution, e.g. along the lines of fgrep?  My real data have
length(Ref) == 6 or more.

  -s

-

Example

Test <- as.character(floor(runif(2000,1,2)))  # Real data is short phrases

testing <- function(n) {
  Ref <- as.character(1:n)   # Real data is sentences
  Pat <- paste('\\b(',paste(Ref,collapse="|"),')\\b',sep='')
  grep(Pat,Test)
}

testing(2000) => no problem

However, testing(1) gives an error message (invalid regular
expression) and a warning (memory exhausted), and testing(10)
crashes R (Process R exited abnormally with code 5).

Using grep(...,perl=TRUE) as suggested in the man page also fails with
testing(1), though it gives a more helpful error message (regular
expression is too large) without crashing the process.
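
The split-into-subsets workaround mentioned above could be sketched as
follows. The helper name no_ref_match and the chunk_size argument are
hypothetical, and the sketch assumes the reference strings contain no
regular-expression metacharacters (otherwise they would need escaping
first). Each chunk of Ref becomes its own moderately sized pattern, and
the per-chunk matches are combined with a logical OR.

```r
# Hypothetical helper: indices of 'test' elements that contain none of
# the 'ref' strings as a \b-delimited substring.
no_ref_match <- function(test, ref, chunk_size = 1000) {
  hit <- logical(length(test))
  # Split ref into consecutive chunks of at most chunk_size elements.
  chunks <- split(ref, ceiling(seq_along(ref) / chunk_size))
  for (chunk in chunks) {
    pat <- paste("\\b(", paste(chunk, collapse = "|"), ")\\b", sep = "")
    hit <- hit | grepl(pat, test)
  }
  which(!hit)
}

test_strings <- c("12 apples", "no digits here", "room 7", "x99y")
ref_strings  <- as.character(1:50)
no_ref_match(test_strings, ref_strings, chunk_size = 10)
```

The chunk size trades the number of passes over Test against the size of
each compiled pattern, so it would need tuning on the real data.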



[R] name scoping within dataframe index

2009-01-26 Thread Alexy Khrabrov
Every time I have to prefix a dataframe column inside the indexing  
brackets with the dataframe name, e.g.


df[df$colname==value,]

-- I am wondering, why isn't there an R scoping rule that search  
starts with the dataframe names, as if we'd said


with(df, df[colname==value,])

-- wouldn't that be a reasonable default to prepend to the name search  
path?


Cheers,
Alexy



Re: [R] problem with table()

2009-01-26 Thread Dominik Hattrup
Marc Schwartz wrote:
 on 01/26/2009 12:23 PM Dominik Hattrup wrote:
   
 Hey everyone,

 I am looking for the easiest way to get this table

 # Table2
 Year / 2000 / 2002 / 2004
 Julia / 3 / 4 / 1
 Peter / 1 / 2 / 4
 ...   /   ...   /   ...   /   ...

 out of this one? 

 # Table1
 name   /   year   /   cases
 Julia   /   2000   /   1
 Julia   /   2000   /   2  
 Julia   /   2002   /   4
 Peter   /   2000   /   1
 Julia   /   2004   /   1
 Peter   /   2004   /   2
 Peter   /   2002   /   2
 Peter   /   2004   /   2
 ...   /   ...   /   ...

 Code for table1:
 name - c('Julia','Julia','Julia','Peter','Julia','Peter','Peter','Peter')
 year - c(2000,2000,2002,2000,2004,2004,2002,2004)
 cases - c(1,2,4,1,1,2,2,2)
 table1 - data.frame(name,year,cases)

 Thanks! Dominik
 

 table() generates frequencies from individual values, not from already
 tabulated data.

 In this case, you can use xtabs():

   
 xtabs(cases ~ name + year, data = table1)
 
year
 name2000 2002 2004
   Julia341
   Peter124


 See ?xtabs

 An alternative would be to use tapply():

   
 with(table1, tapply(cases, list(name = name, year = year), sum))
 
year
 name2000 2002 2004
   Julia341
   Peter124


 See ?tapply

 HTH,

 Marc Schwartz

   
Thank you for the quick answer. Works perfectly! Dominik



Re: [R] name scoping within dataframe index

2009-01-26 Thread Duncan Murdoch

On 1/26/2009 1:46 PM, Alexy Khrabrov wrote:
Every time I have to prefix a dataframe column inside the indexing  
brackets with the dataframe name, e.g.


df[df$colname==value,]

-- I am wondering, why isn't there an R scoping rule that search  
starts with the dataframe names, as if we'd said


with(df, df[colname==value,])

-- wouldn't that be a reasonable default to prepend to the name search  
path?


If you did that, it would be quite difficult to get at a colname
variable that *isn't* a column of df.  It would be something like


 df[get("colname", parent.frame()) == value,]

So just use subset(), or with(), or type the extra 3 chars.
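
A small sketch of the three spellings and of the ambiguity described
above, on a made-up data frame. The column named other and the global
variable of the same name are assumptions introduced only to show the
name clash.

```r
df <- data.frame(colname = c(1, 2, 3, 2), other = letters[1:4])
value <- 2

# Three equivalent ways to pick the rows where colname equals value.
r1 <- df[df$colname == value, ]
r2 <- with(df, df[colname == value, ])
r3 <- subset(df, colname == value)

# The ambiguity: a global variable that shares its name with a column.
other <- "b"
clash    <- subset(df, other == other)   # column vs itself: every row
explicit <- df[df$other == other, ]      # column vs global: one row
nrow(clash)
nrow(explicit)
```

With columns looked up first, as subset() does, the global `other` is
unreachable by its plain name, which is the difficulty the proposed
default would bring to every use of [.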

Duncan



Re: [R] Large regular expressions

2009-01-26 Thread Gabor Grothendieck
I am using

 R.version.string # Vista
[1] R version 2.8.1 Patched (2008-12-26 r47350)

and it also caused R to actually crash for me.

On Mon, Jan 26, 2009 at 1:38 PM, Stavros Macrakis macra...@alum.mit.edu wrote:
 Given a vector of reference strings Ref and a vector of test strings
 Test, I would like to find elements of Test which do not contain
 elements of Ref as \b-delimited substrings.

 This can be done straightforwardly for length(Ref) < 6000 or so (R
 2.8.1 Windows) by constructing a pattern like \b(a|b|c)\b, but not for
 larger Refs (see below).  The easy workaround for this is to split Ref
 into smaller subsets and test each subset separately.  Is there a
 better solution e.g. along the lines of fgrep?  My real data have
 length(Ref) == 6 or more.

  -s

 -

 Example

 Test <- as.character(floor(runif(2000,1,2)))  # Real data is short phrases

 testing <- function(n) {
  Ref <- as.character(1:n)   # Real data is sentences
  Pat <- paste('\\b(',paste(Ref,collapse="|"),')\\b',sep='')
  grep(Pat,Test)
 }

 testing(2000) => no problem

 However, testing(1) gives an error message (invalid regular
 expression) and a warning (memory exhausted), and testing(10)
 crashes R (Process R exited abnormally with code 5).

 Using grep(...,perl=TRUE) as suggested in the man page also fails with
 testing(1), though it gives a more helpful error message (regular
 expression is too large) without crashing the process.



Re: [R] name scoping within dataframe index

2009-01-26 Thread Gabor Grothendieck
Try:

subset(df, colname == value)

On Mon, Jan 26, 2009 at 1:46 PM, Alexy Khrabrov delivera...@gmail.com wrote:
 Every time I have to prefix a dataframe column inside the indexing brackets
 with the dataframe name, e.g.

 df[df$colname==value,]

 -- I am wondering, why isn't there an R scoping rule that search starts with
 the dataframe names, as if we'd said

 with(df, df[colname==value,])

 -- wouldn't that be a reasonable default to prepend to the name search path?

 Cheers,
 Alexy



Re: [R] Problem with colormodel in pdf driver

2009-01-26 Thread Luis Torgo

Greg Snow wrote:

You may want to consider a dotchart instead of a barplot.  Then you can 
distinguish between groups by using symbols, grouping, and labels rather than 
depending on colors/shades of grey.

  
Thanks Greg. The only problem is that I was trying to illustrate the use 
of barplot() ...


I guess for now I can always use the pdf() driver with the default RGB 
colormodel and then use command line tools (e.g. ImageMagick) to convert 
the resulting graphs to grayscale...


Thanks all for the help.

Luis

--
Luis Torgo
  FEP/LIAAD - INESC Porto, LA   Phone : (+351) 22 339 20 93
  University of Porto   Fax   : (+351) 22 339 20 99
  R. de Ceuta, 118, 6o  email : lto...@liaad.up.pt
  4050-190 PORTO - PORTUGAL WWW   : http://www.liaad.up.pt/~ltorgo



Re: [R] name scoping within dataframe index

2009-01-26 Thread Alexy Khrabrov

On 1/26/2009 1:46 PM, Alexy Khrabrov wrote:
Every time I have to prefix a dataframe column inside the indexing   
brackets with the dataframe name, e.g.

df[df$colname==value,]
-- I am wondering, why isn't there an R scoping rule that search   
starts with the dataframe names, as if we'd said

with(df, df[colname==value,])
-- wouldn't that be a reasonable default to prepend to the name  
search  path?


If you did that, it would be quite difficult to get at a colname  
variable that *isn't* the column of df.  It would be something like


df[get("colname", parent.frame()) == value,]


Actually, what I propose is  a special search rule which simply looks  
at the enclosing dataframe.name[...] outside the brackets and looks up  
the columns first.


It would break legacy code that uses column names identical to
variables in this context, but there are probably other ideas to enhance
R readability which would break legacy code.  Perhaps when the next
major overhaul occurs, this is something folks can voice opinions
about.  I find the need for inner prefixing quite unnatural, FWIW.


Cheers,
Alexy



Re: [R] name scoping within dataframe index

2009-01-26 Thread Duncan Murdoch

On 1/26/2009 2:01 PM, Alexy Khrabrov wrote:

On 1/26/2009 1:46 PM, Alexy Khrabrov wrote:
Every time I have to prefix a dataframe column inside the indexing   
brackets with the dataframe name, e.g.

df[df$colname==value,]
-- I am wondering, why isn't there an R scoping rule that search   
starts with the dataframe names, as if we'd said

with(df, df[colname==value,])
-- wouldn't that be a reasonable default to prepend to the name  
search  path?


If you did that, it would be quite difficult to get at a colname  
variable that *isn't* the column of df.  It would be something like


df[get("colname", parent.frame()) == value,]


Actually, what I propose is  a special search rule which simply looks  
at the enclosing dataframe.name[...] outside the brackets and looks up  
the columns first.


Yes, I understood that, and I explained why it would be a bad idea.

Duncan Murdoch



It would break legacy code that uses column names identical to
variables in this context, but there are probably other ideas to enhance
R readability which would break legacy code.  Perhaps when the next
major overhaul occurs, this is something folks can voice opinions
about.  I find the need for inner prefixing quite unnatural, FWIW.


Cheers,
Alexy




Re: [R] Tinn-R

2009-01-26 Thread anna freni sterrantino
Hi,
If you are on Linux, Emacs + ESS is
quite popular too.

Cheers

Anna

 Anna Freni Sterrantino
Ph.D Student 
Department of Statistics
University of Bologna, Italy
via Belle Arti 41, 40124 BO.





From: Tibert, Brock btib...@bentley.edu
To: r-help@r-project.org r-help@r-project.org
Sent: Monday, 26 January 2009, 18:01:25
Subject: [R] Tinn-R

Hi Everyone,

I was hoping someone could help me with the settings for Tinn-R.  I see in the 
screen shots that it has syntax help, or something similar (tips on what 
functions, etc).  I can not seem to get this to turn on in the program, and I 
am wondering if I have to set up a few options.   I quickly read through the 
help and could not figure it out.

Many thanks!

- Brock

P.S.  It appears as if Tinn-R is widely used, but would you recommend something 
different?  I am new to R and programming, but having learned (somewhat) using 
VBA editors, I have grown to love the intelligent typing that goes along with 
them.




  


Re: [R] Tinn-R

2009-01-26 Thread Thomas Roth (geb. Kaliwe)
The first thing you need to do is save the file with an .r extension. Tinn-R 
will then come up with syntax highlighting (as far as I remember).


Options -> Main -> Application leads you to the R configuration...

HTH
Thomas

Tibert, Brock wrote:

Hi Everyone,

I was hoping someone could help me with the settings for Tinn-R.  I see in the 
screen shots that it has syntax help, or something similar (tips on what 
functions, etc).  I can not seem to get this to turn on in the program, and I 
am wondering if I have to set up a few options.   I quickly read through the 
help and could not figure it out.

Many thanks!

- Brock

P.S.  It appears as if Tinn-R is widely used, but would you recommend something 
different?  I am new to R and programming, but having learned (somewhat) using 
VBA editors, I have grown to love the intelligent typing that goes along with 
them.







Re: [R] name scoping within dataframe index

2009-01-26 Thread Alexy Khrabrov


On Jan 26, 2009, at 2:12 PM, Duncan Murdoch wrote:

df[get("colname", parent.frame()) == value,]
Actually, what I propose is  a special search rule which simply  
looks  at the enclosing dataframe.name[...] outside the brackets  
and looks up  the columns first.


Yes, I understood that, and I explained why it would be a bad idea.


Well, this is the case in all programming languages with scoping where
inner-scope variables override the outer ones.  Usually it's solved
by prefixing with the outer scope, outerscope.name or
outerscope::name or so.  So it only underscores the need to improve
scoping access in R.


Dataframe column names belong to the dataframe object and the natural  
thing would be to enable easy access to naming; you'd need to apply an  
extra effort to access an overridden unrelated external variable.   
Again, just an analogy from other programming languages.


Cheers,
Alexy



[R] Spectral analysis with mtm-svd Multi-Taper Method Combined with Singular Value Decomposition

2009-01-26 Thread Yasir Kaheil

Hi list, 
Does anyone know if there is a library in R that implements the MTM-SVD
method for spectral analysis?
Thanks

-
Yasir H. Kaheil
Columbia University


[R] randomSurvivalForest plotting

2009-01-26 Thread A Van Dyke

I would like to plot a subset of variables with the highest variable
importance measures (say the top 20) instead of plotting all of the
variables included in the analysis (~75).  I tried arguments that work to
restrict the number of variables displayed in the plot in randomForest, as
follows:

plot(rsfCauc.out,sort=TRUE,n.var=min(30,nrow(rsfCauc.out$importance)),type=TRUE,class=NULL,scale=TRUE,main=deparse(substitute(rsfCauc.out)))

However, that code only produced the plot with all 75 variables.

Help would be much appreciated.  Many thanks!
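
One generic way to show only the top-ranked variables is to sort the importance values yourself and plot the subset. The sketch below assumes the importance measures are available as a named numeric vector; the `importance` vector here is simulated and purely hypothetical, and how to extract the real one from the fitted forest is not shown in this thread.

```r
# Hypothetical stand-in for the importance values of 75 variables;
# in practice this vector would come from the fitted forest object.
set.seed(1)
importance <- setNames(runif(75), paste0("var", seq_len(75)))

# Keep the 20 variables with the largest importance.
top20 <- sort(importance, decreasing = TRUE)[1:20]

# Draw to a temporary PDF so the sketch also runs non-interactively.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
# rev() puts the most important variable at the top of the dot chart.
dotchart(rev(top20), xlab = "Variable importance", main = "Top 20 variables")
invisible(dev.off())
```

This sidesteps the plot method of the package entirely, so it does not depend on which arguments that method happens to honour.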


Re: [R] name scoping within dataframe index

2009-01-26 Thread Duncan Murdoch

On 1/26/2009 2:20 PM, Alexy Khrabrov wrote:

On Jan 26, 2009, at 2:12 PM, Duncan Murdoch wrote:

df[get("colname", parent.frame()) == value,]
Actually, what I propose is  a special search rule which simply  
looks  at the enclosing dataframe.name[...] outside the brackets  
and looks up  the columns first.


Yes, I understood that, and I explained why it would be a bad idea.


Well, this is the case in all programming languages with scoping where
inner-scope variables override the outer ones.  Usually it's solved
by prefixing with the outer scope, outerscope.name or
outerscope::name or so.  So it only underscores the need to improve
scoping access in R.


Dataframe column names belong to the dataframe object and the natural  
thing would be to enable easy access to naming; you'd need to apply an  
extra effort to access an overridden unrelated external variable.   
Again, just an analogy from other programming languages.


The issue is that in most cases the outer scope would be unnamed:  it's 
the one that currently doesn't need a prefix.  So if we have a prefix 
meaning "this scope", why wouldn't that evaluate to df in that 
context?  I guess we need a prefix meaning "the caller's scope", but 
that's just going to lead to confusion:  is it the caller of the 
function that is trying to index df, or the function trying to do the 
indexing?  So we'd need a prefix specific to indexing, and that's just 
too ugly for words.


As I said, use subset() or with().  For subset selection, subset() works 
very nicely.  (I don't like the way it does column selection, but that's 
a different argument.)
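
To make the options in this thread concrete, here is a small sketch comparing the three spellings on a toy data frame, followed by the masking hazard discussed above. The data frames `df` and `df2` and their contents are invented for illustration only.

```r
# Toy data; `value` lives outside the data frame.
df <- data.frame(colname = c(1, 2, 2, 3), other = c("a", "b", "c", "d"))
value <- 2

by_prefix <- df[df$colname == value, ]          # explicit prefix
by_with   <- with(df, df[colname == value, ])   # columns looked up first
by_subset <- subset(df, colname == value)       # same lookup rule as with()

# The hazard: if the data frame also has a column called `value`,
# with() and subset() silently use the column, not the outer variable.
df2 <- data.frame(colname = c(1, 2, 2, 3), value = c(1, 1, 3, 3))
masked <- subset(df2, colname == value)   # compares the two columns row by row
```

In the last line the outer `value` (2) is never consulted, which is exactly the situation where the explicit `df2$colname == value` form behaves differently.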


Duncan Murdoch



Re: [R] HMISC package: wtd.table()

2009-01-26 Thread Frank E Harrell Jr

Dieter Menne wrote:

Frank E Harrell Jr f.harrell at vanderbilt.edu writes:


(wtd.table(FamTyp.kurz,weigths=HGEW,normwt=FALSE,na.rm=TRUE))
That is one solution.  The other is to spell 'weights' correctly 


Have pity with us German speakers. It was such a paing to learn  
th that we cannot resist to apply it whenever pothible.


Dieter


Good point Dieter.  I'm still learning english myself.

Frank




--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University



Re: [R] Problem with colormodel in pdf driver

2009-01-26 Thread Prof Brian Ripley

On Mon, 26 Jan 2009, Luis Torgo wrote:


Greg Snow wrote:
You may want to consider a dotchart instead of a barplot.  Then you can 
distinguish between groups by using symbols, grouping, and labels rather 
than depending on colors/shades of grey.



Thanks Greg. The only problem is that I was trying to illustrate the use of 
barplot() ...


I guess for now I can always use the pdf() driver with the default RGB 
colormodel and then use command line tools (e.g. ImageMagick) to convert the 
resulting graphs to grayscale...


You won't be able to convert PDF to PDF with ImageMagick (possible 
with a helper).



Thanks all for the help.


Or update your R, as the posting guide suggested.  It works in 
R-patched and R-devel.
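
For reference, a minimal sketch of a grayscale barplot written directly by the pdf() driver, assuming an R version in which colormodel = "gray" works as described above. The `counts` matrix is invented for illustration.

```r
# Invented example data: two groups measured in three categories.
counts <- matrix(c(3, 5, 2, 4, 6, 1), nrow = 2,
                 dimnames = list(c("group A", "group B"), c("x", "y", "z")))

out_file <- tempfile(fileext = ".pdf")
# Ask the driver itself to write gray levels instead of RGB.
pdf(out_file, colormodel = "gray")
# grey.colors() picks bar fills that are already shades of grey.
barplot(counts, beside = TRUE, col = grey.colors(nrow(counts)),
        legend.text = TRUE)
invisible(dev.off())
```

Choosing the fills with grey.colors() means the bars stay distinguishable even if the file is later produced with the default colour model.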


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [R] [stats-rosuda-devel] Problem with JGR. Was: Re: Using help()

2009-01-26 Thread Simon Urbanek


On Jan 25, 2009, at 9:35 , Michael Kubovy wrote:


Dear Friends,

Thanks to Rolf Turner, Brian Ripley and Patrick Burns for their
answers. They don't quite resolve the problem, which I now realize is
due to non-standard behavior of JGR, at least on my machine (I
verified that the Mac GUI works entirely as expected):


** My installation **
Running the JGR GUI:
> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-apple-darwin8.11.1

locale:
C/C/en_US/C/C/C

attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] JGR_1.6-2       iplots_1.1-2    JavaGD_0.5-2    rJava_0.6-1
[5] MASS_7.2-45     lattice_0.17-20

loaded via a namespace (and not attached):
[1] tools_2.8.1

** What happens with ? and ?? **



? is interpreted by JGR (re-mapped to an internal call to help, followed
by help.search if no topics were found) instead of R. So JGR is
smarter than R used to be, but that has changed in R 2.8.
Unfortunately R currently has no publicly available API to support
what the Mac GUI does, because the Mac GUI uses a nasty trick of modifying
R's sources to hook inside R. I'm working on fixing this for
R 2.9.0-to-be, but currently JGR is out of luck and has to rely on its own
attempts to parse the command line, so the results will vary until then.



If I type ?normal I get the long list, not "No documentation
found". When I type ?plot I get the help page for plot {JM}, and
not plot.default {graphics}; when I type ?dnorm I get a rather long
list of help pages.


If I type ??normal
I get
?normal.htm
.com.symantec.APSock
.com.symantec.aptmp
.DM_1039:1232634821l:DlnIrq
.DM_11869:1232818209l:m4AGyL
.DM_13345:1232655220l:C1js39
.DM_14309:1232822090l:e6wvqw
.DM_15688:1232659145l:ffZvPg
.DM_16640:1232825979l:n5TrAz
.DM_18040:1232662823l:Gb81yX
…

** Another JGR problem **

Help pages for newly installed packages are accessible only after  
JGR is restarted.




I can see what could cause that, but in theory that should affect all
html-based systems if R really doesn't update the links. I didn't
actually look, but it's possible that JGR just needs to call
make.packages.html() -- in effect, try calling that function and if
that solves your problem, that's what it is ...



Cheers,
S



Thanks,
MK

On Jan 24, 2009, at 8:54 PM, Rolf Turner wrote:


On 25/01/2009, at 2:33 PM, Michael Kubovy wrote:



…
(1) If I type ?normal because I forgot the name dnorm() I get a long
list of relevant pages. Getting to the right page is laborious.

(2) If I remember dnorm() and want to be reminded of the call, I also
get a list of pages.
…



…
If you type ``?normal'' you get a ``No documentation found'' message.

If you type ``??normal'' you indeed get a long list of pages, some of
which might be relevant.  (If you want help on ``dnorm'' then the relevant
page is stats::Normal.  And then typing ``?Normal'' gets you what you
want.  Which is somewhat on the obscure side of obvious, IMHO.)

If you type ``?dnorm'' then you get exactly what you want immediately.

Exactly?  Well, there's also info on pnorm, qnorm, and rnorm, but I
expect you can live with that.

…
Rolf Turner
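
The two kinds of lookup discussed above can also be called as ordinary functions, which avoids any GUI-specific handling of the ? and ?? shortcuts. This is a sketch for plain R; whether JGR routes these calls differently is not something it demonstrates.

```r
# Exact topic lookup: the function behind ?dnorm. Assigning the result
# avoids displaying the page when this is run as a script.
exact <- help("dnorm")

# Fuzzy search: the function behind ??normal, restricted here to the
# stats package to keep the list of hits short.
fuzzy <- help.search("normal", package = "stats")
```

Printing `exact` opens the help page, and printing `fuzzy` lists the matching topics.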


___
stats-rosuda-devel mailing list
stats-rosuda-de...@listserv.uni-augsburg.de
http://mailman.rz.uni-augsburg.de/mailman/listinfo/stats-rosuda-devel




Re: [R] HMISC package: wtd.table()

2009-01-26 Thread Peter Dalgaard

Frank E Harrell Jr wrote:

I'm still learning english myself.


Including capitalization rules?


--
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - (p.dalga...@biostat.ku.dk)  FAX: (+45) 35327907



[R] Power analysis for MANOVA?

2009-01-26 Thread Adam D. I. Kramer

Hello,

I have searched for, and failed to find, a program, script, or method to
conduct a power analysis for a MANOVA. My interest is a fairly simple case
of 5 dependent variables and a single two-level categorical predictor
(though the categories aren't balanced).

If anybody happens to know of a script that will do this in R, I'd
love to know of it! Otherwise, I'll see about writing one myself.

What I currently see is this, from help.search("power"):

stats::power.anova.test   Power calculations for balanced one-way
                          analysis of variance tests
stats::power.prop.test    Power calculations two sample test for
                          proportions
stats::power.t.test       Power calculations for one and two sample t
                          tests

Any references on power in MANOVA would also be helpful, though of
course I will do my own literature search for them as well.

Cordially,
Adam D. I. Kramer



Re: [R] Power analysis for MANOVA?

2009-01-26 Thread Mitchell Maltenfort
http://www.amazon.com/Statistical-Power-Analysis-Behavioral-Sciences/dp/0805802835

Cohen's book was in fact the basis for the pwr package at CRAN.

And it does have a MANOVA power analysis, which was left out of the
pwr package.






-- 
Due to the recession, requests for instant gratification will be
deferred until arrears in scheduled gratification have been satisfied.



Re: [R] Power analysis for MANOVA?

2009-01-26 Thread Stephan Kolassa

Hi Adam,

My (and, judging from previous traffic on R-help about power analyses, 
also some other people's) preferred approach is to simply simulate an 
effect size you would like to detect a couple of thousand times, run 
your proposed analysis and look how often you get significance. In your 
simple case, this should be quite easy.
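
A minimal sketch of that simulation approach for the two-group, five-response case, using only base R. The function name `simulate_manova_power`, the group sizes, the mean shifts in `delta`, and the assumption of independent unit-variance normal responses are all illustrative choices, not taken from the thread.

```r
# Estimate power by simulation: generate data under a chosen effect,
# run the MANOVA, and count how often the Pillai test is significant.
simulate_manova_power <- function(n1, n2, delta, n_sim = 500, alpha = 0.05) {
  group <- factor(rep(c("a", "b"), times = c(n1, n2)))
  shift <- outer(as.numeric(group == "b"), delta)  # rows of group b get delta

  one_p_value <- function(i) {
    # Independent standard-normal responses plus the group shift.
    y <- matrix(rnorm((n1 + n2) * length(delta)), ncol = length(delta)) + shift
    fit <- manova(y ~ group)
    summary(fit, test = "Pillai")$stats["group", "Pr(>F)"]
  }

  p_values <- vapply(seq_len(n_sim), one_p_value, numeric(1))
  mean(p_values < alpha)   # proportion of significant runs = estimated power
}

set.seed(123)
# Unbalanced groups, every response shifted by half a standard deviation.
power_alt  <- simulate_manova_power(60, 40, delta = rep(0.5, 5), n_sim = 200)
# No effect at all: the rejection rate should sit near alpha.
power_null <- simulate_manova_power(60, 40, delta = rep(0, 5), n_sim = 200)
```

Running the null case alongside the alternative is a cheap sanity check that the simulation and the test are wired together correctly.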


HTH,
Stephan




[R] Sweave'ing Danish characters

2009-01-26 Thread Peter Jepsen
Hi,

I am writing an Sweave document and am using 'xtable' to make frequency tables 
of diagnoses of people undergoing cholecystectomy. Some of these diagnoses 
contain Danish characters (æ, ø, and å), and these characters are all 
garbled in the Latex document after I run Sweave. The odd thing is, everything 
looks absolutely right in the R console, and if I enter the same Danish 
characters in a new variable, the new variable produces no problems?! 
Therefore, I cannot offer a reproducible example, but I am hoping nonetheless 
that someone can point me towards a solution.

To illustrate:

> library(xtable)
> library(Hmisc)
> rm(list=ls())
> load("u:/kirurgi/cholecystit/Chol_oprenset.Rdata")
>
> # The 3rd observation contains a diagnosis with Danish characters
> # ("Kræft i fordøjelsessystemet", meaning gastrointestinal cancer).
> test2 <- chol$nydiag[3]
>
> print(xtable(table(test2)))
% latex table generated in R 2.8.1 by xtable 1.5-4 package
% Mon Jan 26 23:31:37 2009
\begin{table}[ht]
\begin{center}
\begin{tabular}{rr}
  \hline
 & test2 \\
  \hline
Kræft i fordøjelsessystemet & 1 \\   # It looks right here, but in the .tex file the æ and ø are garbled
  \hline
\end{tabular}
\end{center}
\end{table}

> # This, on the other hand, works like a charm.
> print(xtable(table("Kræft i fordøjelsessystemet")))
% latex table generated in R 2.8.1 by xtable 1.5-4 package
% Mon Jan 26 23:36:53 2009
\begin{table}[ht]
\begin{center}
\begin{tabular}{rr}
  \hline
 & V1 \\
  \hline
Kræft i fordøjelsessystemet & 1 \\   # See, no problems here!
  \hline
\end{tabular}
\end{center}
\end{table}


I am using Windows Vista 64-bit and MikTex 2.7. 

Best regards,
Peter.

 sessionInfo()
R version 2.8.1 (2008-12-22) 
i386-pc-mingw32 

locale:
LC_COLLATE=Danish_Denmark.1252;LC_CTYPE=Danish_Denmark.1252;LC_MONETARY=Danish_Denmark.1252;LC_NUMERIC=C;LC_TIME=Danish_Denmark.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] Hmisc_3.4-4foreign_0.8-30 xtable_1.5-4  

loaded via a namespace (and not attached):
[1] cluster_1.11.12 grid_2.8.1  lattice_0.17-20 tools_2.8.1
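
Symptoms like these often point to an encoding mismatch between the strings stored in the data and the encoding the LaTeX file is read in, although the thread does not confirm that this is the cause here. The sketch below only shows how to inspect and convert the declared encoding of a string in R; the variable names are illustrative.

```r
# The diagnosis text, written with Unicode escapes so this file itself
# does not depend on any particular encoding.
x <- "Kr\u00e6ft i ford\u00f8jelsessystemet"

# Which encoding does R think the string is in?
Encoding(x)

# Convert to Latin-1, e.g. for a LaTeX file read as latin1 ...
x_latin1 <- iconv(x, from = "UTF-8", to = "latin1")
Encoding(x_latin1)

# ... and back to UTF-8 for a LaTeX file read as utf8.
x_utf8 <- enc2utf8(x_latin1)
```

Comparing Encoding() of the loaded column with that of a freshly typed string would show whether the two really differ; on the LaTeX side, the option given to the inputenc package has to match whatever encoding ends up in the .tex file.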



[R] suppressing time shift in plot of POSIXct object?

2009-01-26 Thread Jim Porzak
Friends,

I have a POSIXct vector located in EST timezone. When I plot against
it here in PST, the time axis is shifted 3 hours back in time. IOW,
plot adjusts for time zone difference. Now that's really great, if
that's what one wants. However, I want time axis to use actual times
in object (without any shift).

For example:

n <- 360
y <- rnorm(n)
t <- seq(from = as.POSIXct("2009-01-26 12:00:00", tz = "EST"), by = 60,
         length.out = n)
head(t)
# [1] "2009-01-26 12:00:00 EST" "2009-01-26 12:01:00 EST" "2009-01-26 12:02:00 EST"
# [4] "2009-01-26 12:03:00 EST" "2009-01-26 12:04:00 EST" "2009-01-26 12:05:00 EST"
Sys.timezone()
# [1] "PST"

# But doing:
plot(y ~ t, type = "l")

results in plot starting at 09:00 (here in California)

I've poked around in help, etc., but haven't found any way to force use of
the timezone in t.

What am I missing?
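
One workaround, offered as a sketch rather than as the intended answer, is to suppress the automatic axis and label the ticks yourself with format(), which accepts an explicit tz argument. The hourly tick spacing and the output file are illustrative choices.

```r
n <- 360
y <- rnorm(n)
t <- seq(from = as.POSIXct("2009-01-26 12:00:00", tz = "EST"),
         by = 60, length.out = n)

# Hourly tick positions, with labels formatted explicitly in EST so they
# do not depend on the time zone of the machine drawing the plot.
ticks  <- seq(from = t[1], to = t[n], by = "hour")
labels <- format(ticks, format = "%H:%M", tz = "EST")

# Draw to a temporary PDF so the sketch also runs non-interactively.
out_file <- tempfile(fileext = ".pdf")
pdf(out_file)
plot(as.numeric(t), y, type = "l", xaxt = "n", xlab = "Time (EST)")
axis(1, at = as.numeric(ticks), labels = labels)
invisible(dev.off())
```

Because the labels are plain character strings by the time they reach axis(), no further time zone conversion can be applied to them.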

TIA,
Jim Porzak
TGN.com
San Francisco, CA
http://www.linkedin.com/in/jimporzak
use R! Group SF: http://ia.meetup.com/67/



Re: [R] Power analysis for MANOVA?

2009-01-26 Thread Adam D. I. Kramer


On Mon, 26 Jan 2009, Stephan Kolassa wrote:


My (and, judging from previous traffic on R-help about power analyses,
also some other people's) preferred approach is to simply simulate an
effect size you would like to detect a couple of thousand times, run your
proposed analysis and look how often you get significance.  In your simple
case, this should be quite easy.


I actually don't have much experience running monte-carlo designs like
this...so while I'd certainly prefer a bootstrapping method like this one,
simulating the effect size given my constraints isn't something I've done
before.

The MANOVA procedure takes 5 dependent variables, and determines what
combination of the variables best discriminates the two levels of my
independent variable...then the discrimination rate is represented in the
statistic (Pillai's V=.00019), which is then tested (F[5,18653] = 0.71).  So
coming up with a set of constraints that would produce V=.00019 given my
data set doesn't quite sound trivial...so I'll go for the pwr library
reference mentioned earlier before I try this.  That said, if anyone can
refer me to a tool that will help me out (or an instruction manual for RNG),
I'd also be much obliged.
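
As a starting point for the data-generation step, here is a sketch of drawing correlated multivariate normal responses in base R via a Cholesky factor. The covariance matrix `sigma`, the mean vector `mu`, and the sample size are invented for illustration and would need to be replaced by values reflecting the effect one wants to detect.

```r
set.seed(42)
p <- 5       # number of dependent variables
n <- 5000    # number of simulated observations

# Assumed covariance: unit variances, common correlation of 0.3.
sigma <- matrix(0.3, nrow = p, ncol = p)
diag(sigma) <- 1

# Assumed group mean: a small shift on the first variable only.
mu <- c(0.2, 0, 0, 0, 0)

# chol() returns R with t(R) %*% R == sigma, so rows of z %*% R
# have covariance sigma when rows of z are independent N(0, I).
z <- matrix(rnorm(n * p), ncol = p)
y <- sweep(z %*% chol(sigma), 2, mu, "+")
```

Generating one such matrix per group, with different `mu`, gives data that can be passed straight to manova() inside a simulation loop.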

Many thanks,
Adam






Re: [R] Problem with colormodel in pdf driver

2009-01-26 Thread Luis Torgo

Prof Brian Ripley wrote:

On Mon, 26 Jan 2009, Luis Torgo wrote:


Greg Snow wrote:
You may want to consider a dotchart instead of a barplot.  Then you 
can distinguish between groups by using symbols, grouping, and 
labels rather than depending on colors/shades of grey.



Thanks Greg. The only problem is that I was trying to illustrate the 
use of barplot() ...


I guess for now I can always use the pdf() driver with the default 
RGB colormodel and then use command line tools (e.g. ImageMagick) to 
convert the resulting graphs to grayscale...


You won't be able to convert PDF to PDF with ImageMagick (possible 
with a helper).

Well, actually I can, and it worked perfectly. I just did:
$ mogrify *.pdf -type Grayscale

and all my PDFs got changed from RGB to grayscale. Maybe it is a difference 
between ImageMagick versions; mine is:


$ mogrify -version
Version: ImageMagick 6.3.7 08/21/08 Q16 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2008 ImageMagick Studio LLC




Thanks all for the help.


Or update your R, as the posting guide suggested.  It works in 
R-patched and R-devel.

That's good news. I'll give it a try, thanks.



--
Luis Torgo
  FEP/LIAAD - INESC Porto, LA   Phone : (+351) 22 339 20 93
  University of Porto   Fax   : (+351) 22 339 20 99
  R. de Ceuta, 118, 6o  email : lto...@liaad.up.pt
  4050-190 PORTO - PORTUGAL WWW   : http://www.liaad.up.pt/~ltorgo



Re: [R] Power analysis for MANOVA?

2009-01-26 Thread Charles C. Berry


If you know what a 'general linear hypothesis test' is see

http://cran.r-project.org/src/contrib/Archive/hpower/hpower_0.1-0.tar.gz

HTH,

Chuck





Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



[R] Error in segmented() output from segmented package

2009-01-26 Thread tsippel

Hi-
I'm getting the following error message when trying to use the segmented
function to look for breakpoints in my data.  

Error in segmented.glm(glm, seg.Z = ~segmentdist, psi = 2, control =
seg.control(display = F),  : 
  (Some) estimated psi out of its range

Here are some real data and the models I'm calling which gives the error
above.  

 segmentdist
 [1]  0.00  8.547576 12.700485 13.291767 15.701552 17.567891 18.936836
19.846242 20.325434 20.397607 20.066126 17.976218 16.772871 16.513030
16.434075
[16] 16.508426 16.717404 17.049235 17.501350 18.077070

 dal
 [1] 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5
9.0 9.5

lm <- lm(data=df, segmentdist~dal)

lm(formula = segmentdist ~ dal, data = df)

Coefficients:
(Intercept)  dal  
   13.77564 -0.06682 

seg <- segmented(lm, seg.Z=~segmentdist, psi=2,
control=seg.control(display=F), model.frame=T)

The range of the data I'm looking for breaks in is min=0, max=44.5, so I
don't understand how my psi=2 could be out of range.  

Thanks for your help,

Tim
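
One possible reading, offered as an assumption since the thread has no reply: segmented() expects seg.Z to name a covariate of the fitted model (here dal, not the response segmentdist), and the starting psi has to fall inside that covariate's range. The base-R sketch below uses the posted data to check a starting value against the covariate range and to find a rough breakpoint by grid search. The grid search is a stand-in technique, not the algorithm used by the segmented package, and the helper names are hypothetical.

```r
# Data as posted in the message.
dal <- seq(0, 9.5, by = 0.5)
segmentdist <- c(0.00, 8.547576, 12.700485, 13.291767, 15.701552, 17.567891,
                 18.936836, 19.846242, 20.325434, 20.397607, 20.066126,
                 17.976218, 16.772871, 16.513030, 16.434075, 16.508426,
                 16.717404, 17.049235, 17.501350, 18.077070)

# A breakpoint has to lie strictly inside the range of the covariate.
psi_in_range <- function(psi, x) psi > min(x) && psi < max(x)

# Grid search (not the segmented algorithm): for each candidate psi, fit a
# broken-stick model and keep the psi with the smallest residual sum of squares.
grid_breakpoint <- function(y, x, candidates) {
  rss <- sapply(candidates, function(psi) {
    deviance(lm(y ~ x + pmax(x - psi, 0)))
  })
  candidates[which.min(rss)]
}

candidates <- seq(0.5, 9, by = 0.25)
psi_hat <- grid_breakpoint(segmentdist, dal, candidates)
```

A value found this way could serve as a starting psi for segmented(), with seg.Z = ~dal.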
-- 
View this message in context: 
http://www.nabble.com/Error-in-segmented%28%29-output-from-segmented-package-tp21674240p21674240.html
Sent from the R help mailing list archive at Nabble.com.



[R] why two diff. se in nlsList?

2009-01-26 Thread Tao Shi

Hi list,

In the object returned by summary.nlsList, what's the difference between
coefficients and parameters?  They have the same estimates, different standard
errors (and therefore t values), but the same p-values.

R 2.8.0 on Windows XP with nlme_3.1-89

Thanks,

...Tao

+

> library(nlme)
> fm1 <- nlsList(uptake ~ SSasympOff(conc, Asym, lrc, c0),
+    data = CO2, start = c(Asym = 30, lrc = -4.5, c0 = 52))

> summary(fm1)$para[,,1]
    Estimate Std. Error  t value     Pr(>|t|)
Qn1 38.13977  0.9911148 38.48169 1.991990e-06
Qn2 42.87169  1.0932089 39.21638 2.583953e-06
Qn3 44.22800  1.0241029 43.18706 1.809264e-07
Qc1 36.42874  1.1941594 30.50576 1.140085e-05
Qc3 40.68373  1.2480923 32.59673 1.424635e-04
Qc2 39.81950  1.0167249 39.16447 2.692304e-06
Mn3 28.48286  1.0624246 26.80930 1.066434e-06
Mn2 32.12827  1.0174826 31.57624 3.488786e-06
Mn1 34.08482  1.3400596 25.43530 4.199333e-06
Mc2 13.55519  1.0506404 12.90184 4.385886e-06
Mc3 18.53506  0.8363371 22.16219 1.461563e-06
Mc1 21.78723  1.4113318 15.43735 5.756870e-06

> summary(fm1)$coef[,,1]
    Estimate Std. Error  t value     Pr(>|t|)
Qn1 38.13977  0.9163882 41.61967 1.991990e-06
Qn2 42.87169  1.0994599 38.99341 2.583953e-06
Qn3 44.22800  0.5829894 75.86415 1.809264e-07
Qc1 36.42874  1.3556273 26.87224 1.140085e-05
Qc3 40.68373  2.8632576 14.20890 1.424635e-04
Qc2 39.81950  1.0317496 38.59415 2.692304e-06
Mn3 28.48286  0.5852408 48.66861 1.066434e-06
Mn2 32.12827  0.8883225 36.16735 3.488786e-06
Mn1 34.08482  0.9872439 34.52522 4.199333e-06
Mc2 13.55519  0.3969189 34.15104 4.385886e-06
Mc3 18.53506  0.4121147 44.97549 1.461563e-06
Mc1 21.78723  0.6830001 31.89930 5.756870e-06




[R] Error in Surv(time, status) : Time variable is not numeric

2009-01-26 Thread Braem M

Dear,

I want to analyze two-level survival data using a shared frailty model, for
which I want to use the R package 'frailtypack', proposed by Rondeau et al.
The dataset was built using SAS software. I also tried to change the format
using SPSS and Excel.

My (reduced) dataset has the following column names:
ID entrytimestatusfamily var1

I used the following command:
> frailtyPenal(Surv(time, status) ~var1 + cluster(family), Frailty=TRUE
+ ,n.knots=8, kappa1=1500,
+ cross.validation=FALSE)

And got this error :
Error in Surv(time, status) : Time variable is not numeric
In addition: Warning message:
In is.na(time) : is.na() applied to non-(list or vector) of type 'closure'

I think R transforms the data when importing into R, so that the
observations are not numeric anymore.

Does anyone know how to handle this problem?

Thanks,

Marie
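
The warning about type 'closure' suggests that the name time was resolved to R's built-in function time() rather than to a column of the data set, which would happen if the call cannot see the data frame. The base-R sketch below illustrates that lookup behaviour only. The data frame surv_data and its values are hypothetical stand-ins, not the poster's data.

```r
# Hypothetical stand-in for the imported data set (column names assumed).
surv_data <- data.frame(time = c(5, 8, 12), status = c(1, 0, 1))

# With no data frame in scope, the name `time` resolves to the function
# stats::time(), which is a closure, not a numeric vector.
bare_lookup_is_numeric <- is.numeric(time)

# Evaluated inside the data frame, `time` is the numeric column.
column_is_numeric <- with(surv_data, is.numeric(time))
```

If this is indeed the cause, supplying the data frame through the modelling function's data argument should let Surv() find the columns; that is an assumption to check against the package's help page.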
-- 
View this message in context: 
http://www.nabble.com/Error-in-Surv%28time%2C-status%29-%3A-Time-variable-is-not-numeric-tp21674025p21674025.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Power analysis for MANOVA?

2009-01-26 Thread Adam D. I. Kramer


On Mon, 26 Jan 2009, Charles C. Berry wrote:



If you know what a 'general linear hypothesis test' is see

http://cran.r-project.org/src/contrib/Archive/hpower/hpower_0.1-0.tar.gz



I do, and am quite interested, however this package will not install on R
2.8.1: First, it said that there was no maintainer in the description, so
I added one (figuring that the 1991 date of the package was to blame),
however it still will not compile:

parmesan:tmp$ sudo R CMD INSTALL hpower/
* Installing to library '/usr/local/lib/R/library'
* Installing *source* package 'hpower' ...
** R
** preparing package for lazy loading
Error in parse(n = -1, file = file) : unexpected '{' at
5: ##
6: pfnc_function(q,df1,df2,lm,iprec=c(6)) {
Calls: <Anonymous> -> code2LazyLoadDB -> sys.source -> parse
Execution halted
ERROR: lazy loading failed for package 'hpower'
** Removing '/usr/local/lib/R/library/hpower'
parmesan:tmp$

...any tips?

--Adam




Re: [R] suppressing time shift in plot of POSIXct object?

2009-01-26 Thread jim holtman
Try:

Sys.setenv(TZ="EST")
plot(y ~ t, type = "l")

You can save TZ before you set it and then restore it.
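
Saving and restoring TZ could look roughly like the sketch below. The helper name with_tz is made up for illustration; it relies only on Sys.getenv(), Sys.setenv() and Sys.unsetenv() from base R, and it assumes the time zone name passed in is known to the system.

```r
# Evaluate `expr` with the TZ environment variable temporarily set to `tz`,
# restoring the previous state (set or unset) afterwards.
with_tz <- function(tz, expr) {
  old_tz <- Sys.getenv("TZ", unset = NA)
  Sys.setenv(TZ = tz)
  on.exit(if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz))
  force(expr)  # expr is evaluated lazily, i.e. only after TZ has been changed
}

tz_before <- Sys.getenv("TZ", unset = NA)
x <- as.POSIXct("2009-01-26 17:00:00", tz = "UTC")
attr(x, "tzone") <- NULL  # drop the attribute so the session time zone is used
shown <- with_tz("EST", format(x, "%H:%M"))
tz_after <- Sys.getenv("TZ", unset = NA)
```

For the plot in question the call would be with_tz("EST", plot(y ~ t, type = "l")).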

On Mon, Jan 26, 2009 at 5:47 PM, Jim Porzak jpor...@gmail.com wrote:
 Friends,

 I have a POSIXct vector located in EST timezone. When I plot against
 it here in PST, the time axis is shifted 3 hours back in time. IOW,
 plot adjusts for time zone difference. Now that's really great, if
 that's what one wants. However, I want time axis to use actual times
 in object (without any shift).

 For example:

 n <- 360
 y <- rnorm(n)
 t <- seq(from = as.POSIXct("2009-01-26 12:00:00", tz = "EST"), by =
 60, length.out = n)
 head(t)
 # [1] "2009-01-26 12:00:00 EST" "2009-01-26 12:01:00 EST" "2009-01-26 12:02:00 EST"
 # [4] "2009-01-26 12:03:00 EST" "2009-01-26 12:04:00 EST" "2009-01-26 12:05:00 EST"
 Sys.timezone()
 # [1] "PST"

 #But doing:
 plot(y ~ t, type = "l")

 results in plot starting at 09:00 (here in California)

 I've poked around in help, etc. but haven't found any way to force use of
 the timezone in t.

 What am I missing?

 TIA,
 Jim Porzak
 TGN.com
 San Francisco, CA
 http://www.linkedin.com/in/jimporzak
 use R! Group SF: http://ia.meetup.com/67/





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] working with tables -- was Re: Mode (statistics) in R?

2009-01-26 Thread Carl Witthoft

Ok, so I'm slowly figuring out what a factor is, and was able to follow
the related thread about finding a mode by using constructs like

my_mode = as.numeric(names(table(x))[which.max(table(x))])


Now, suppose I want to keep looking for other modes?  For example,

Rgames> sample(seq(1,10),50,replace=TRUE)->bag
Rgames> bag
 [1]  2  8  8 10  7  3  2  9  8  3  8  9  6  6 10 10  7  1
[19]  9  5  4  3  3  5 10  3  6  3  2  8  4  2  1 10  6  2
[37]  6  6  9  8  6  8  8  4  3  6  3  9  5  1
Rgames> names(which.max(table(bag)))
[1] "3"

I can then do

Rgames> bag2<-bag[bag!=3]

and repeat the which.max stuff.
I came up with the following command to find the actual magnitude of the 
mode:


Rgames> table(bag)->tbag
Rgames> tbag
bag
 1  2  3  4  5  6  7  8  9 10
 3  5  8  3  3  8  2  8  5  5

Rgames> tbag[dimnames(tbag)$bag==3]->bagmode
Rgames> bagmode
3
8


Related to this, since bag2 is now bereft of threes,
Rgames> table(bag2)
bag2
 1  2  4  5  6  7  8  9 10
 3  5  3  3  8  2  8  5  5

I was able to make the same table with

Rgames> newtable<-tbag[c(dimnames(tbag)$bag)!=3]
Rgames> newtable
bag
 1  2  4  5  6  7  8  9 10
 3  5  3  3  8  2  8  5  5


Is there a cleaner syntax to do these things?

Thanks for your help--and feel free to point me to the Inferno or other 
paper on the philosophy and use of factors and tables.


Carl
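
One possibly cleaner way to do the steps above is to index the table by name and compare against its maximum, which also returns all tied modes at once. This is a sketch, not the only idiom; the vector below is rebuilt from the posted counts because the original bag was a random sample.

```r
# Rebuild a vector with the same counts as the posted table
# (the original `bag` was a random sample, so only the counts are reused).
bag <- rep(1:10, times = c(3, 5, 8, 3, 3, 8, 2, 8, 5, 5))
tbag <- table(bag)

all_modes  <- as.numeric(names(tbag)[tbag == max(tbag)])  # every tied mode
mode_count <- max(tbag)                                   # how often a mode occurs
count_of_3 <- tbag[["3"]]                                 # count for one value, by name
without_3  <- tbag[names(tbag) != "3"]                    # table with that value dropped
ranked     <- sort(tbag, decreasing = TRUE)               # values ordered by frequency
```

Because the names of a table are character strings, indexing with "3" picks the entry labelled 3 rather than the third entry.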



Re: [R] The Quality & Accuracy of R

2009-01-26 Thread Gabor Grothendieck
It would be possible to develop tools that produce code coverage
statistics, quantifying the percentage of the code that the tests
exercise.

On Fri, Jan 23, 2009 at 10:04 AM, Muenchen, Robert A (Bob)
muenc...@utk.edu wrote:
 Hi All,



 We have all had to face skeptical colleagues asking if software made by
 volunteers could match the quality and accuracy of commercially written
 software. Thanks to the prompting of a recent R-help thread, I read "R:
 Regulatory Compliance and Validation Issues, A Guidance Document for the
 Use of R in Regulated Clinical Trial Environments"
 (http://www.r-project.org/doc/R-FDA.pdf). This is an important document,
 of interest to the general R community. The question of R's accuracy is
 such a frequent one that it would be beneficial to increase the visibility
 of the non-clinical information it contains. A document aimed at a
 general audience, entitled something like "R: Controlling Quality and
 Assuring Accuracy", could be compiled from these sections:



 1.  What is R? (section 4)

 2.  The R Foundation for Statistical Computing (section  3)

 3.  The Scope of this Guidance Document (section 2)

 4.  Software Development Life Cycle (section 6)



 Marc Schwartz, Frank Harrell, Anthony Rossini, Ian Francis and others
 did such a great job that very few words would need to change. The only
 addition I suggest is to mention how well R did in Keeling & Pavur's
 "A comparative study of the reliability of nine statistical software
 packages", May 1, 2007, Computational Statistics & Data Analysis, Vol. 51,
 pp 3811-3831.



 Given the importance of this issue, I would like to see such a document
 added to the PDF manuals in R's Help.



 The document mentions (Sect. 6.3) that a set of validation tests, data
 and known results are available. It would be useful to have an option to
 run that test suite in every R installation, providing clear progress
 messages such as "Validating accuracy of t-tests..." and "Validating
 accuracy of linear regression...". Whether or not people chose to run the
 tests, they would at least know that such tests are available. Back in my
 mainframe installation days, this step was part of many software
 installations and it certainly gave the impression that those were the
 companies that took accuracy seriously. Of course the other companies
 probably just ran their validation suite before shipping, but seeing it
 happen had a tremendous impact.  I don't know how much this would add to
 the download, but if it was too much, perhaps it could be implemented as a
 separate download.



 I hope these suggestions can help mitigate the concerns so many non-R
 users have.



 Cheers,

 Bob



 =

 Bob Muenchen (pronounced Min'-chen),

 Manager, Research Computing Support

 U of TN Office of Information Technology

 Stokely Management Center, Suite 200

 916 Volunteer Blvd., Knoxville, TN 37996-0520

 Voice: (865) 974-5230

 FAX: (865) 974-4810

 Email: muenc...@utk.edu

 Web: http://oit.utk.edu/research http://oit.utk.edu/scc

 Map to Office: http://www.utk.edu/maps

 Newsletter: http://listserv.utk.edu/archives/rcnews.html
 http://listserv.utk.edu/archives/statnews.html

 =






Re: [R] The Quality & Accuracy of R

2009-01-26 Thread Muenchen, Robert A (Bob)
That's a great idea. I know of no commercial vendors who provide such
detailed info.

Bob

-Original Message-
From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] 
Sent: Monday, January 26, 2009 7:52 PM
To: Muenchen, Robert A (Bob)
Cc: R-help@r-project.org
Subject: Re: [R] The Quality & Accuracy of R

It would be possible to develop tools to develop code coverage
statistics quantifying the percent of the code that the tests
exercise.



[R] Pausing processing into an interactive session

2009-01-26 Thread Zhou Fang
Hi all,

As a possibly silly request, is it possible to interactively pause an
R calculation and do a browser(), say, without browser or other debug
handlers being explicitly included in the code?

Imagine the following situation:

You write up a big calculation for R to run. We are talking
hours here, or worse. A few hours into the calculation, you decide
that you want to check on how it's going. Unfortunately, you didn't
foresee the output you really want to check on. Oops.

What would seem ideal is something like this: as well as Ctrl-C, which
would terminate the current computation, we really want some key combo
perhaps that would pause the computation, perhaps at the next
'reasonable spot'. (Not Ctrl-Z either, as it doesn't let you look at
what's going on in the program). Then you can examine variables, for
example. Maybe even tweak them manually. And press the key to resume
the calculation.

Is this already possible somehow? Can it be made possible? Or would
there not be any point?

Thanks,

Zhou



[R] WhisperStation R

2009-01-26 Thread zerfetzen

What do you think of this:

http://www.microway.com/whisperstation/whisperstation-r.html

I'm considering ditching my Windows Vista 2 GB RAM computer for
WhisperStation R using Debian 64-bit Linux with 32 GB RAM and setting the
whole thing up for R and WinBUGS.  I put in a price request, but I know
nothing about Linux, or WhisperStation R for that matter, and am really
curious what you think?
-- 
View this message in context: 
http://www.nabble.com/WhisperStation-R-tp21678280p21678280.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Running R under Sun Grid Engine with OpenMPI tight integration

2009-01-26 Thread Peter Waltman
Hi -

I saw your posting on the R-help mailing list.  Were you ever able to get
this working?  Did you end up switching to the rsge library?

I'm trying to do the same, and not having very much luck getting it going.

Thanks!

Peter Waltman



[R] Problem with loading RMySQL under sge/qsub

2009-01-26 Thread Peter Waltman
Hi -

I'm trying to set up a parallelized batch job that is run under rmpi and
managed by sge, using qsub, but it reports that it can't load RMySQL because
it can't find the libmysqlclient.so.15 file.

Note, when I run R interactively, and manually load the RMySQL library, it
works without a hitch, however, when I have qsub launch R, it reports the
following error:

Error in dyn.load(file, DLLpath = DLLpath, ...) :
unable to load shared library
'/home/install/usr/apps/R-2.8.0/lib64/RMySQL/libs/RMySQL.so':
libmysqlclient.so.15: cannot open shared object file: No such file or
directory

On the web, I found this posting to this list:
http://tolstoy.newcastle.edu.au/R/e2/help/07/03/12876.html, which recommends
setting the LD_LIBRARY_PATH env var to the location of the
libmysqlclient.so.15 file.

I've set that in my .bashrc, and use the '-V' switch to qsub to make sure
I'm exporting my environment variables to qsub, but still get the error.
I've also double checked the qsub job's status, with qstat -j jobid and
the LD_LIBRARY_PATH is set to what I've set it to.

Since it only happens under qsub, I think it's got to be something with
either how I'm calling qsub or how sge is configured, but I can't figure out
which, or what the problem is.

Can anyone suggest a workaround, or make a suggestion?  I'm really stuck
here.

Thanks!



Re: [R] Power analysis for MANOVA?

2009-01-26 Thread Charles C. Berry

On Mon, 26 Jan 2009, Adam D. I. Kramer wrote:



On Mon, 26 Jan 2009, Charles C. Berry wrote:



 If you know what a 'general linear hypothesis test' is see

  http://cran.r-project.org/src/contrib/Archive/hpower/hpower_0.1-0.tar.gz



I do, and am quite interested, however this package will not install on R
2.8.1: First, it said that there was no maintainer in the description, so
I added one (figuring that the 1991 date of the package was to blame),
however it still will not compile:

parmesan:tmp$ sudo R CMD INSTALL hpower/
* Installing to library '/usr/local/lib/R/library'
* Installing *source* package 'hpower' ...
** R
** preparing package for lazy loading
Error in parse(n = -1, file = file) : unexpected '{' at
5: ##
6: pfnc_function(q,df1,df2,lm,iprec=c(6)) {

_^_

AHA!

That underscore is the old 'assignment' operator - now no longer allowed.

Do a global replace of '_' with ' <- ' in the R/*.R files and it should
install.


HTH,

Chuck
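
The global replace could be done from R itself, roughly as sketched below. The helper name replace_underscore_assign is made up for illustration, and the demonstration runs on a temporary file containing a shortened version of the line from the error message. Note that a blind replace also changes any underscore inside strings, comments or object names, so the result is worth a quick review.

```r
# Replace every "_" with " <- " in the given files, rewriting them in place.
# Caution: this also touches underscores in strings, comments and names.
replace_underscore_assign <- function(files) {
  for (f in files) {
    src <- readLines(f, warn = FALSE)
    writeLines(gsub("_", " <- ", src, fixed = TRUE), f)
  }
  invisible(files)
}

# Demonstration on a temporary file (a shortened stand-in for the real source).
tmp <- tempfile(fileext = ".R")
writeLines("pfnc_function(q,df1,df2,lm,iprec=c(6)) { q }", tmp)
replace_underscore_assign(tmp)
fixed <- readLines(tmp)
parsed <- parse(text = fixed)  # parses cleanly once the old operator is gone
```

For the package itself the call would be something like replace_underscore_assign(list.files("hpower/R", pattern = "\\.R$", full.names = TRUE)), run on a copy of the sources.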



Calls: <Anonymous> -> code2LazyLoadDB -> sys.source -> parse
Execution halted
ERROR: lazy loading failed for package 'hpower'
** Removing '/usr/local/lib/R/library/hpower'
parmesan:tmp$

...any tips?

--Adam




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


Re: [R] WhisperStation R

2009-01-26 Thread andrew
Any idea why DDR2 667 MHz RAM isn't used instead of DDR?  I thought
that DDR 400 MHz was almost out of production...

On Jan 27, 1:01 pm, zerfetzen zerfet...@yahoo.com wrote:
 What do you think of this:

 http://www.microway.com/whisperstation/whisperstation-r.html

 I'm considering ditching my Windows Vista 2 GB RAM computer for
 WhisperStation R using Debian 64-bit Linux with 32 GB RAM and setting the
 whole thing up for R and WinBUGS.  I put in a price request, but I know
 nothing about Linux, or WhisperStation R for that matter, and am really
 curious what you think?


Re: [R] Running R under Sun Grid Engine with OpenMPI tight integration

2009-01-26 Thread Rainer M Krug
On Tue, Jan 27, 2009 at 2:30 AM, Peter Waltman peter.walt...@gmail.com wrote:
 Hi -

 I saw your posting on the R-help mailing list.  Were you ever able to get
 this working?  did you end up switching to use the rsge library?

Yes - that is exactly what I did - I am using rsge or, which is in
most cases sufficient for me, starting several instances of R and running
the whole simulation (array processing).

But I would still like to know how I can use the Rmpi and snow on the
Sun Grid Engine.

Please keep me posted,

Rainer

 I'm trying to do the same, and not having very much luck getting it going.

 Thanks!

 Peter Waltman




-- 
Rainer M. Krug, PhD (Conservation Ecology, SUN), MSc (Conservation
Biology, UCT), Dipl. Phys. (Germany)

Centre of Excellence for Invasion Biology
Faculty of Science
Natural Sciences Building
Private Bag X1
University of Stellenbosch
Matieland 7602
South Africa



[R] Control of Quartz Window Location

2009-01-26 Thread Larry Weldon

If I use
plot(1:10)
quartz()
plot(1:10)

I get the second graph window almost on top of the first graph window.
How can I control the location of the quartz window?

Larry Weldon
Simon Fraser University
wel...@sfu.ca
www.stat.sfu.ca/~weldon




[R] How do you specify font family in png output; png cross-platform issues

2009-01-26 Thread Paul Johnson
For teaching purposes, I prepared a little R program. I want to give
this to students who can run it and dump out many formats and then
compare their use in LaTeX documents.  I do not have too much trouble
with xfig or postscript format, but I've really run into a roadblock
where png files are concerned.

My original problem was that the png device does not accept a family option.
How can I have png output with the Times family to compare with the
postscript or pdf output?

While searching for information on this, I discovered there have been
a lot of R changes in png support.  If I give this script to people
with Mac or Windows, what are the chances that it will work?  If I'm
reading the png help page correctly, there are different types
available, Xlib and cairo, but I don't understand what all that means
for sending a program like this across systems. (fear the worst, but
ask hoping for best).

As far as I understand it, the paper="special" option is needed so
that the eps or pdf output will fit into a document without creating
really huge margins around the graph. Correct?


x <- rnorm(333)
y <- rnorm(333)

# Draw on the default screen device first
plot(x, y, xlab = "Input Variable", ylab = "Output Variable")

# xfig output
xfig(file = "testplot.fig", horizontal = FALSE, height = 6, width = 6,
     family = "Times")
plot(x, y, xlab = "Input Variable", ylab = "Output Variable")
dev.off()

# EPS output, 6 x 6 inches
postscript(file = "testplot-1.eps", horizontal = FALSE, height = 6, width = 6,
           family = "Times", onefile = FALSE, paper = "special")
plot(x, y, xlab = "Input Variable", ylab = "Output Variable")
dev.off()

# EPS output, 4 x 4 inches
postscript(file = "testplot-2.eps", horizontal = FALSE, height = 4, width = 4,
           family = "Times", onefile = FALSE, paper = "special")
plot(x, y, xlab = "Input Variable", ylab = "Output Variable")
dev.off()

# PDF output
pdf(file = "testplot-1.pdf", height = 6, width = 6,
    family = "Times", onefile = FALSE, paper = "special")
plot(x, y, xlab = "Input Variable", ylab = "Output Variable")
dev.off()

# PNG output, Xlib type
png(filename = "testplot-1.png", height = 350, width = 550, type = "Xlib")
plot(x, y, xlab = "Input Variable", ylab = "Output Variable")
dev.off()

# PNG output, cairo type
png(filename = "testplot-2.png", height = 350, width = 550, type = "cairo")
plot(x, y, xlab = "Input Variable", ylab = "Output Variable")
dev.off()
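
One possible workaround for the missing family option, offered as a sketch rather than a confirmed answer: open the png device without a family and then set the font family with par() before plotting. The generic name "serif" is used on the assumption that it maps to a Times-like font on most systems; the output path and the capabilities() check for choosing the type are also choices made for this illustration.

```r
x <- rnorm(333)
y <- rnorm(333)

out <- file.path(tempdir(), "testplot-serif.png")  # assumed output path

# Use the cairo type only when this build of R reports it as available;
# otherwise fall back to the platform's default png type.
if (capabilities("cairo")) {
  png(filename = out, height = 350, width = 550, type = "cairo")
} else {
  png(filename = out, height = 350, width = 550)
}

par(family = "serif")  # generic family name; assumed to map to a Times-like font
plot(x, y, xlab = "Input Variable", ylab = "Output Variable")
dev.off()
```

Because the type is chosen at run time, the same script has a better chance of running unchanged on Linux, Mac and Windows, although the fonts actually used may still differ between systems.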




Can I bother you about one last png issue?

While searching r-help, I see posts about the difference in png output
between type "Xlib" and "cairo".  For reasons I do not understand,
ordinary viewers like GQview or Firefox make cairo-produced png files
look "blurry" (in the words of posts on r-help).  The png output from
type="Xlib" is not blurry. This raises another level of
confusion about this exercise I'm devising.  Does R for Windows, as
provided on the CRAN system, use Xlib for png?

pj



-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas



Re: [R] Getting data from a PDF-file into R

2009-01-26 Thread joe1985




Peter Dalgaard wrote:
> 
> joe1985 wrote:
>> Hello
>> 
>> I have around 200 PDF documents containing data I want organized in R as
>> a data frame. The PDF documents look like this:
>> 
>> http://www.nabble.com/file/p21667074/PRRS-billede%2Bmed%2Bfarver.jpeg 
>> 
>> or like this:
>> 
>> http://www.nabble.com/file/p21667074/PRRS-billede%2Bmed%2Bfarver%2B2.jpeg 
>> 
>> So I want to pull out the data in the coloured boxes so that it becomes
>> organized like this (just in R instead of Excel):
>> 
>> http://www.nabble.com/file/p21667074/PRRS-billede%2Bexcel.jpeg 
>> 
>> The 0s and 1s work as follows. When "PRRS-neg" occurs on a particular
>> date, it is represented by a 0 in the columns PRRS-VAC and PRRS-DK.
>> "PRRS-pos VAC" or "Vac" is represented by a 1 in the column PRRS-VAC, and
>> "PRRS-pos DK" or "DK" by a 1 in the column PRRS-DK. Likewise, "sanVAC"
>> should give a 1 in the column VACsan, and "sanDK" a 1 in the column DKsan.
>> The first date for each CHR-nr should be either the earliest date in the
>> red box (as in the first picture) or the date preceded by the word "før"
>> (Danish for "before"), as in the second picture. All 200 PDF documents
>> look like the ones in the pictures, each representing a different CHR-nr.
>> 
>> Hope you can help me
> 
> Not on the basis of .jpeg files, I think. We'd need some indication of
> what the PDF looks like inside.  There's a tool called pdftotext, which
> might do something for you, IF you can figure out reliably where your
> data begin and end.
> 
> -- 
>    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
> ~~~~~~~~~~ - (p.dalga...@biostat.ku.dk)              FAX: (+45) 35327907
 
 
Thank you for your quick response.

Here they are as text files:

http://www.nabble.com/file/p21680833/Foersom%2B-%2B688.txt Foersom+-+688.txt 

http://www.nabble.com/file/p21680833/M%25C3%2598LLEVANG%2B602%2B.txt M%C3%98LLEVANG+602+.txt 
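
For reference, the pdftotext route suggested above could be driven from R roughly as follows. This is a minimal sketch, assuming the pdftotext command-line program is installed and on the PATH. The function names, the "PRRS" keyword filter and the folder name in the usage comment are hypothetical; working out where the data begin and end in the text is still the hard part.

```r
# Convert one PDF to text with the external pdftotext program and
# return the text as a character vector, one element per line.
pdf_to_lines <- function(pdf_file, exe = unname(Sys.which("pdftotext"))) {
  if (!nzchar(exe)) stop("pdftotext was not found on the PATH")
  txt_file <- tempfile(fileext = ".txt")
  on.exit(unlink(txt_file))
  # -layout keeps the physical page layout; -enc UTF-8 sets the output encoding
  args <- c("-layout", "-enc", "UTF-8", shQuote(pdf_file), shQuote(txt_file))
  status <- system2(exe, args)
  if (status != 0) stop("pdftotext failed for ", pdf_file)
  readLines(txt_file, warn = FALSE, encoding = "UTF-8")
}

# Keep only the lines that mention a keyword (assumed here: "PRRS").
keep_matching <- function(lines, pattern = "PRRS") {
  lines[grepl(pattern, lines, fixed = TRUE)]
}

# Hypothetical usage over a folder of PDFs (the folder name is an assumption):
# pdfs  <- list.files("pdf-folder", pattern = "\\.pdf$", full.names = TRUE)
# found <- lapply(pdfs, function(f) keep_matching(pdf_to_lines(f)))
```

Keeping the conversion and the filtering in separate functions means the filtering step can be developed against the text files linked above before any PDF is converted.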
-- 
View this message in context: 
http://www.nabble.com/Getting-data-from-a-PDF-file-into-R-tp21667074p21680833.html
Sent from the R help mailing list archive at Nabble.com.
