[R] p-Value

2007-08-27 Thread amna khan
Hi Sir

When we use the Kendall package to obtain Kendall's tau statistic,
we also get a two-sided p-value. What does a two-sided p-value mean?
The term "two-sided" is confusing.

Kindly provide help in this regard.


-- 
AMINA SHAHZADI
Department of Statistics
GC University Lahore, Pakistan.
Email:
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (coxph, se) Obtaining standard errors of coefficients from coxph to store

2007-08-27 Thread joris . dewolf


David,

It would be helpful to give an example of what you would like to extract.

I guess you know how to extract elements from vectors and lists.
However, the objects returned by functions can sometimes be rather complex
(the output of coxph() is...).
A general method to capture printed output is capture.output(). Maybe
not fast, but if you have no other solution...

Joris

> a <- rnorm(10,1,1)
> b <- rnorm(10,1,1)
> mod <- lm(a~b)
> smod <- summary(mod)
> smod

Call:
lm(formula = a ~ b)

Residuals:
    Min      1Q  Median      3Q     Max
-1.7482 -0.5991  0.1211  0.8341  1.4975

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.6210     0.5332   3.040   0.0161 *
b            -0.7667     0.5037  -1.522   0.1664
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.142 on 8 degrees of freedom
Multiple R-Squared: 0.2246, Adjusted R-squared: 0.1277
F-statistic: 2.317 on 1 and 8 DF,  p-value: 0.1664

> output <- capture.output(print(smod))
> output
 [1] ""
 [2] "Call:"
 [3] "lm(formula = a ~ b)"
 [4] ""
 [5] "Residuals:"
 [6] "    Min      1Q  Median      3Q     Max "
 [7] "-1.7482 -0.5991  0.1211  0.8341  1.4975 "
 [8] ""
 [9] "Coefficients:"
[10] "            Estimate Std. Error t value Pr(>|t|)  "
[11] "(Intercept)   1.6210     0.5332   3.040   0.0161 *"
[12] "b            -0.7667     0.5037  -1.522   0.1664  "
[13] "---"
[14] "Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 "
[15] ""
[16] "Residual standard error: 1.142 on 8 degrees of freedom"
[17] "Multiple R-Squared: 0.2246,\tAdjusted R-squared: 0.1277 "
[18] "F-statistic: 2.317 on 1 and 8 DF,  p-value: 0.1664 "
[19] ""





   
From: David Lloyd [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Date: 16/08/2007 11:31
Subject: [R] (coxph, se) Obtaining standard errors of coefficients from coxph to store




Hi all,

I want to be able to find and store the z-scores from the coxph fit below:

modz=coxph(Surv(TSURV,STATUS)~RAGE+DAGE+REG_WTIME_M+CLD_ISCH+POLY_VS,
data=kidneyT, method="breslow")


I know summary(modz) will give me this, but how do I extract the
standard error or z-score values in a similar way to obtaining the
coefficients with coef(modz)? I think it must be something to do with
modz$var, but I'm having a complete mental blank.

I need this info so I can write a function to use within a bootstrap so
I can record the number of times (proportion) each variable in the Cox
PH model is actually significant over all the bootstrap resamples.

Any assistance is greatly appreciated

DL
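One possible route (a sketch, not a tested answer for this exact model; it assumes the survival package's usual summary.coxph coefficient matrix, whose printed columns include "se(coef)" and "z"):

## Sketch: extract standard errors and z-scores from the fitted model 'modz'
sm <- summary(modz)$coefficients
se <- sm[, "se(coef)"]        # standard errors
z  <- sm[, "z"]               # z-scores

## Equivalent, using the variance matrix directly:
se2 <- sqrt(diag(modz$var))
z2  <- coef(modz) / se2

Either se/z pair can then be stored inside a bootstrap function just like coef(modz).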





 [[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Ptit_Bleu

Hi,

I recently discovered the R program and I thought it could be useful to me.
I have to analyse data saved as .Px files (x between 0 and 8; .P0 files have
18 lines at the beginning that I have to skip). New files are generated
every day.

This is my strategy:

In order to analyse the data, I first want to copy the new data into a
MySQL database (which already contains the previous data).
So the first task is to compare the list of the files in the directory
(object: rfichiers) to the list of the files already saved (object:
tfichiers). The list containing the new files is then given by
nfichiers <- setdiff(rfichiers, tfichiers).

It sounds easy ...
... but it doesn't work!

Up to now, I am able to connect to MySQL and, if the file tfichiers.r
doesn't exist, I can copy the data files to the MySQL database.
But if tfichiers.r already exists and there is no new file to save, it
ignores the condition if (nfichiers!=0) and saves all the files of the
directory to the database.

Is it a problem with the way I save tfichiers, or is it a problem with the
condition if (nfichiers!=0)?
Could you please give me some advice to correct my script (written with
Tinn-R)?

I thank you in advance for your help.
Have a nice week,
Ptit Bleu.

PS: Ptit Bleu means something like "complete newbie" in French, so please be
patient :-)
PPS: I hope you understand my French English.

--


# Connect to the MySQL database "database"

library(DBI)
library(RMySQL)
drv <- dbDriver("MySQL")
con <- dbConnect(drv, username="user", password="password", dbname="database",
host="localhost")


# Create the objects containing the file lists (rel = relative path)
# - in the directory: object rfichiers
# - already processed: object tfichiers
# - new since the last connection: object nfichiers
# chemin is the data storage directory
# RWork is the R working directory
# sep='' to avoid adding a space after "Mydata/"

setwd("D:/RWork")
chemin <- "d:/Mydata/"
relrfichiers <- dir(chemin, pattern=".P")
rfichiers <- paste(chemin, relrfichiers, sep='')

if (file.exists("tfichiers.r"))
  {
    tfichiers <- load("tfichiers.r")
    nfichiers <- setdiff(rfichiers, tfichiers)
  } else {
    nfichiers <- rfichiers
  }


# p0fichiers: files with the .P0 extension (files containing info lines that
# must not be loaded)
# pxfichiers: files with the extensions .P1, ..., .P8 (no info lines at the start)

if (nfichiers!=0)
{
  p0fichiers <- nfichiers[grep(".P0", nfichiers)]
  pxfichiers <- setdiff(nfichiers, p0fichiers)


# Merge the day and time columns so that variations can be plotted against time
# Each file listed in p0fichiers is loaded, skipping the first 18 lines,
# and the object jourheure receives the merge of the day column (V1) and the
# time column (V2)
# jourheure is copied back into the first data column
# The second column (containing the times), now redundant, is then removed
# The object donnees is copied into the MySQL database Mydata
# Note: R understands the day/month/year format - MySQL: year/month/day -
# stored as CHAR in MySQL

  for (i in 1:length(p0fichiers))
    {
      donnees <- read.table(p0fichiers[i], quote="\"", sep=";", dec=",",
skip=18)
      jourheure <- paste(donnees$V1, donnees$V2, sep=" ")
      donnees[1] <- jourheure
      donnees <- donnees[,-2]
#     assignTable(con, "Datatable", donnees, append=TRUE) - does not work
      dbWriteTable(con, "Datatable", donnees, append=TRUE)
      rm(donnees, jourheure)
    }


# Same for the files with a .Px extension, loading all the lines (skip=0)
# Possible improvement: write a function taking p0fichiers or pxfichiers as
# an argument

  for (i in 1:length(pxfichiers))
    {
      donnees <- read.table(pxfichiers[i], quote="\"", sep=";", dec=",",
skip=0)
      jourheure <- paste(donnees$V1, donnees$V2, sep=" ")
      donnees[1] <- jourheure
      donnees <- donnees[,-2]
#     assignTable(con, "Datatable", donnees, append=TRUE) - does not work
      dbWriteTable(con, "Datatable", donnees, append=TRUE)
      rm(donnees, jourheure)
    }
}

tfichiers <- rfichiers
save(rfichiers, file="tfichiers.r", ascii=TRUE)
rm(list=ls())

# Disconnect from MySQL

dbDisconnect(con)
-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12343236
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] I again with shorter message and script

2007-08-27 Thread Ptit_Bleu

Hi,

I realized that my first message and the script were (maybe) too long and
difficult to read.
So I tested this shorter one :

- 

setwd("D:/RWork")
chemin <- "d:/Mydata/"
relrfichiers <- dir(chemin, pattern=".P")
rfichiers <- paste(chemin, relrfichiers, sep='')

tfichiers <- rfichiers
save(tfichiers, file="tfichiers.r", ascii=TRUE)

if (file.exists("tfichiers.r"))
  {
    tfichiers <- load("tfichiers.r")
    nfichiers <- setdiff(rfichiers, tfichiers)
  }



The result is:
nfichiers is equal to rfichiers,
and when I ask for tfichiers, I obtain ... "tfichiers" :-(

I read ?save and saw the warning about the arguments, but I have no idea
how to solve this problem, which must be a basic one (but do not forget that
I'm a newbie and that I'm French :-)

Thanks again for your comments and help,
Ptit Bleu.

-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12343633
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Prof Brian Ripley

On Mon, 27 Aug 2007, Ptit_Bleu wrote:



Hi,

I recently discovered the R program and I thought it could be useful to me.
I have to analyse data saved as .Px file (x between 0 and 8 - .P0 files have
18 lines at the beginning that I have to skip). New files are generated
everyday.


relrfichiers <- dir(chemin, pattern=".P")

does not do that, though.  Better to use

dir(chemin, pattern="\\.P[0-8]$", full.names=TRUE)

or

Sys.glob(file.path(chemin, "*.P[0-8]"))



This is my strategy :

In order to analyse the data, I first want to copy the new data in a
database in MySQL (which already contains the previous data).
So the first task is to compare the list of the files in the directory
(object : rfichiers) to the list of the files already saved (object :
tfichiers). The list containing the new files is then given by
nfichiers-setdiff(rfichiers, tfichiers).

It sounds easy ...
... but it doesn't work !!!

Up to now, I'm am able to connect to MySQL and, if the file tfichiers.r
doesn't exist, I can copy data files to the MySQL database.
But if tfichiers.r already exists and there is no new file to save, it
ignores the condition if (nfichiers!=0) and save all the files of the
directory to the database.


What did you intend there?  It is not a test of no difference, but a test 
that each element of the difference is not 0, and furthermore if() 
expects a test of length one, not the length of nfichiers.  I suspect you 
intended to test length(nfichiers) > 0.


It often helps to print (or use str on) the objects you create.  Try this 
on


nfichiers
nfichiers!=0
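
For illustration, with an assumed toy value of nfichiers (not the poster's
actual file list), the difference between the two conditions:

nfichiers <- c("a.P0", "b.P1")                     # assumed toy value
nfichiers != 0                                     # a length-2 logical: TRUE TRUE
if (nfichiers != 0) cat("copying files\n")         # only the first element is used (with a warning)
if (length(nfichiers) > 0) cat("copying files\n")  # a single TRUE/FALSE condition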


Is it a problem with the way I save tfichiers or is it a problem with the
condition if (nfichiers!=0) ?


Saving in R save format with extension .r is going to confuse others. 
Extension .rda is conventional for save format (and I doubt you need an 
ascii save).



[rest of quoted message snipped]



--
Brian D. Ripley,  

Re: [R] Calculating diameters of cirkels in a picture.

2007-08-27 Thread Bartjoosen

Hi All,

I really would like to thank you for the answers. While I was searching for
edge detection and clustering algorithms, Moshe came up with a simple but
effective solution: use the area to find the diameter!

I tried Moshe's solution, but I couldn't figure out what you mean by
morphological closing and by labeling to split the images.
Could you please clarify this a bit?

Thanks for your support


Bart
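
As a rough sketch of the area-based idea Moshe outlines in the quoted message
below (steps 2, 4 and 5), here is some plain base-R code. The toy image, the
0.5 threshold and the 4-neighbour labelling are illustrative assumptions, not
part of the original suggestion.

## toy image: two dark discs on a light background, standing in for the photo
n <- 80
xy <- expand.grid(row = 1:n, col = 1:n)
img <- matrix(1, n, n)
img[(xy$row - 25)^2 + (xy$col - 25)^2 <= 10^2] <- 0
img[(xy$row - 55)^2 + (xy$col - 55)^2 <= 15^2] <- 0

## label connected components of a logical matrix by a simple flood fill
label_components <- function(bw) {
  lab <- matrix(0L, nrow(bw), ncol(bw))
  ncomp <- 0L
  for (i in seq_len(nrow(bw))) for (j in seq_len(ncol(bw))) {
    if (bw[i, j] && lab[i, j] == 0L) {
      ncomp <- ncomp + 1L
      queue <- list(c(i, j)); lab[i, j] <- ncomp
      while (length(queue) > 0) {            # grow the component from the seed pixel
        p <- queue[[1]]; queue <- queue[-1]
        for (d in list(c(-1, 0), c(1, 0), c(0, -1), c(0, 1))) {
          q <- p + d
          if (q[1] >= 1 && q[1] <= nrow(bw) && q[2] >= 1 && q[2] <= ncol(bw) &&
              bw[q[1], q[2]] && lab[q[1], q[2]] == 0L) {
            lab[q[1], q[2]] <- ncomp
            queue <- c(queue, list(q))
          }
        }
      }
    }
  }
  lab
}

bw  <- img < 0.5                              # step 2: threshold to black/white
lab <- label_components(bw)                   # step 4: connected components
areas <- table(lab[lab > 0])                  # step 5: pixel count per circle
diameters <- 2 * sqrt(as.numeric(areas) / pi) # from S = pi * R^2
diameters                                     # roughly 20 and 30 for the toy image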


Moshe Olshansky-2 wrote:
 
 Hi Bart,
 
 One more comment:
 
 You do not really need the morphological closing to
 close the holes inside the circles. Another
 possibility is to reverse the black-and-white picture,
 i.e. make the holes and background be 1 and the
 circles 0, label the connected components and then
 only the component which touches the boundaries is the
 background while all other components are holes and
 you can make them white (1) in the original
 black-and-white image.
 
 --- Moshe Olshansky [EMAIL PROTECTED] wrote:
 
 Hi Bart,
 
 I have never used image processing software in R (I
 was doing this with Matlab), but here is what I
 would
 have done algorithmically:
 1) convert the picture to gray-scale
 2) find a threshold value which separates the
 circles
 from the background and convert your image to black
 and white
 3) if the circles are far apart use morphological
 closing to fill in small holes inside the circles
 (may
 be do this several times)
 4) use labeling to split the image into connected
 components
 5) for each connected component get its area (the
 number of pixels) and use the formula S = Pi*R^2 to
 find the approximate radii.
 
 Regards,
 
 Moshe.
 
 --- Julian Burgos [EMAIL PROTECTED] wrote:
 
  Hi Bart,
  
  If you only have 36 circles, the fastest way would
  be to use some image 
  processing software and measure the circles by
  hand.  One option is to 
  use ImageJ, which you can download here
  
  http://rsb.info.nih.gov/ij/
  
  Julian
  
  Bart Joosen wrote:
   Hi,
  
   Maybe this is more a programming questions than
 a
  specific R-project question, but maybe there is
  someone who can point me in the right direction.
  
   I have a picture of circles which I took with a
  digital camera.
   Now I want to use the diameter of the circles in
  the picture for analysis in R.
   I can use pixmap to import the picture, but how do
  I find the outside of the circles and calculate the
  diameter?
   I found out that I can use the edci package, but
  then I need to preprocess the data to reduce the
  points, otherwise it takes a long time and my
  computer crashes.

   If you want to see such a picture, I cropped a
  larger one and highlighted the circle which is of
  interest.
   In the real world, this is a plate with 36 circles,
  which all should be measured.
   www.users.skynet.be/fa244930/fotos/outlined.jpg
  
  
   Thanks for your time
  
   Bart
[[alternative HTML version deleted]]
  
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
   and provide commented, minimal, self-contained,
  reproducible code.
  
  
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
  reproducible code.
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.

 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Calculating-diameters-of-cirkels-in-a-picture.-tf4319669.html#a12343143
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Ptit_Bleu

Dear Prof Ripley,

I thank you for your fast answer.
In order to follow your advice:

I deleted all the objects and the tfichiers.r file already created.
I changed every tfichiers.r in the script into tfichiers.rda.

Then I launched the script twice.
The first time, as tfichiers.rda didn't exist, it created one.
During the script, I got this warning:
1: la condition a une longueur > 1 et seul le premier élément est utilisé
in: if (nfichiers != 0)
(in my own words: the condition has a length greater than 1 and only
the first element is used in ...)
Below, you will find the results.

The second launch gave the same results for nfichiers and rfichiers, but for
tfichiers I obtained
"tfichiers".

Have you some ideas to help me (because I really have none ...)?
Again thank you,
Ptit Bleu.

--
FIRST LAUNCH

nfichiers
[1] d:/Mydata/31_07_07.P0   d:/Mydata/31_07_2007.P0
[3] d:/Mydata/31_07_2007.P1 d:/Mydata/31_07_2007.P2
[5] d:/Mydata/31_07_2007.P3

 nfichiers!=0
[1] TRUE TRUE TRUE TRUE TRUE

rfichiers
[1] d:/Mydata/31_07_07.P0   d:/Mydata/31_07_2007.P0
[3] d:/Mydata/31_07_2007.P1 d:/Mydata/31_07_2007.P2
[5] d:/Mydata/31_07_2007.P3

tfichiers
[1] d:/Mydata/31_07_07.P0   d:/Mydata/31_07_2007.P0
[3] d:/Mydata/31_07_2007.P1 d:/Mydata/31_07_2007.P2
[5] d:/Mydata/31_07_2007.P3
--

SECOND LAUNCH
with these changes in order not to change tfichiers.rda:
# tfichiers <- rfichiers
# save(tfichiers, file="tfichiers.rda")

nfichiers
[1] d:/Mydata/31_07_07.P0   d:/Mydata/31_07_2007.P0
[3] d:/Mydata/31_07_2007.P1 d:/Mydata/31_07_2007.P2
[5] d:/Mydata/31_07_2007.P3

 nfichiers!=0
[1] TRUE TRUE TRUE TRUE TRUE

rfichiers
[1] d:/Mydata/31_07_07.P0   d:/Mydata/31_07_2007.P0
[3] d:/Mydata/31_07_2007.P1 d:/Mydata/31_07_2007.P2
[5] d:/Mydata/31_07_2007.P3

tfichiers
tfichiers

 
-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12344036
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Column naming mystery

2007-08-27 Thread Werner Wernersen
Hi,

I hope somebody can help me explain what seems
mysterious to me.

I use this line on a data frame ae:
summaryBy(total_inflated+total~gr1, data=ae, FUN=sum,
na.rm=T)

It returns 3 columns as expected, and the columns gr1
and total_inflated.sum are correct, but the
total.sum column consists of only zeros, which is not
correct. The same happens when I rename
total_inflated to total.inflated or
totalinflated, but not when I rename it to
ttotal_inflated. In the latter case I get the
correct result also for the total.sum column.

Could anyone explain the rules for the column naming
to me?

Thank you very much in advance!
  Werner



__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with save or/and if (I think but maybe not ...)

2007-08-27 Thread Ptit_Bleu

Dear Prof Ripley,

You wrote:
What did you intend there?  It is not a test of no difference, but a test
that each element of the difference is not 0, and furthermore if()
expects a test of length one, not the length of nfichiers.  I suspect you
intended to test length(nfichiers) > 0.

And of course, you were right.
With the condition length(nfichiers) > 0, there is no more warning.
And I tested length(nfichiers) > 0 manually for different cases and it gave
the result I expected.

But I still have a problem with saving and retrieving tfichiers.
I keep looking at the help files and manually testing alternative scripts ...

Hoping to read you again,
Ptit Bleu. 

   
-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12344227
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [SOLVED] save/load - I finally found (to be honnest : jholtman found)

2007-08-27 Thread Ptit_Bleu

The post of jholtman gave me the solution:
http://www.nabble.com/problems-saving-and-loading-%28PLMset%29-objects-tf4179541.html#a11885136

Like Quin Wills, I was trying to assign tfichiers.rda to tfichiers.
I now just write load("tfichiers.rda") instead of
tfichiers <- load("tfichiers.rda").
And now it works ... for this part (because if the new files are only .P0 files,
there is a problem when the script tries to read a .P(not 0) file, as there is none).
But this is not so difficult to solve, even for me (I think, well, I hope).
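
For anyone hitting the same thing, a minimal sketch of what was going on
(file name and contents made up): save() stores objects under their own
names, and load() restores them under those same names, returning only the
names as a character vector.

tfichiers <- c("a.P0", "b.P1")          # made-up value, for illustration only
save(tfichiers, file = "tfichiers.rda")
rm(tfichiers)

x <- load("tfichiers.rda")   # restores the object 'tfichiers' into the workspace
x                            # [1] "tfichiers"   (load() returns the *name*)
tfichiers                    # [1] "a.P0" "b.P1" (the restored value)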

Thanks to Prof Ripley and to all people helping people like me (maybe one
day I will also be able to help people).
Have a nice week,
Ptit Bleu.




-- 
View this message in context: 
http://www.nabble.com/Problem-with-save-or-and-if-%28I-think-but-maybe-not-...%29-tf4333945.html#a12345123
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Monmonier algorithm

2007-08-27 Thread Thibaut Jombart
Hello,

Here is a late answer, but an answer nonetheless to the question I asked
almost one year ago on this list
(http://tolstoy.newcastle.edu.au/R/help/06/03/24318.html#24322qlink1):

  On Wed, 29 Mar 2006, Thibaut Jombart wrote:

  Hello list,

  does anyone know if Monmonier's algorithm is available in R? I've checked
  several spatial libraries, but I didn't find anything related to it.
  However, there is a lot of documentation and I may have missed it.

  Before coding it, I'd like to be sure it doesn't already exist.

  Googling, I found:

  http://www-med-physik.vu-wien.ac.at/staff/rub/abstracts/ISCB_2005.pdf

  which is a poster, and refers to using R for boundary finding, and
  other software for data management and display.  Perhaps the authors
  are able to help by making code available; the poster looks like a nice
  example of spatial data analysis.

 -- 
 Roger Bivand
 Economic Geography Section, Department of Economics, Norwegian School of
 Economics and Business Administration, Helleveien 30, N-5045 Bergen,
 Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
 e-mail: [EMAIL PROTECTED]

Basically, Monmonier's algorithm aims at finding maximum-difference
boundaries between geo-referenced objects. It requires a set of
georeferenced objects along with a matrix of distances among these objects.

Monmonier's algorithm is now implemented in the adegenet package
(http://pbil.univ-lyon1.fr/software/adegenet/). The main functions are
'monmonier' and 'optimize.monmonier'. Although the package is devoted to
genetic data analysis, these functions can handle other kinds of data as
well.

The main difference I can see between this implementation and the
original algorithm is that here the function uses objects connected by
a neighbouring graph rather than the polygons of a Voronoi tessellation.
Thus, a Delaunay triangulation can be used to recover the original
version of the algorithm, but other graphs are also possible (e.g.
Gabriel's graph).

Regards,

Thibaut.

-- 
##
Thibaut JOMBART
CNRS UMR 5558 - Laboratoire de Biométrie et Biologie Evolutive
Universite Lyon 1
43 bd du 11 novembre 1918
69622 Villeurbanne Cedex
Tél. : 04.72.43.29.35
Fax : 04.72.43.13.88
[EMAIL PROTECTED]
http://lbbe.univ-lyon1.fr/-Jombart-Thibault-.html?lang=en

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Confidence intervals for ccf()

2007-08-27 Thread Gustaf Rydevik
Hello,

This is not a purely R-question, but perhaps someone can help me anyway.

I am trying to estimate the correlation between two time series (which
are basically different types of measurements of the same
phenomenon), using both cor.test() (with "pearson" as method) and ccf().

Now, cor.test gives a confidence interval for the Pearson correlation,
while ccf does not. I've tried to use bootstrap methods to get a
confidence interval for the ccf, but no luck. It is a bit
tricky, since the time series are non-stationary, so I'm not sure
how to go about generating the bootstrap sample.

Does anyone have any ideas on how to do this, i.e get a confidence
interval for the ccf at different time lags?
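
One rough possibility (a sketch only, and not a real answer to the
non-stationarity issue): a moving-block bootstrap percentile interval for the
cross-correlation at a single lag. The toy series, the block length of 20 and
B = 999 below are all made-up illustrative choices.

set.seed(1)
x <- cumsum(rnorm(200)); y <- x + rnorm(200)   # toy pair of series

ccf_at_lag <- function(x, y, lag) {
  cc <- ccf(x, y, lag.max = max(abs(lag), 1), plot = FALSE)
  cc$acf[cc$lag == lag]
}

block_boot_ccf <- function(x, y, lag, B = 999, block = 20) {
  n <- length(x)
  starts <- seq_len(n - block + 1)
  replicate(B, {
    s <- sample(starts, ceiling(n / block), replace = TRUE)
    idx <- as.vector(sapply(s, function(k) k:(k + block - 1)))[1:n]
    ccf_at_lag(x[idx], y[idx], lag)
  })
}

boot_vals <- block_boot_ccf(x, y, lag = 0)
quantile(boot_vals, c(0.025, 0.975))           # crude percentile interval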

Many thanks in advance,

Gustaf

-- 
Gustaf Rydevik, M.Sci.
tel: +46(0)703 051 451
address:Essingetorget 40,112 66 Stockholm, SE
skype:gustaf_rydevik

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] [R-pkgs] proftools package now available from CRAN

2007-08-27 Thread Luke Tierney
PROFILE OUTPUT PROCESSING TOOLS FOR R
=====================================


This package provides some simple tools for examining Rprof output
and, in particular, extracting and viewing call graph information.
Call graph information, including which direct calls were observed
and how much time was spent in these calls, can be very useful in
identifying performance bottlenecks.

One important caution: because of lazy evaluation a nested call
f(g(x)) will appear on the profile call stack as if g had been called
by f or one of f's callees, because it is the point at which the value
of g(x) is first needed that triggers the evaluation.


EXPORTED FUNCTIONS

The package exports five functions:

 readProfileData reads the data in the file produced by Rprof into a
 data structure used by the other functions in the package.
 The format of the data structure is subject to change.

 flatProfile is similar to summaryRprof.  It returns either a
 matrix with output analogous to gprof's flat profile or a
 matrix like the by.total component returned by summaryRprof;
 which is returned depends on the value of an optional second
 argument.

 printProfileCallGraph produces a printed representation of the
 call graph.  It is analogous to the call graph produced by
 gprof with a few minor changes.  Reading the gprof manual
 section on the call graph should help understanding this
 output.  The output is similar enough to gprof output for the
 cgprof (http://mvertes.free.fr/) script to be able to produce
 a call graph via Graphviz.

 profileCallGraph2Dot prints out a Graphviz .dot file representing
 the profile graph.  Times spent in calls can be mapped to node
 and edge colors.  The resulting files can then be viewed with
 the Graphviz command line tools.

 plotProfileCallGraph uses the graph and Rgraphviz packages to
 produce call graph visualizations within R.  You will need to
 install these packages to use this function.


A SIMPLE EXAMPLE

Collect profile information  for the examples for glm:

   Rprof("glm.out")
   example(glm)
   Rprof()
   pd <- readProfileData("glm.out")

Obtain flat profile information:

   flatProfile(pd)
   flatProfile(pd, FALSE)

Obtain a printed call graph on the standard output:

   printProfileCallGraph(pd)

If you have the cgprof script and the Graphviz command line tools
available on a UNIX-like system, then you can save the printed graph
to a file,

   printProfileCallGraph(pd, "glm.graph")

and either use

   cgprof -TX glm.graph

to display the graph in the interactive graph viewer dotty, or use

   cgprof -Tps glm.graph > glm.ps
   gv glm.ps

to create a PostScript version of the call graph and display it with
gv.

Instead of using the printed graph and cgprof, you can create a
Graphviz .dot file representation of the call graph with

   profileCallGraph2Dot(pd, filename = "glm.dot", score = "total")

and view the graph interactively with dotty using

   dotty glm.dot

or as a postscript file with

   dot -Tps glm.dot > glm.ps
   gv glm.ps

Finally, if you have the graph package from CRAN and the Rgraphviz
package from Bioconductor installed, then you can view the call graph
within R using

   plotProfileCallGraph(pd, score = "total")

The default settings for this version need some work.


OPEN ISSUES

My intention was to handle cycles roughly the same way that gprof
does.  I am not completely sure that I have managed to do this; I am
also not completely sure this is the best approach.

The graphs produced by cgprof and by plotProfileGraph and friends when
mergeEdges is false differ a bit.  I think this is due to the
heuristics of cgprof not handling cycle entries ideally and that the
plotProfileGraph graphs are actually closer to what is wanted.  When
mergeEdges is true the resulting graphs are DAGs, which simplifies
interpretation, but at the cost of lumping all cycle members together.

gprof provides options for pruning graph printouts by omitting
specified nodes.  It may be useful to allow this here as well.

Probably more use should be made of the graph package.


IMPLEMENTATION NOTES

The implementation is extremely crude (a real mess would be more
accurate) and will hopefully be improved over time--at this point it
is more of an existence proof than a final product.

Performance is less than ideal, though using these tools it was
possible to identify some problem points and speed up computing the
profile data by a factor of two (in other words, it may be bad now but
it used to be worse).  More careful design of the data structures and
memoizing calculations that are now repeated is likely to improve
performance substantially.




-- 
Luke Tierney
Chair, Statistics and 

Re: [R] FAQ 7.x when 7 does not exist. Useability question

2007-08-27 Thread John Kane

--- Duncan Murdoch [EMAIL PROTECTED] wrote:

 Deepayan Sarkar wrote:
  On 8/23/07, Duncan Murdoch [EMAIL PROTECTED]
 wrote:

  On 8/23/2007 11:28 AM, Prof Brian Ripley wrote:
  
  On Thu, 23 Aug 2007, John Kane wrote:
 

  The FAQ Section 7 is a very useful place for
 new users
  to find out any number of R idiosyncrasies. 
 However
  there is no numbering on the FAQ Table of
 Content or
  on the Sections Tables of Contents.
  
  Hmm, doc/FAQ does have a numbered table of
 contents and numbered sections
  and doc/manual/R-FAQ.html does have numbered
 sections and my browser's
  search finds 7.10 straight away.

  I think the suggestion is to change the contents
 lists in HTML from ul
  lists to ol lists.  Then one would see
 
  1. Introduction
  2. R Basics
  3. R and S
  4. R Web Interfaces
  5. R Add-On Packages
  6. R and Emacs
  7. R Miscellanea
  8. R Programming
  9. R Bugs
 10. Acknowledgments
 
  instead of
 
   * Introduction
   * R Basics
   * R and S
   * R Web Interfaces
   * R Add-On Packages
   * R and Emacs
   * R Miscellanea
   * R Programming
   * R Bugs
   * Acknowledgments
 
  in a browser, and I agree that would be
 preferable (assuming the
  numbering is consistent with what we get in the
 other formats).
  However, I don't see how to tell makeinfo --html
 to do this.  Adding
  --number-sections isn't enough.
  
 
  A simple CSS hack is to have
 
  ul{
  list-style-type: decimal;
  }
 
  in the style. The result can be seen in
 
  http://dsarkar.fhcrc.org/R/RFAQ-1.png
 
  A more sophisticated hack is to have something
 like
 
  ---
  body{
  counter-reset: chapter;
  counter-reset: section;
  }
  h2.chapter {
  counter-increment: chapter;
  counter-reset: section;
  }
 
  ul {
  list-style-type: none;
  }
 
  li:before {
  counter-increment: section;
  content: counter(chapter) "." counter(section) " ";
  }
  -
 
  which results in
 
  http://dsarkar.fhcrc.org/R/RFAQ-2.png
 
  The only problem here is that there is no way to
 distinguish between
  the chapter listing and the section listings (both
 are ul
  class=menu). If that could be made to have a
 different class, the
  chapter listing could be improved.
 
 I like the first, simple suggestion best; I'll put
 it into R-devel.  
 (With the slight change to use ul.menu instead
 of just ul, because FAQ 2.7 includes a plain ul
 list.)
 
 Duncan Murdoch
Thanks Deepayan and Duncan.  

It is not a make or break point in using R but it does
seem to make the FAQ a bit more user-friendly.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Column naming mystery

2007-08-27 Thread Werner Wernersen
Sorry that the problem description was not sufficient.
Here is a self-contained code replicating the problem:

require(doBy)
x -
as.data.frame(matrix(ncol=3,seq(1,12),dimnames=list(c(),c(hh,total,total.inf
summaryBy(total+total.inf~hh,x,FUN=sum)

What surprises me are the zeros in the resulting
total.sum column. The problem remains if total.inf is
renamed to totalinf or total_inf but not if renamed to
ttotal.inf .

Can anyone explain to me what the rules for naming
columns are so that I can avoid such mistakes in the
future?

Thanks a lot!


  

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to write nicely a condition on a loop for (that is, not like I did)

2007-08-27 Thread Ptit_Bleu

Hi again,

This is the follow-up to my post "Problem with save or/and if (I think but
maybe not ...)".
In that post, I wrote that I solved my main problem. And it is true.
I also wrote that there was still another problem, which I managed to solve.

But I think there must be another way to solve it, taking advantage of the R
language (which I don't master at all), that is, with fewer if tests.

To sum up:
nfichiers is a list of files (with .P0 or .Px (x > 0) extension) that I have to
copy to a database.
nfichiers can also be 0 if there is no file to copy.
p0fichiers is the list of files having the .P0 extension, if there are such
files to copy.
And p0fichiers can also be 0 if there are only .Px files to copy.

So, before doing the for loop, I want to test whether p0fichiers really
contains something.
Thanks for your comments and your advice to improve this script.
Ptit Bleu.

-

So here is my solution:

p0fichiers <- 0                            # initialization of p0fichiers
if (length(nfichiers) > 0)                 # if nfichiers contains file names
{
  if (length(grep(".P0", nfichiers)) > 0) {p0fichiers <- nfichiers[grep(".P0",
nfichiers)]}                               # look whether there is a .P0 file
  if (p0fichiers[1] > 0)                   # if p0fichiers has been updated by the test above
    {
    for (i in 1:length(p0fichiers))        # do the for loop
      {
      donnees <- read.table(p0fichiers[i], quote="\"", sep=";",
dec=",", skip=18)
      jourheure <- paste(donnees$V1, donnees$V2, sep=" ")
      donnees[1] <- jourheure
      donnees <- donnees[,-2]
      rm(donnees, jourheure)
      }
    }
}
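
One possible simplification (a sketch only, not tested on the real data; con
and the table name "Datatable" are the ones from the earlier script):
iterating directly over the matching file names removes the need for the 0
sentinel and for most of the if tests, because a for loop over an empty
vector simply does nothing.

## assumes nfichiers is a character vector of file names (possibly empty)
p0fichiers <- grep("\\.P0$", nfichiers, value = TRUE)
for (f in p0fichiers) {                    # loop body is skipped if no .P0 files
  donnees <- read.table(f, quote = "\"", sep = ";", dec = ",", skip = 18)
  donnees[1] <- paste(donnees$V1, donnees$V2, sep = " ")
  donnees <- donnees[, -2]
  dbWriteTable(con, "Datatable", donnees, append = TRUE)
}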
-- 
View this message in context: 
http://www.nabble.com/how-to-write-nicely-a-condition-on-a-loop-%22for%22-%28that-is%2C-not-like-I-did%29-tf4335310.html#a12347016
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How can I interpret this test hypothesis test

2007-08-27 Thread Tom La Bone
 
Since no reply has been posted yet I will give it a shot. runs.test uses the
normal approximation and in your case it returned a z score of -1.8732. This
z score has a cumulative probability of 
 
pnorm(-1.8732,0,1)
[1] 0.03052039
 
If you are concerned about both too many runs and too few runs, you would
select the "two.sided" option for runs.test, which gives a p-value of 0.0610
(0.0305 in each tail of the normal distribution). If you are concerned only
with too few runs, you would select the "less" option, which will give a
p-value of 0.0305. Finally, if you are concerned only with too many runs, you
would select the "greater" option, which will give a p-value of 1-0.0305 =
0.9695. If your significance level is 0.05, you would compare 0.05 to 0.0610
and not reject the null hypothesis in the two-sided case, and compare 0.05
to 0.0305 in the one-sided case and reject the null hypothesis. Note that
the normal approximation is OK for large samples but may give unacceptable
results for small samples. I am unaware of any packages in R that perform an
exact runs test.
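
For illustration (made-up data), the three alternatives of tseries::runs.test
applied to the same two-level factor:

library(tseries)
set.seed(42)
x <- rnorm(50)
f <- factor(x > median(x))                 # runs.test needs a two-level factor
runs.test(f, alternative = "two.sided")
runs.test(f, alternative = "less")         # concerned only with too few runs
runs.test(f, alternative = "greater")      # concerned only with too many runs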
 
Tom

 

I have used runs.test (package tseries), which computes the runs test
for randomness, but I get this result:

Runs Test
Standard Normal = -1.8732, p-value = 0.0610
alternative hypothesis: two.sided

How can I interpret this result?
 
 

 


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] FAQ 7.x when 7 does not exist. Useability question

2007-08-27 Thread Duncan Murdoch
On 8/27/2007 8:52 AM, John Kane wrote:
 --- Duncan Murdoch [EMAIL PROTECTED] wrote:

 I like the first, simple suggestion best; I'll put
 it into R-devel.  
 (With the slight change to use ul.menu instead
 of just ul, because FAQ 2.7 includes a plain ul
 list.)
 
 Duncan Murdoch
 Thanks Deepayan and Duncan.  
 
 It is not a make or break point in using R but it does
 seem to make the FAQ a bit more user-friendly.

I'm about to commit the change, but it's not perfect.  I've applied the 
change to the css used in all the manuals, not just the FAQ, so the HTML 
versions of the manuals now end up with numbered contents listings too. 
  However, appendices continue the chapter numbering, rather than 
switching to letters.  I think this is preferable to no numbering at 
all, but if others object to it, we can make this change for the FAQ only.

Another way to do this is what's used in the texinfo manual
http://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html
but I find that ugly and inconsistent.  The contents listing gets the 
numbering and lettering right (but not well formatted), but within each 
chapter the menus are unnumbered.

The texinfo format is just a bit limited for this kind of thing.

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset using noncontiguous variables by name (not index)

2007-08-27 Thread Muenchen, Robert A (Bob)
Gabor, That works great!

I think this would be a very helpful addition to the main R
distribution. Perhaps with a single colon representing numerical order
(exactly as you have written it) and two colons representing the order
of the variables as they appear in the data frame (your first example).
That's analogous to SAS' x1-xN, which you know gets those N variables,
and a--z, which selects an unknown number of variables a through z. How
many that is depends upon their order in the data frame. That would not
only be very useful in general, but it would also make transitioning to
R from SAS or SPSS less confusing.

Is R still being extended in such basic ways, or does that muck up
existing programs too much?

Thanks,
Bob

 -Original Message-
 From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
 Sent: Sunday, August 26, 2007 8:52 PM
 To: Muenchen, Robert A (Bob)
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] subset using noncontiguous variables by name (not
 index)
 
 Try this:
 
 > "%:%" <- function(x, y) {
 +    prex <- gsub("[0-9]", "", x); postx <- gsub("[^0-9]", "", x)
 +    prey <- gsub("[0-9]", "", y); posty <- gsub("[^0-9]", "", y)
 +    stopifnot(prex == prey)
 +    paste(prex, seq(from = as.numeric(postx), to = as.numeric(posty)), sep = "")
 + }
 > "x2" %:% "x4"
 [1] "x2" "x3" "x4"
 
 
 On 8/26/07, Muenchen, Robert A (Bob) [EMAIL PROTECTED] wrote:
  Thanks Bert  Gabor for two very interesting solutions!
 
  It would be very handy in R if string1:stringN generated
  string1,string2...stringN it would make selections like this
 much
  more obvious. I know it's easy to with the colon operator and paste
  function but that's quite a step up in complexity compared to SAS'
x1
  x3-x4 y2 or SPSS' x1,x3 to x4, y2. And it's complexity that
beginners
  face early in learning R.
 
  While on the subject of the colon operator, why doesn't
 anscombe[[1:4]]
  select the x variables in list form as anscombe[,1:4] or
 anscombe[1:4]
  do in data frame form?
 
  Thanks,
 
  Bob
 
  =
  Bob Muenchen (pronounced Min'-chen), Manager
  Statistical Consulting Center
  U of TN Office of Information Technology
  200 Stokely Management Center, Knoxville, TN 37996-0520
  Voice: (865) 974-5230
  FAX: (865) 974-4810
  Email: [EMAIL PROTECTED]
  Web: http://oit.utk.edu/scc,
  News: http://listserv.utk.edu/archives/statnews.html
  =
 
 
   -Original Message-
   From: Bert Gunter [mailto:[EMAIL PROTECTED]
   Sent: Sunday, August 26, 2007 6:50 PM
   To: 'Gabor Grothendieck'; Muenchen, Robert A (Bob)
   Cc: r-help@stat.math.ethz.ch
   Subject: RE: [R] subset using noncontiguous variables by name (not
   index)
  
   The problem is that x3:x5 does not mean what you think it means.
 The
   only
   reason it does the right thing in subset() is because a clever
 trick
  is
   used
   there (read the code -- it's not hard to understand) to ensure
that
 it
   does.
   Gabor has essentially mimicked that trick in his solution.
  
   However, it is not necessary do this. You can construct the call
   directly as
   you tried to do. Using the anscombe example, here's how:
  
   chooz <- "c(x1,x3:x4,y2)"  ## enclose the desired expression in quotes
   do.call(subset, list(x = anscombe, select = parse(text = chooz)))
  
   -- Bert Gunter
   Genentech Non-Clinical Statistics
   South San Francisco, CA
  
   The business of the statistician is to catalyze the scientific
   learning
   process.  - George E. P. Box
  
  
  
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Gabor
Grothendieck
Sent: Sunday, August 26, 2007 2:10 PM
To: Muenchen, Robert A (Bob)
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] subset using noncontiguous variables by name
(not index)
   
Using builtin data frame anscombe try this. First we set up a
data frame
anscombe.seq which has one row containing 1, 2, 3, ... .  Then
  select
out from that data frame and unlist it to get the desired
index vector.
   
 anscombe.seq <- replace(anscombe[1,], TRUE, seq_along(anscombe))
 idx <- unlist(subset(anscombe.seq, select = c(x1, x3:x4, y2)))
 anscombe[idx]
   x1 x3 x4   y2
1  10 10  8 9.14
2   8  8  8 8.14
3  13 13  8 8.74
4   9  9  8 8.77
5  11 11  8 9.26
6  14 14  8 8.10
7   6  6  8 6.13
8   4  4 19 3.10
9  12 12  8 9.13
10  7  7  8 7.26
11  5  5  8 4.74
   
   
On 8/26/07, Muenchen, Robert A (Bob) [EMAIL PROTECTED] wrote:
 Hi All,

 I'm using the subset function to select a list of variables,
 some
   of
 which are contiguous in the data frame, and others of which
are not. It
 works fine when I use the form:

 subset(mydata,select=c(x1,x3:x5,x7) )

 In reality, my list is far more complex. So I would like to
store it in
 a variable to substitute in for c(x1,x3:x5,x7) but cannot get
 

Re: [R] R 2.5.1 - Rscript through tee

2007-08-27 Thread Dirk Eddelbuettel

On 26 August 2007 at 22:47, François Pinard wrote:
| I met a little problem for which someone might have a solution.  Let's 
| say I have an executable file (named pp.R) with this contents:
| 
|#!/usr/bin/Rscript
|options(echo=TRUE)
|a <- 1
|Sys.sleep(3)
|a <- 2
| 
| If I execute ./pp.R at the shell prompt, the output shows the timely 
| progress of the script as expected.  If I use ./pp.R | tee OUT 
| instead, the output seems buffered and I see it all at once at the end.
| 
| The problem does not come from the tee program, as if I use this 
| command:
| 
|(echo a; sleep 5; echo b) | tee OUT
| 
| the output is timely, not batched.
| 
| So, is there a way to tell R (or Rscript) that standard output should be 
| unbuffered, even if it is not directly connected to a terminal?

Use explicit print statements, e.g.  print(a <- 1)

Also, you still have littler as an alternative, at least on Unix [1].  Littler
actually won't show anything unless you explicitly call cat() or print(), but
then it does:

qa-v40z1:~/svn/hancock/app/aggposview> cat /tmp/fp2.r
#!/usr/bin/env r

options(echo=TRUE)
cat(a <- 1, "\n")
Sys.sleep(3)
cat(a <- 2, "\n")
foo:~> /tmp/fp2.r | tee /tmp/fp2.r.out
1
2
foo:~> 

Littler is an 'all-in' binary and starts and runs demonstrably faster than
Rscript. 

Hth, Dirk
 
[1] And despite the rather petty refusal of Rscript's main author to a least
give a reference to littler in Rscript's documentation, let alone credit as
'we were there first', the fact remains that littler became available in Sep
2006 whereas Rscript was not released until R 2.5.0 a good six months
later. Oh well. 



| In case useful, here is local R information:
| 
| Version:
|  platform = x86_64-unknown-linux-gnu
|  arch = x86_64
|  os = linux-gnu
|  system = x86_64, linux-gnu
|  status = 
|  major = 2
|  minor = 5.1
|  year = 2007
|  month = 06
|  day = 27
|  svn rev = 42083
|  language = R
|  version.string = R version 2.5.1 (2007-06-27)
| 
| Locale:
| 
LC_CTYPE=fr_CA.UTF-8;LC_NUMERIC=C;LC_TIME=fr_CA.UTF-8;LC_COLLATE=fr_CA.UTF-8;LC_MONETARY=fr_CA.UTF-8;LC_MESSAGES=fr_CA.UTF-8;LC_PAPER=fr_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=fr_CA.UTF-8;LC_IDENTIFICATION=C
| 
| Search Path:
|  .GlobalEnv, package:stats, package:utils, package:datasets, fp.etc, 
package:graphics, package:grDevices, package:methods, Autoloads, package:base
| 
| -- 
| François Pinard   http://pinard.progiciels-bpi.ca
| 
| __
| R-help@stat.math.ethz.ch mailing list
| https://stat.ethz.ch/mailman/listinfo/r-help
| PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
| and provide commented, minimal, self-contained, reproducible code.

-- 
Three out of two people have difficulties with fractions.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] validate (package Design): error message subscript out of bounds

2007-08-27 Thread Wentzel-Larsen, Tore
Dear R users 

I use Windows XP, R2.5.1 (I have read the posting guide, I have 
contacted the package maintainer first, it is not homework).

In a research project on renal cell carcinoma we want to compute 
Harrell's c index, with optimism correction, for a multivariate 
Cox regression and also for some univariate Cox models.
For some of these univariate models I have encountered an error
message (and no result produced) from the function validate in
Frank Harrell's Design package:

Error in Xb(x[, xcol, drop = FALSE], coef, non.slopes, non.slopes.in.x,  : 
subscript out of bounds

The following is an artificial example wherein I have been able to 
reproduce this error message (actual data has been changed to preserve
confidentiality):

library(Design)

# an example data frame:
frame.bc <- data.frame(time1 = c(9,24,28,43,58,62,66,107,116,118,123,
127,129,131,137,138,139,140,148,169,176,179,188,196,210,218,
1,1,1,2,2,3,4,8,23,32,33,34,43,44,48,51,52,54,59,59,60,60,62,
65,65,68,70,72,73,74,81,84,88,98,99,106,107,115,115,117,119,
120,122,122,122,122,126,128,130,135,136,136,138,149,151,154,
157,159,161,164,164,164,166,172,172,176,179,180,183,183,184,
187,190,197,201,201,203,203,203,209,210,214,219,227,233,4,18,
49,113,147,1,1,2,2,2,2,2,3,4,6,6,6,6,6,6,6,6,9,9,9,9,9,10,10,
10,11,12,12,12,13,14,14,17,18,18,19,19,20,20,21,21,21,21,22,23,
23,24,28,28,29,29,32,34,35,38,38,48,48,52,52,54,54,56,64,67,67,
69,70,70,72,84,88,90,114,115,140,142,154,171,195),
status1 = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1),
bc1 = factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
labels=c('bc.1','bc.2')),
age = c(58,68,23,20,50,43,41,69,20,48,19,27,39,20,65,49,70,59,31,43,25,
61,60,45,34,59,32,58,30,62,26,44,52,29,40,57,33,18,50,50,55,51,38,34,
69,56,67,38,66,21,48,39,62,62,29,68,66,19,60,39,55,42,24,29,56,61,40,
52,19,40,33,67,66,51,48,63,60,58,68,60,53,20,45,62,37,38,61,63,43,67,
49,39,43,67,49,69,32,37,32,63,33,47,66,39,23,57,26,61,20,49,69,30,40,
29,38,66,60,69,69,44,65,25,41,53,18,55,45,59,49,27,51,29,67,26,24,26,
47,23,50,27,35,45,32,26,45,45,63,39,39,22,38,27,31,27,49,65,66,49,39,
21,51,49,55,63,19,26,50,21,24,34,65,33,55,33,36,53,48,25,54,58,60,34,
47,23,34,60,39,34,22,30,41,55,64,48,34,54))
frame.bc

# preparing for a simple univariate Cox regression:
dd.bc <- datadist(frame.bc[, c('bc1','age')], adjto.cat='first')
options(datadist = 'dd.bc')

# a univariate Cox regression:
cph.bc <- cph(formula = Surv(time1,status1)~bc1,
data = frame.bc, x=TRUE, y=TRUE, surv=TRUE)
anova(cph.bc)
cph.bc
summary(cph.bc)

# the validate command for the Cox model:
val.cph.bc <- validate(cph.bc, B=200, dxy=TRUE, pr=TRUE)

--
Output from the validate command:

   training   test
Dxy   -0.124360 -0.1423409
R2 1.00  1.000
Slope  1.00  0.7919584
D  0.016791  0.0147536
U -0.002395  0.0006448
Q  0.019186  0.0141088
   training   test
Dxy   -0.191875 -0.1423409
R2 1.00  1.000
Slope  1.00  0.8936724
D  0.022397  0.0147536
U -0.002339  0.0001367
Q  0.024736  0.0146169
   training   test
Dxy   -0.199514 -0.1423409
R2 1.00  1.000
Slope  1.00  0.8075246
D  0.025717  0.0147536
U -0.002447  0.0005348
Q  0.028163  0.0142188
Error in Xb(x[, xcol, drop = FALSE], coef, non.slopes, non.slopes.in.x,  : 
subscript out of bounds


Any help/suggestions will be highly appreciated.


Sincerely,
Tore Wentzel-Larsen
statistician
Centre for Clinical research
Armauer Hansen house 
Haukeland University Hospital
N-5021 Bergen
tlf   +47 55 97 55 39 (a)
faks  +47 55 97 60 88 (a)
email [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset using noncontiguous variables by name (not index)

2007-08-27 Thread Thomas Lumley
On Mon, 27 Aug 2007, Muenchen, Robert A (Bob) wrote:

 Gabor, That works great!

 I think this would be a very helpful addition to the main R
 distribution. Perhaps with a single colon representing numerical order
 (exactly as you have written it) and two colons representing the order
 of the variables as they appear in the data frame (your first example).
 That's analogous to SAS' x1-xN, which you know gets those N variables,
 and a--z, which selects an unknown number of variables a through z. How
 many that is depends upon their order in the data frame. That would not
 only be very useful in general, but it would also make transitioning to
 R from SAS or SPSS less confusing.

 Is R still being extended in such basic ways, or does that muck up
 existing programs too much?


In principle base R can be extended like that, but a strong case is needed 
for non-standard evaluation rules and for depleting the restricted supply 
of short binary operator names.

The reason for subset() and its behaviour is that 'variables as they 
appear in the data frame' is typically ambiguous -- which data frame?  In 
SPSS you have only one and in SAS there is a default one, so there is no 
ambiguity in X1--Y2, but in R it needs another argument specifying the 
data frame, so it can't really be a binary operator.

The double colon :: and triple colon ::: are already used for namespaces, 
and a search of r-help reveals two previous, different, suggestions for 
%:%.


-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R-2.5.1 RedHat EL5 compilation failed

2007-08-27 Thread Stefan Grosse
 Original Message  
Subject: [R] R-2.5.1 RedHat EL5 compilation failed
From: Wang Chengbin [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Date: 26.08.2007 15:22
 I can't get R-2.5.1 compiled under RedHat EL5 with gcc 4.1.1. Configure
 failed at the following:
   
You don't need to compile, you could also use the Fedora Core 6 Extras
repository package(s) of R (current: is R-2.5.1-2.fc6.i386.rpm) to
install the necessary rpm packages from there. (Best is to use the smart
package manager, there you can easily activate channels which are
repositories.) As far as I understood FC6 is the base of RHEL 5.

Stefan
-=-=-
... Time is an illusion, lunchtime doubly so. (Ford Prefect)

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] p-Value

2007-08-27 Thread Daniel Lakeland
On Mon, Aug 27, 2007 at 11:49:19AM +0500, amna khan wrote:
 Hi Sir
 
 When we use Kendall Package to obtain Kendall's Tau statistic.
 Then we also get two-sided p value. What does two-sided p-value mean?
 The word two-sided is confusing to understand.

Two-sided is sometimes also called two-tailed... It refers to the
probability of being farther away from 0 than the observed value, *in
either direction*.
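
A minimal sketch of the same idea, here with cor.test() from the stats
package rather than the Kendall package (illustrative data only):

x <- c(1, 3, 2, 5, 4)
y <- c(2, 1, 4, 3, 5)
# two-sided: how likely is a tau at least this far from 0, in either direction
cor.test(x, y, method = "kendall", alternative = "two.sided")
# a one-sided alternative looks at only one direction (positive association)
cor.test(x, y, method = "kendall", alternative = "greater")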


-- 
Daniel Lakeland
[EMAIL PROTECTED]
http://www.street-artists.org/~dlakelan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Robust Standar Errors in Zero-Truncated Poisson

2007-08-27 Thread Pedro Mota Veiga

Hi.

I would like to know if it is possible to estimate zero-truncated count
models with robust standard errors in R. In Stata that is possible. I have
already made some searches and attempts but have not managed it. In R I made
the estimation of the truncated Poisson with the vglm command of the VGAM package.
-- 
View this message in context: 
http://www.nabble.com/Robust-Standar-Errors-in-Zero-Truncated-Poisson-tf4336437.html#a12351638
Sent from the R help mailing list archive at Nabble.com.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Sequential Rank Test

2007-08-27 Thread Bernardo Rangel Tura
Hi R-Masters


I need to use a sequential approach in a series of cases, but my data are not
normal.

If the data were normally distributed it would be easy to create the analysis
using a likelihood-based test such as the Wald test.

But in my case I need to use a non-parametric test (Mann-Whitney).

I used RSiteSearch("sequential rank test") but it did not solve my
problem.

Do you know a routine or package that implements a sequential rank test in R?

Thanks in advance


-- 
Bernardo Rangel Tura, M.D,Ph.D
National Institute of Cardiology
Brazil

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sequential Rank Test

2007-08-27 Thread Henrique Dallazuanna
Hi Bernardo,

I think that ?wilcox.test will help you.
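
A minimal two-sample sketch with made-up data (this is the Mann-Whitney test
itself; it does not by itself give a sequential stopping rule):

set.seed(1)
x <- rexp(20)            # first group
y <- rexp(25, rate = 2)  # second group
wilcox.test(x, y)        # Wilcoxon rank-sum / Mann-Whitney test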


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O

On 27/08/07, Bernardo Rangel Tura [EMAIL PROTECTED] wrote:

 Hi R-Masters


 I need use a sequential approach in serie of cases, but may data  is not
 normal.

 If data is normal distribution is very easy create analysis using
 likelihood ratio like of Wald test.

 But in my case I need use a non-parametric test (Mann-Whitney).

 I was use:  RSiteSearch(sequential rank test) but not solve my
 problem.

 Do you know routine or package implement sequential rank test in R?

 Thanks in advance


 --
 Bernardo Rangel Tura, M.D,Ph.D
 National Institute of Cardiology
 Brazil

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sequential Rank Test

2007-08-27 Thread Birgit Lemcke
I looked for the same topic today and found ?wilcox.test in the stats  
package.

B

Am 27.08.2007 um 17:33 schrieb Bernardo Rangel Tura:

 Hi R-Masters


 I need use a sequential approach in serie of cases, but may data   
 is not
 normal.

 If data is normal distribution is very easy create analysis using
 likelihood ratio like of Wald test.

 But in my case I need use a non-parametric test (Mann-Whitney).

 I was use:  RSiteSearch(sequential rank test) but not solve my
 problem.

 Do you know routine or package implement sequential rank test in R?

 Thanks in advance


 -- 
 Bernardo Rangel Tura, M.D,Ph.D
 National Institute of Cardiology
 Brazil

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting- 
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

Birgit Lemcke
Institut für Systematische Botanik
Zollikerstrasse 107
CH-8008 Zürich
Switzerland
Ph: +41 (0)44 634 8351
[EMAIL PROTECTED]






[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] FW: subset using noncontiguous variables by name (not index)

2007-08-27 Thread Muenchen, Robert A (Bob)
Thomas, that's a good point. I was thinking of anscombe[x1::y1] making
it clear which one, but you would then want just x1::y1 to have
unambiguous meaning on its own, which is impossible.

As for x1:xN, it's unambiguous on its own. I thought one of the great
advantages of R was that it could use different methods so that a new
operator would not be needed. The colon operator would just have a new
method for when stringN appeared. One that would be very useful & have
obvious meaning. 

Thanks,
Bob

 -Original Message-
 From: Thomas Lumley [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 27, 2007 10:25 AM
 To: Muenchen, Robert A (Bob)
 Cc: r-help@stat.math.ethz.ch
 Subject: Re: [R] subset using noncontiguous variables by name (not
 index)
 
 On Mon, 27 Aug 2007, Muenchen, Robert A (Bob) wrote:
 
  Gabor, That works great!
 
  I think this would be a very helpful addition to the main R
  distribution. Perhaps with a single colon representing numerical
 order
  (exactly as you have written it) and two colons representing the
 order
  of the variables as they appear in the data frame (your first
 example).
  That's analogous to SAS' x1-xN, which you know gets those N
 variables,
  and a--z, which selects an unknown number of variables a through z.
 How
  many that is depends upon their order in the data frame. That would
 not
  only be very useful in general, but it would also make transitioning
 to
  R from SAS or SPSS less confusing.
 
  Is R still being extended in such basic ways, or does that muck up
  existing programs too much?
 
 
 In principle base R can be extended like that, but a strong case is
 needed
 for non-standard evaluation rules and for depleting the restricted
 supply
 of short binary operator names.
 
 The reason for subset() and its behaviour is that 'variables as they
 appear the in data frame' is typically ambiguous -- which data frame?
 In
 SPSS you have only one and in SAS there is a default one, so there is
 no
 ambiguity in X1--Y2, but in R it needs another argument specifying the
 data frame, so it can't really be a binary operator.
 
 The double colon :: and triple colon ::: are already used for
 namespaces,
 and a search of r-help reveals two previous, different, suggestions
for
 %:%.
 
 
   -thomas
 
 Thomas Lumley Assoc. Professor, Biostatistics
 [EMAIL PROTECTED] University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Formatting Sweave in R-News

2007-08-27 Thread Arjun Narayan

 Thank you Paul for your response. Unfortunately that did not work. A
 figure environment frames it neatly, but still contained in only one column.
 I have tried various methods, but they all seem to not work, or if the
 solutions involve manually setting the size, the grey column separator still
 runs through the middle of the page.

 I know a solution exists, because on page 21, Vol 1/1 of R-News, there is
 an image that spans both columns. Do you know where I could get the Rnw
 source files for R-news articles? That would at least allow me to trawl for
 a solution.

 Best regards,
 Arjun

 On 8/22/07, Paul Murrell [EMAIL PROTECTED] wrote:
 
  Hi
 
 
  Arjun Ravi Narayan wrote:
   Hi,
  
   I am editing a document for submission to the R-news newsletter, and
   in my article my Sweave code inserts a dynamically generated PDF
   report that my R program generates.
  
   However, when I insert the PDF using the following Sweave code:
  
   \newpage
   \includegraphics[scale=1.0]{\Sexpr{print(location)}}
   \newpage
  
   (in tex this looks like):
   \newpage
   \includegraphics[scale=1.0]{/home/arjun/sample.pdf}
   \newpage
 
 
  Try putting your image in a figure* environment (should go full width of
  the page).
 
  Paul
 
 
  
   However, the r-news style package over-rides everything that I can set
   (including using the minipage option) to make my included PDF small
   sized. Part of the problem is that the R-news style specifies a
   two-column formatting, and so the PDF is shrunk to fit in one column.
   How can I, for just one page, over-ride the styles to include the PDF?
   Even if I hard-hack the graphics to be scaled up in size, that does
   not get rid of the vertical line that in between the two columns, and
   thus breaking my image.
  
   I realise that this is not an R problem, but more a latex problem, but
   I am hoping that somebody has faced similar problems with the Rnews
   styles and has an idea on how to do this.
  
  
   Thank you,
  
   Yours sincerely,
 
 
  --
  Dr Paul Murrell
  Department of Statistics
  The University of Auckland
  Private Bag 92019
  Auckland
  New Zealand
  64 9 3737599 x85392
  [EMAIL PROTECTED]
  http://www.stat.auckland.ac.nz/~paul/
 



[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Formatting Sweave in R-News

2007-08-27 Thread Arjun Narayan
Dear Paul,

I stand corrected. Your solution was the right way. The following code now
works:
(Apparently I still need to specify the width command as my pdf is
incorrectly sized by default)

\begin{figure*}[b]
\begin{center}
\includegraphics[width=8in]{generatedPDF.pdf}
\end{center}
\end{figure*}

There is a full explanation in the template.tex file which can be found in
the RNews tutorial here: http://cran.r-project.org/doc/Rnews/template.tex

Thank you for your time.

Best regards,
Arjun



 Try putting your image in a figure* environment (should go full width of
 the page).

 Paul

 Dr Paul Murrell
 Department of Statistics
 The University of Auckland
 Private Bag 92019
 Auckland
 New Zealand
 64 9 3737599 x85392
 [EMAIL PROTECTED]
 http://www.stat.auckland.ac.nz/~paul/


[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Max vs summary inconsistency

2007-08-27 Thread Adam D. I. Kramer
Hello,

I'm having the following questionable behavior:

 summary(m)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1   13000   26280   25890   38550   50910 
 max(m)
[1] 50912

 typeof(m)
[1] integer
 class(m)
[1] integer

...it seems to me like max() and summary(m)[6] ought to return the same
number. Am I doing something wrong?

I'm running R 2.5.1 (2007-06-27), installed on MacOSX from the dmg file
found on CRAN.

--
Adam D. I. Kramer
Ph.D. Student, University of Oregon
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] subset question

2007-08-27 Thread Kirsten Beyer
I would like to code records in a dataset with a 1 if any of the
columns 9-67 contain a particular code, and zero if they don't.  I've
been working with subset and it seems that something like
subset(data, data[9:67]--12345) would work, but I have been
unsuccessful so far.  It seems like a simple problem - any help is
appreciated!

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Max vs summary inconsistency

2007-08-27 Thread Thomas Lumley
On Mon, 27 Aug 2007, Adam D. I. Kramer wrote:

 Hello,

 I'm having the following questionable behavior:

 summary(m)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1   13000   26280   25890   38550   50910
 max(m)
 [1] 50912

 typeof(m)
 [1] integer
 class(m)
 [1] integer

 ...it seems to me like max() and summary(m)[6] ought to return the same
 number. Am I doing something wrong?


They do return the same number, they just print it differently. summary() 
prints four significant digits by default.
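
A small sketch showing the effect, using the digits argument of summary()
(illustrative data only):

m <- c(1L, 13000L, 26280L, 25890L, 38550L, 50912L)
summary(m)              # the maximum is printed to 4 significant digits
summary(m, digits = 7)  # shows 50912 in full
max(m)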

  -thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] subset using noncontiguous variables by name (not index)

2007-08-27 Thread Muenchen, Robert A (Bob)
Thanks for helping me see why R doesn't have the obvious! -Bob

 -Original Message-
 From: Thomas Lumley [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 27, 2007 2:12 PM
 To: Muenchen, Robert A (Bob)
 Subject: RE: [R] subset using noncontiguous variables by name (not
 index)
 
 On Mon, 27 Aug 2007, Muenchen, Robert A (Bob) wrote:
 
  Thomas, that's a good point. I was thinking of anscombe[x1::y1]
 making
  it clear which one, but you would then want just x1::y1 to have
  unambiguous meaning on its own, which is impossible.
 
  As for x1:xN, it's unambiguous on its own.
 
 
 It actually isn't. We already have a meaning. Consider
    x1 <- 4
    xN <- 6
    x1:xN
 It also breaks R's argument passing rules by treating x1 as string
 rather than a name.
 
 What would be unambiguous at the moment is "x1":"x4", provided there
 was a sufficiently precise set of rules on what was allowed. Consider
   "x1":"x-1"        (negative?)
   "x1":"x3.14"      (non-integer?)
   "x3.12":"x3.14"   (is the prefix x or x3.?)
   "x1":"X4"         (the prefix changes)
   "01":"14"         (is the prefix empty or 0?)
   "x09":"xA2"       (is this illegal decimal or legal hexadecimal?)
   "IL23R1":"IL23R4" (what is the prefix?)
   "x1a":"x4a"       (infix numbering?)
 
 
 
   -thomas
 
 Thomas Lumley Assoc. Professor, Biostatistics
 [EMAIL PROTECTED] University of Washington, Seattle


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Max vs summary inconsistency

2007-08-27 Thread François Pinard
[Adam D. I. Kramer]

I'm having the following questionable behavior:

 summary(m)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1   13000   26280   25890   38550   50910 
 max(m)
[1] 50912

...it seems to me like max() and summary(m)[6] ought to return the same
number.  Am I doing something wrong?

Some may say that you did not scrutinize the documentation enough, as 
summary artificially limits the number of significant digits.

However, this question recurs often and regularly in these mailing 
lists, so at last, maybe something should be done about it, beyond 
documenting how it works.  Overall, too many users have been misled; one 
cannot so bluntly assert that they are all wrong.

For example, resorting to scientific notation whenever non significant 
zero digits would have otherwise been printed.  This should clarify 
a bit that the printing precision got artificially limited.

-- 
François Pinard   http://pinard.progiciels-bpi.ca

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Max vs summary inconsistency

2007-08-27 Thread Adam D. I. Kramer


On Mon, 27 Aug 2007, François Pinard wrote:


summary(m)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  1   13000   26280   25890   38550   50910 

max(m)

[1] 50912



...it seems to me like max() and summary(m)[6] ought to return the same
number.  Am I doing something wrong?


Some may say that you did not scrutinize the documentation enough, as
summary artificially limits the number of significant digits.


Indeed, several have said so in private email as well as email to the list.
Thanks to all, apologies for my lack of scrutiny.


However, this question reoccurs often and regularly in these mailing
lists, so at last, maybe something should be done about it, beyond
documenting how it works.  Overall, too many users got mislead, that one
may not so bluntly assert they are all wrong.


I would agree, and not only because I was misled: Several people are
scrutinizing the RESPONSE of summary()'s output, and noticing it is
incorrect.

However, it is very VERY likely that many more are NOT scrutinizing it, and
as such are forming false beliefs about their data sets, which may be
subsequently published or used in further analyses.

Taking a small step in the implementation of summary() to potentially
prevent the publication of incorrect data seems worthwhile. Certainly, any
researcher should check their output in many ways, but it makes no sense to
me that summary() would round its output to 4 significant digits by default.


For example, resorting to scientific notation whenever non significant
zero digits would have otherwise been printed.  This should clarify a bit
that the printing precision got artificially limited.


I think this is a great solution, though I'm not sure whether scripts that
use summary() would break if passed a number in scientific notation.

That said, scripts that use summary() are probably assuming that the number
reported is maximally precise, and thus are making the same mistake I
did...and thus should indeed break!

--
Adam Kramer
Ph.D. Student, Social Psychology
University of Oregon
[EMAIL PROTECTED]
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to include bar values in a barplot?

2007-08-27 Thread Frank E Harrell Jr
Donatas G. wrote:
 On Tuesday 07 August 2007 22:09:52 Donatas G. wrote:
 How do I include bar values in a barplot (or other R graphics, where this
 could be applicable)?

 To make sure I am clear I am attaching a barplot created with
 OpenOffice.org which has barplot values written on top of each barplot.
 
 Here is the barplot mentioned above:
 http://dg.lapas.info/wp-content/barplot-with-values.jpg
 
 it appears that this list does not allow attachments...
 
That is a TERRIBLE graphic.  Can't we finally leave this subject alone?

Frank Harrell

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] use apply function with which

2007-08-27 Thread schuurmans
Dear R-users,

For a data frame (say in this example X) I want to look up the
corresponding value in a 'look-up data frame' (in this example Y). The
for-loop works but is very time-consuming because 'X' in reality is very
big.
Therefore I would like to have a solution with apply. However, I do not
succeed. Any suggestions?

Thanks in advance,

Hanneke

c1=c('a','a','b')
c2=c('j','k','k')

V1=c('a','a','a','a','b','b','b','b'))
V2=c('i','j','k','l','i','j','k','l')
V3=c(4,3,2,1,8,5,2,-1)


X=NULL
X$c1=c1
X$c2=c2
X=as.data.frame(X)
Y=NULL
Y$V1=V1
Y$V2=V2
Y$V3=V3
Y=as.data.frame(Y)

result=NULL
for (i in 1:dim(X)[1])
{
result=rbind(result, Y$V3[which(Y$V1==as.character(X[i,]$c1) &
Y$V2==as.character(X[i,]$c2))])
}

###
which.search=function(X,Y,c1,c2,V1,V2,V3)
Y$V3[which(Y$V1==as.character(X$c1) & Y$V2==as.character(X$c2))]

apply(X,1,which.search,X=X,Y=Y,c1='c1',c2='c2',V1='V1',V2='V2',V3='V3')

###
 sessionInfo()
R version 2.5.1 (2007-06-27)
i386-pc-mingw32

locale:
LC_COLLATE=Dutch_Netherlands.1252;LC_CTYPE=Dutch_Netherlands.1252;LC_MONETARY=Dutch_Netherlands.1252;LC_NUMERIC=C;LC_TIME=Dutch_Netherlands.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods 
 base

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] grouping scat1d/rug and plotting to 2 axes

2007-08-27 Thread Mike
Hi,

I'm wondering if anybody can offer a bit of  guidance on how to add a 
couple of features to a plot. 

I'm using Frank Harrell's Design library to model some survival data in 
R (2.3.1, windows platform).  I'm fairly comfortable with the survival 
modeling in Design, but am still at a frustratingly low level of 
competence when it comes to creating anything beyond simple plots in R.

A simplified version of the model is:

fit <- cph(Surv(survtime,deceased) ~ rcs(smw,4), 
data=survdata,x=T,y=T,surv=T )

And the basic plot is:

plot(fit,smw=NA, fun=function(x) 1/(1+exp(-x)))

I know that if I add

scat1d(smw)

I get a nice jittered rug plot of all values of the predictor smw on the 
top axis.

What I'd like to do, however, is to plot on bottom axis the values of 
smw for only those participants who are alive, and then on the top axis, 
plot the values of smw for those who are deceased.  I'd appreciate any 
tips as to how I might approach this.

Thanks,

Mike Babyak

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] use apply function with which

2007-08-27 Thread Charles C. Berry
On Mon, 27 Aug 2007, [EMAIL PROTECTED] wrote:

 Dear R-users,

 For a data frame (say in this example X) I want to look up the
 corresponding value in a 'look-up data frame' (in this example Y). The
 for-loop works but is very time-consuming because 'X' in reality is very
 big.
 Therefore I would like to have a solution with apply. However, I do not
 succeed. Any suggestions?

 Thanks in advance,

 Hanneke

 c1=c('a','a','b')
 c2=c('j','k','k')

 V1=c('a','a','a','a','b','b','b','b'))

You have a syntax error in the previous line - '))'


 V2=c('i','j','k','l','i','j','k','l')
 V3=c(4,3,2,1,8,5,2,-1)


 X=NULL
 X$c1=c1
 X$c2=c2
 X=as.data.frame(X)
 Y=NULL
 Y$V1=V1
 Y$V2=V2
 Y$V3=V3
 Y=as.data.frame(Y)

 result=NULL
 for (i in 1:dim(X)[1])
 {
 result=rbind(result, Y$V3[which(Y$V1==as.character(X[i,]$c1) 
 Y$V2==as.character(X[i,]$c2))])
 }

 ###
 which.search=function(X,Y,c1,c2,V1,V2,V3)
 Y$V3[which(Y$V1==as.character(X$c1)  Y$V2==as.character(X$c2))]

 apply(X,1,which.search,X=X,Y=Y,c1='c1',c2='c2',V1='V1',V2='V2',V3='V3')

^^^^...

You use X twice in this expression. If you delete 'X=X,' and revise 
which.search to

  which.search <- function( X, Y, c1, c2, V1, V2, V3 )
 Y$V3[ which( Y$V1==as.character( X[c1] ) &
  Y$V2 == as.character( X[ c2 ] ) ) ]

to get rid of the $ operator which is deprecated for atomic vectors,

(and fix the above syntax error) then this expression agrees with 'result'

If you know that the matches are unique (only one row in Y will match any 
row of X), then

match( paste( X$c1, X$c2 ) , paste( Y$V1, Y$V2 ))

will be fast.
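
To get the looked-up values rather than the matching row numbers (a sketch,
assuming X and Y were built as intended above):

idx <- match( paste( X$c1, X$c2 ), paste( Y$V1, Y$V2 ) )
Y$V3[idx]   # same values as 'result' from the loop, in the same order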

If nrow(Y) is small,

which(
outer(Y$V1, as.character(X$c1), "==" ) &
outer(Y$V2, as.character(X$c2), "==" ),
  arr.ind = TRUE )

will also be quick.


Otherwise something like

unlist( lapply( paste( X$c1, X$c2 ), match, paste( Y$V1, Y$V2 )) )

may be a good bet.


Please learn to use the space key to format your code in a more readable 
fashion!


HTH,

Chuck


 ###
 sessionInfo()
 R version 2.5.1 (2007-06-27)
 i386-pc-mingw32

 locale:
 LC_COLLATE=Dutch_Netherlands.1252;LC_CTYPE=Dutch_Netherlands.1252;LC_MONETARY=Dutch_Netherlands.1252;LC_NUMERIC=C;LC_TIME=Dutch_Netherlands.1252

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods
 base

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Rmpi and x86

2007-08-27 Thread Edna Bell
Dear R Gurus:

Is there a problem with Rmpi on x86 with SUSE 10.1, please?

I've tried everything and it still won't load.

Has anyone else dealt with this please?

Thanks,
Edna Bell

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grouping scat1d/rug and plotting to 2 axes

2007-08-27 Thread Frank E Harrell Jr
Mike wrote:
 Hi,
 
 I'm wondering if anybody can offer a bit of  guidance on how to add a 
 couple of features to a plot. 
 
 I'm using Frank Harrell's Design library to model some survival data in 
 R (2.3.1, windows platform).  I'm fairly comfortable with the survival 
 modeling in Design, but am still at a frustratingly low level of 
 competence when it comes to creating anything beyond simple plots in R.
 
 A simplified version of the model is:
 
 fit - cph(Surv(survtime,deceased) ~ rcs(smw,4), 
 data=survdata,x=T,y=T,surv=T )
 
 And the basic plot is:
 
 plot(fit,smw=NA, fun=function(x) 1/(1+exp(-x)))

or plot(fit, smw=NA, fun=plogis).  But what does the logistic model have 
to do with the Cox model you fitted?  You can instead do plot(fit, 
smw=NA, time=1) to plot estimated 1-year survival prob.

 
 I know that if I add
 
 scat1d(smw)
 
 I get a nice jittered rug plot of all values of the predictor smw on the 
 top axis.
 
 What I'd like to do, however, is to plot on bottom axis the values of 
 smw for only those participants who are alive, and then on the top axis, 
 plot the values of smw for those who are deceased.  I'd appreciate any 
 tips as to how I might approach this.

That isn't so well defined because of variable follow-up time.  I would 
not get very much out of such a plot.

Frank

 
 Thanks,
 
 Mike Babyak
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 


-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] oddity with method definition

2007-08-27 Thread Faheem Mitha

Just wondered about this curious behaviour. I'm trying to learn about 
classes. Basically setMethod works the first time, but does not seem to 
work the second time.
 Faheem.
*
setClass("foo", representation(x="numeric"))

bar <- function(object)
   {
 return(0)
   }

bar.foo <- function(object)
   {
 print(object@x)
   }
setMethod("bar", "foo", bar.foo)

bar(f)

# bar(f) gives 1.

bar <- function(object)
   {
 return(0)
   }

bar.foo <- function(object)
   {
 print(object@x)
   }
setMethod("bar", "foo", bar.foo)

f = new("foo", x = 1)

bar(f)

# bar(f) gives 0, not 1.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] oddity with method definition

2007-08-27 Thread Duncan Murdoch
On 27/08/2007 5:47 PM, Faheem Mitha wrote:
 Just wondered about this curious behaviour. I'm trying to learn about 
 classes. Basically setMethod works the first time, but does not seem to 
 work the second time.
  Faheem.
 *
 setClass(foo, representation(x=numeric))
 
 bar - function(object)
{
  return(0)
}
 
 bar.foo - function(object)
{
  print([EMAIL PROTECTED])
}
 setMethod(bar, foo, bar.foo)

This changes the definition of bar:  now it becomes a generic function 
instead of a simple function.

 
 bar(f)
 
 # bar(f) gives 1.

(You forgot the f = new("foo", x = 1) line, but that's somewhat obvious.)
 
 bar - function(object)
{
  return(0)
}

Now bar is a regular function again.
 
 bar.foo - function(object)
{
  print([EMAIL PROTECTED])
}
 setMethod(bar, foo, bar.foo)

Now the generic would call that method, but you've wiped out the generic.

 
 f = new(foo, x= 1)
 
 bar(f)
 
 # bar(f) gives 0, not 1.

The problem is that setting a method on a regular function automagically 
creates a generic for it, but redefining a function doesn't remove the 
generic.  It's still there, somewhere in R's insides, and if you could 
find it to call it your method would get called.  But you're calling the 
plain old bar() instead.

This behaviour makes more sense if you think about generics in other 
packages.  There's a generic called show in the methods package.  But 
you can define your own function called show, and in your workspace, 
you'd want to call that, not the one from methods.

I'd recommend using setGeneric() to create a generic, rather than 
depending on the automatic creation, to avoid this kind of confusion.
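
A minimal sketch of that pattern:

setClass("foo", representation(x = "numeric"))
setGeneric("bar", function(object) standardGeneric("bar"))
setMethod("bar", "foo", function(object) print(object@x))
f <- new("foo", x = 1)
bar(f)   # 1; with an explicit generic, later masking is easier to notice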

Duncan Murdoch

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] oddity with method definition

2007-08-27 Thread Thomas Lumley
On Mon, 27 Aug 2007, Faheem Mitha wrote:


 Just wondered about this curious behaviour. I'm trying to learn about
 classes. Basically setMethod works the first time, but does not seem to
 work the second time.
 Faheem.
 *
 setClass(foo, representation(x=numeric))

 bar - function(object)
   {
 return(0)
   }

 bar.foo - function(object)
   {
 print([EMAIL PROTECTED])
   }
 setMethod(bar, foo, bar.foo)

 bar(f)

 # bar(f) gives 1.

Not for me. It gives
 bar(f)
Error: object f not found
Error in bar(f) : error in evaluating the argument 'object' in selecting a
method for function 'bar'

However, if I do
f = new("foo", x = 1)
first, it gives 1.

 bar - function(object)
   {
 return(0)
   }

Here you have masked the generic bar() with a new function bar(). Redefining 
bar() is the problem, not the second setMethod().

 bar.foo - function(object)
   {
 print([EMAIL PROTECTED])
   }
 setMethod(bar, foo, bar.foo)

Because there was a generic bar(), even though it is overwritten by the new 
bar(), setMethod() doesn't automatically create another generic.

 f = new(foo, x= 1)

 bar(f)

 # bar(f) gives 0, not 1.


Because bar() isn't a generic function
 bar
function(object)
   {
 return(0)
   }


If you had used setGeneric() before setMethod(), as recommended, your example 
would have done what you expected, but it would still have wiped out any 
previous methods for bar() -- eg, try
  setMethod("bar", "baz", function(object) print("baz"))
before you redefine bar(), and notice that getMethod("bar", "baz") no longer 
finds it.



-thomas

Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Calculating diameters of cirkels in a picture.

2007-08-27 Thread Moshe Olshansky
Hi Bart,

Let's assume that your situation was simpler - you have
a BW (Black and White) image containing circles (in
white) and you need to find the diameter of each
circle (and of course to know how many circles you
have). This can be done with labeling of connected
components. You say that two pixels are neighbors if
they have common edge (4-connectivity) or at least a
common vertex (8-connectivity). So now you can treat
your image (white pixels) as a graph (with edges
connecting any two neighbors). Then each connected
component of that graph corresponds to a circle. There
exists a well-known algorithm to do this. It takes the
original BW image (where every image pixel has the
value of 1 and background pixel the value of 0) and
produces an image where every background pixel still
has the value of 0, every pixel of the first connected
component has the value of 1, every pixel of the
second connected component has the value of 2, etc.
So now you can process each connected component (circle
in your case) separately.
Basically this is all you need. You can either count
the number of pixels having the value of k to find the
area (and then the diameter) or just take (maximal x
value) - (minimal x value) + 1.
In your case it can happen that after you convert your
image into BW image some circles will have holes
inside with some small objects inside these holes, and
you do not want to consider these small objects as
additional circles. So I thought of using
morphological closing to get rid of small holes, but
as I wrote in the following note you do not need this.
When you get the BW image take the complimentary one
(i.e. background pixels have the value of 1 and image
pixels the value of 0). Label the connected components
of the background. Only one of them is real background
- all others are inside circles. Real background
touches the image boundaries. Now go to the original
BW image and give all the pixels outside the real
background the value of 1. Now all your circles are
full (no holes) and you can proceed as above.
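
A sketch of the measuring step only, assuming the labeling above has already
produced a matrix 'lab' with 0 for background and 1, 2, ... for the circles:

areas <- table(lab[lab > 0])       # pixel count per connected component
diam  <- 2 * sqrt(areas / pi)      # diameters in pixels, from S = pi*R^2
# alternative: horizontal extent of each component
ext <- tapply(col(lab)[lab > 0], lab[lab > 0],
              function(ix) diff(range(ix)) + 1)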

Best regards,

Moshe.

--- Bartjoosen [EMAIL PROTECTED] wrote:

 
 Hi All,
 
 I really like to thank you for the answers, while I
 was searching for some
 edge detection and clustering algorithms, Moshe came
 with a simple but
 effective solution: use the area to find the
 diameter!
 
 But I tried Moshe's solution, but I couldn't figure
 out what you mean with
 morphological closing and the labeling to split the
 images.
 Could you please clarify this a bit?
 
 Thanks for your support
 
 
 Bart
 
 
 Moshe Olshansky-2 wrote:
  
  Hi Bart,
  
  One more comment:
  
  You do not really need the morphological closing
 to
  close the holes inside the circles. Another
  possibility is to reverse the black-and-withe
 picture,
  i.e. make the holes and background be 1 and the
  circles 0, label the connected components and then
  only the component which touches the boundaries is
 the
  background while all other components are holes
 and
  you can make them white (1) in the original
  black-and-white image.
  
  --- Moshe Olshansky [EMAIL PROTECTED] wrote:
  
  Hi Bart,
  
  I have never used image processing software in R
 (I
  was doing this with Matlab), but here is what I
  would
  have done algorithmically:
  1) convert the picture to gray-scale
  2) find a threshold value which separates the
  circles
  from the background and convert your image to
 black
  and white
  3) if the circles are far apart use morphological
  closing to fill in small holes inside the circles
  (may
  be do this several times)
  4) use labeling to split the image into connected
  components
  5) for each connected component get it's area
 (the
  number of pixels) and use the formula S = Pi*R^2
 to
  find the approximate radii.
  
  Regards,
  
  Moshe.
  
  --- Julian Burgos [EMAIL PROTECTED]
 wrote:
  
   Hi Bart,
   
   If you only have 36 circles, the fastest way
 would
   be to use some image 
   processing software and measure the circles by
   hand.  One option is to 
   use ImageJ, which you can download here
   
   http://rsb.info.nih.gov/ij/
   
   Julian
   
   Bart Joosen wrote:
Hi,
   
Maybe this is more a programming questions
 than
  a
   specific R-project question, but maybe there is
   someone who can point me in the right
 direction.
   
I have a picture of cirkels which I took with
 a
   digital camera.
Now I want to use the diameter of the cirkels
 on
   the picture for analysis in R.
I can use pixmap to import the picture, but
 how
  do
   I find the outside cirkels and calculate the
   diameter?
I pointed out that I can use the edci
 package,
  but
   then I need to preprocess the data to reduce
 the
   points, otherwise it takes a long time, and my
   computer crashes.
   
If you want to see such a picture, I cropped
 a
   larger one, and highlighted the cirkel which is
 of
   interest.
In a real world, this is a plate with 36
  cirkels,
   which all should be 

Re: [R] subset question

2007-08-27 Thread jim holtman
Here is one way of checking to see if a row contains a particular
value and setting the contents of a new column:

n <- 20
# create test data
x <- 
data.frame(sample(letters,n),sample(letters,n),sample(letters,n),sample(letters,n))
# add a column indicating if the row contains 'a', 'b' or 'c'
x$a <- apply(x[, 1:4], 1, function(.row) any(.row %in% c('a','b','c'))) + 0
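
Applied to the layout in the original question (columns 9-67, and assuming the
code 12345 is stored as a number):

data$flag <- apply(data[, 9:67], 1, function(.row) any(.row %in% 12345)) + 0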


On 8/27/07, Kirsten Beyer [EMAIL PROTECTED] wrote:
 I would like to code records in a dataset with a 1 if any of the
 columns 9-67 contain a particular code, and zero if they don't.  I've
 been working with subset and it seems that something like
 subset(data, data[9:67]--12345) would work, but I have been
 unsuccessful so far.  It seems like a simple problem - any help is
 appreciated!

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] validate (package Design): error message subscript out of bounds

2007-08-27 Thread Frank E Harrell Jr
Wentzel-Larsen, Tore wrote:
 Dear R users 
 
 I use Windows XP, R2.5.1 (I have read the posting guide, I have 
 contacted the package maintainer first, it is not homework).
 
 In a research project on renal cell carcinoma we want to compute 
 Harrell's c index, with optimism correction, for a multivariate 
 Cox regression and also for some univariate Cox models.
 For some of these univariate models I have encountered an error
 message (and no result produced) from the function validate i 
 Frank Harrell's Design package:
 
 Error in Xb(x[, xcol, drop = FALSE], coef, non.slopes, non.slopes.in.x,  : 
 subscript out of bounds
 
 The following is an artificial example wherein I have been able to 
 reproduce this error message (actual data has been changed to preserve
 confidentiality):

I could not reproduce the error on R 2.5.1 on linux using version 2.0-12 
of Design (you did not provide this information).

Your code involved a good deal of extra typing.  Here is a streamlined 
version:

bc <- data.frame(time1 = c(9,24,28,43,58,62,66,107,116,118,123,
127,129,131,137,138,139,140,148,169,176,179,188,196,210,218,

bc

library(Design)

dd <- with(bc, datadist(bc1, age, adjto.cat='first'))
options(datadist = 'dd')

f <- cph(Surv(time1,status1) ~ bc1,
  data = bc, x=TRUE, y=TRUE, surv=TRUE)
anova(f)
f
summary(f)

val <- validate(f, B=200, dxy=TRUE)

I don't get much value of putting the type of an object as part of the 
object's name, as information within objects defines the object type/class.

There is little reason to validate a one degree of freedom model.

Frank

 
 library(Design)
 
 # an example data frame:
 frame.bc - data.frame(time1 = c(9,24,28,43,58,62,66,107,116,118,123,
   127,129,131,137,138,139,140,148,169,176,179,188,196,210,218,
   1,1,1,2,2,3,4,8,23,32,33,34,43,44,48,51,52,54,59,59,60,60,62,
   65,65,68,70,72,73,74,81,84,88,98,99,106,107,115,115,117,119,
   120,122,122,122,122,126,128,130,135,136,136,138,149,151,154,
   157,159,161,164,164,164,166,172,172,176,179,180,183,183,184,
   187,190,197,201,201,203,203,203,209,210,214,219,227,233,4,18,
   49,113,147,1,1,2,2,2,2,2,3,4,6,6,6,6,6,6,6,6,9,9,9,9,9,10,10,
   10,11,12,12,12,13,14,14,17,18,18,19,19,20,20,21,21,21,21,22,23,
   23,24,28,28,29,29,32,34,35,38,38,48,48,52,52,54,54,56,64,67,67,
   69,70,70,72,84,88,90,114,115,140,142,154,171,195),
   status1 = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
   0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
   1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
   1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
   1,1,1,1,1),
   bc1 = factor(c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2,2,
   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,
   2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2),
   labels=c('bc.1','bc.2')),
   age = c(58,68,23,20,50,43,41,69,20,48,19,27,39,20,65,49,70,59,31,43,25,
   61,60,45,34,59,32,58,30,62,26,44,52,29,40,57,33,18,50,50,55,51,38,34,
   69,56,67,38,66,21,48,39,62,62,29,68,66,19,60,39,55,42,24,29,56,61,40,
   52,19,40,33,67,66,51,48,63,60,58,68,60,53,20,45,62,37,38,61,63,43,67,
   49,39,43,67,49,69,32,37,32,63,33,47,66,39,23,57,26,61,20,49,69,30,40,
   29,38,66,60,69,69,44,65,25,41,53,18,55,45,59,49,27,51,29,67,26,24,26,
   47,23,50,27,35,45,32,26,45,45,63,39,39,22,38,27,31,27,49,65,66,49,39,
   21,51,49,55,63,19,26,50,21,24,34,65,33,55,33,36,53,48,25,54,58,60,34,
   47,23,34,60,39,34,22,30,41,55,64,48,34,54))
 frame.bc
 
 # preparing for a simple univariate Cox regression:
 dd.bc - datadist(frame.bc[, c('bc1','age')], adjto.cat='first')
 options(datadist = 'dd.bc')
 
 # a univariate Cox regression:
 cph.bc - cph(formula = Surv(time1,status1)~bc1,
   data = frame.bc, x=TRUE, y=TRUE, surv=TRUE)
 anova(cph.bc)
 cph.bc
 summary(cph.bc)
 
 # the validate command for the Cox model:
 val.cph.bc - validate(cph.bc, B=200, dxy=TRUE , pr=TRUE)
 
 --
 Output from the validate command:
 
training   test
 Dxy   -0.124360 -0.1423409
 R2 1.00  1.000
 Slope  1.00  0.7919584
 D  0.016791  0.0147536
 U -0.002395  0.0006448
 Q  0.019186  0.0141088
training   test
 Dxy   -0.191875 -0.1423409
 R2 1.00  1.000
 Slope  1.00  0.8936724
 D  0.022397  0.0147536
 U -0.002339  0.0001367
 Q  0.024736  0.0146169
training   test
 Dxy   -0.199514 -0.1423409
 R2 1.00  

Re: [R] How to provide argument when opening RGui from an external application

2007-08-27 Thread Sébastien
Thanks everyone. I actually thought about ?Rscript.exe but, having used 
only Rgui, I thought it was an instruction specific to this interface. I 
will look into it.

Sebastien

Gabor Grothendieck a écrit :
 There are also some batch files that can be used with Rscript on XP and info
 in the README here:

http://batchfiles.googlecode.com


 On 8/26/07, Sébastien [EMAIL PROTECTED] wrote:
   
 Thanks for your reply.
 When you say look into Rscript.exe, do you have a specific document in
 mind ? I tried to google it but could not find much... I forgot to
 mention in my first email that I am working under the Windows XP
 environment.

 Prof Brian Ripley a écrit :
 
 Look into Rscript.exe (on Windows), which is a flexible way to run
 scripts.  Neither using a GUI nor using source() are recommended.

 On Fri, 24 Aug 2007, Sébastien wrote:

   
 Dear R-users,

 I have written a small application (in visual basic) that automatically
 generate some R scripts. I would like to execute these scripts when my
 application is being closed.
 My problem is that I don't know how to pass the
 'source(c:/.../myscript.r)' instruction when I programmatically start
 RGui. Tinn-R is capable of doing such things, so I guess there must be a
 way to pass arguments to RGui.

 Any advice or link to relevant references would be greatly appreciated.

 Sebastien
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 


   

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Excel

2007-08-27 Thread David Scott

A common process when data is obtained in an Excel spreadsheet is to save 
the spreadsheet as a .csv file then read it into R. Experienced users 
might have learned to be wary of dates (as I have) but possibly have not 
experienced what just happened to me. I thought I might just share it with 
r-help as a cautionary tale.

I received an Excel file giving patient details. Each patient had an ID 
code in the form of three letters followed by four digits. (Actually a New 
Zealand National Health Identification.) I saved the .xls file as .csv. 
Then I opened up the .csv (with Excel) to look at it. In the column of ID 
codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.

In a column of character data, Excel had interpreted AUG2699 as a date.

The .csv did not actually have a date in that cell, but if I had saved the 
.csv file it would have.

David Scott

_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem with lme using glht for multiple comparisons

2007-08-27 Thread Christian Kost
Hi everyone,

I am new to R and have a question that relates to unplanned post-hoc 
comparisons using the multcomp package after a mixed effects model. I couldn't 
find the answer to it in the archive or in any manual. 

I have a dataset in which several plants have been treated in a particular way 
and a continuous response variable has been measured depending on several 
leaves per plant. I am now interested in the effect of the treatment depending 
on the age of the leaves examined. So the dataset (L1) consists of a continuous 
response variable (EFN), a fixed factor (Leafage), and a random factor (Plant).

I have set up the following mixed effects model, which works fine:


  LM <- lme(EFN~Leafage, L1, ~1|Plant)
   

Now all I want to do is a post-hoc analysis (multiple comparisons) for the 
fixed factor EFN. I tried the following code. According to the documentation 
this should work:


  Post <- glht(LM, linfct = mcp(Leafage = "Tukey"))
   

However, I get this error message and don't know what to do:

Error in mcp2matrix(model, linfct = linfct) : 
Factor(s) Leafage have been specified in ‘linfct’ but cannot be found 
in ‘model’!


The factor is specified, right? So what is the problem? If I do the same with 
an normal Anova (command: aov), it works. What is the problem with the lme 
command?

Thank you very much in advance for your help.

Cheers,


Christian

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Excel

2007-08-27 Thread Robert A LaBudde
If you format the column as Text, you won't have this problem. By 
leaving the cells as General, you leave it up to Excel to guess at 
the correct interpretation.

You will note that the conversion to a date occurs immediately in 
Excel when you enter the value. There are many formats to enter dates.

Either pre-format the column as Text, or prefix the individual entry 
with an ' to indicate text.

A similar problem occurs in R's read.table() function when a factor 
has levels that can be interpreted as numbers.
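
A sketch of the corresponding safeguard on the R side (the file name is only
an example):

pat <- read.csv("patients.csv", colClasses = "character")  # every column as text
# or keep character columns from becoming factors:
pat <- read.csv("patients.csv", as.is = TRUE)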

At 10:11 PM 8/27/2007, David wrote:

A common process when data is obtained in an Excel spreadsheet is to save
the spreadsheet as a .csv file then read it into R. Experienced users
might have learned to be wary of dates (as I have) but possibly have not
experienced what just happened to me. I thought I might just share it with
r-help as a cautionary tale.

I received an Excel file giving patient details. Each patient had an ID
code in the form of three letters followed by four digits. (Actually a New
Zealand National Health Identification.) I saved the .xls file as .csv.
Then I opened up the .csv (with Excel) to look at it. In the column of ID
codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.

In a column of character data, Excel had interpreted AUG2699 as a date.

The .csv did not actually have a date in that cell, but if I had saved the
.csv file it would have.

David Scott


Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
Least Cost Formulations, Ltd.URL: http://lcfltd.com/
824 Timberlake Drive Tel: 757-467-0954
Virginia Beach, VA 23464-3239Fax: 757-467-2947

Vere scire est per causas scire

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Excel

2007-08-27 Thread David Scott
On Tue, 28 Aug 2007, Robert A LaBudde wrote:

 If you format the column as Text, you won't have this problem. By
 leaving the cells as General, you leave it up to Excel to guess at
 the correct interpretation.


Not true actually. I had converted the column to Text because I saw the 
interpretation as a date in the .xls file. I saved the .csv file *after* 
the column had been converted to Text. Looking at the .csv file in a text 
editor, the entry is correct.

I have just rechecked this.

On reopening the .csv using Excel, the entry AUG2699 had been interpreted 
as a date, and was showing as Aug-99. Most bizarre is that the NHI value 
of AUG1838 has *not* been interpreted as a date.

David Scott


 You will note that the conversion to a date occurs immediately in
 Excel when you enter the value. There are many formats to enter dates.

 Either pre-format the column as Text, or prefix the individual entry
 with an ' to indicate text.

 A similar problem occurs in R's read.table() function when a factor
 has levels that can be interpreted as numbers.

 At 10:11 PM 8/27/2007, David wrote:

 A common process when data is obtained in an Excel spreadsheet is to save
 the spreadsheet as a .csv file then read it into R. Experienced users
 might have learned to be wary of dates (as I have) but possibly have not
 experienced what just happened to me. I thought I might just share it with
 r-help as a cautionary tale.

 I received an Excel file giving patient details. Each patient had an ID
 code in the form of three letters followed by four digits. (Actually a New
 Zealand National Health Identification.) I saved the .xls file as .csv.
 Then I opened up the .csv (with Excel) to look at it. In the column of ID
 codes I saw: Aug-99. Clicking on that entry it showed 1/08/2699.

 In a column of character data, Excel had interpreted AUG2699 as a date.

 The .csv did not actually have a date in that cell, but if I had saved the
 .csv file it would have.

 David Scott

 
 Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail: [EMAIL PROTECTED]
 Least Cost Formulations, Ltd.URL: http://lcfltd.com/
 824 Timberlake Drive Tel: 757-467-0954
 Virginia Beach, VA 23464-3239Fax: 757-467-2947

 Vere scire est per causas scire

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


_
David Scott Department of Statistics, Tamaki Campus
The University of Auckland, PB 92019
Auckland 1142,NEW ZEALAND
Phone: +64 9 373 7599 ext 86830 Fax: +64 9 373 7000
Email:  [EMAIL PROTECTED]

Graduate Officer, Department of Statistics
Director of Consulting, Department of Statistics

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Excel

2007-08-27 Thread Moshe Olshansky
As far as I understand, changing the format changes
the way data is displayed by Excel but this does not
change the data itself - if while reading the data
Excel decided that it was a date, it is being
converted to an integer (the number of days since
January 1, 1900 - and they mistakenly think that 1900
was a leap year) and it is stored this way.
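
If such a value does reach R as an Excel day count, a sketch of the usual
back-conversion (the 1899-12-30 origin absorbs Excel's spurious 29 Feb 1900):

x <- 39321                          # an Excel serial day number (example value)
as.Date(x, origin = "1899-12-30")   # converts the serial number to a Date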

--- David Scott [EMAIL PROTECTED] wrote:

 On Tue, 28 Aug 2007, Robert A LaBudde wrote:
 
  If you format the column as Text, you won't have
 this problem. By
  leaving the cells as General, you leave it up to
 Excel to guess at
  the correct interpretation.
 
 
 Not true actually. I had converted the column to
 Text because I saw the 
 interpretation as a date in the .xls file. I saved
 the .csv file *after* 
 the column had been converted to Text. Looking at
 the .csv file in a text 
 editor, the entry is correct.
 
 I have just rechecked this.
 
 On reopening the .csv using Excel, the entry AUG2699
 had been interpreted 
 as a date, and was showing as Aug-99. Most bizarre
 is that the NHI value 
 of AUG1838 has *not* been interpreted as a date.
 
 David Scott
 
 
  You will note that the conversion to a date occurs
 immediately in
  Excel when you enter the value. There are many
 formats to enter dates.
 
  Either pre-format the column as Text, or prefix
 the individual entry
  with an ' to indicate text.
 
  A similar problem occurs in R's read.table()
 function when a factor
  has levels that can be interpreted as numbers.
 
  At 10:11 PM 8/27/2007, David wrote:
 
  A common process when data is obtained in an
 Excel spreadsheet is to save
  the spreadsheet as a .csv file then read it into
 R. Experienced users
  might have learned to be wary of dates (as I
 have) but possibly have not
  experienced what just happened to me. I thought I
 might just share it with
  r-help as a cautionary tale.
 
  I received an Excel file giving patient details.
 Each patient had an ID
  code in the form of three letters followed by
 four digits. (Actually a New
  Zealand National Health Identification.) I saved
 the .xls file as .csv.
  Then I opened up the .csv (with Excel) to look at
 it. In the column of ID
  codes I saw: Aug-99. Clicking on that entry it
 showed 1/08/2699.
 
  In a column of character data, Excel had
 interpreted AUG2699 as a date.
 
  The .csv did not actually have a date in that
 cell, but if I had saved the
  .csv file it would have.
 
  David Scott
 
 


  Robert A. LaBudde, PhD, PAS, Dpl. ACAFS  e-mail:
 [EMAIL PROTECTED]
  Least Cost Formulations, Ltd.URL:
 http://lcfltd.com/
  824 Timberlake Drive Tel:
 757-467-0954
  Virginia Beach, VA 23464-3239Fax:
 757-467-2947
 
  Vere scire est per causas scire
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained,
 reproducible code.
 
 

_
 David Scott   Department of Statistics, Tamaki Campus
   The University of Auckland, PB 92019
   Auckland 1142,NEW ZEALAND
 Phone: +64 9 373 7599 ext 86830   Fax: +64 9 373 7000
 Email:[EMAIL PROTECTED]
 
 Graduate Officer, Department of Statistics
 Director of Consulting, Department of Statistics
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained,
 reproducible code.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.