Re: [R] Removing duplicated rows within a matrix, with missing data as wildcards

2007-03-09 Thread hpages
Quoting Petr Pikal [EMAIL PROTECTED]:

 Hi
 
 It's a bit tricky, but
 
 dup <- apply(x, 2, duplicated) # which entries are duplicated within their column
 isna <- apply(x, 2, is.na)     # which entries are NA
 check <- dup | isna            # which entries are either
 
 and here is your result
 
 x[rowSums(check) != 3, ]
      [,1] [,2] [,3]
 [1,]    1    3    2
 [2,]    2    1    3
 [3,]    3    2   NA

Hi,

The above doesn't work, and you don't even need NAs in x to see why:

  > x <- matrix(c(2,2,1,3,2,3), ncol=2, byrow=TRUE)
  > x
       [,1] [,2]
  [1,]    2    2
  [2,]    1    3
  [3,]    2    3

  > dup <- apply(x, 2, duplicated)
  > x[rowSums(dup) != 2, ]
       [,1] [,2]
  [1,]    2    2
  [2,]    1    3

Look at 'dup':

  > dup
        [,1]  [,2]
  [1,] FALSE FALSE
  [2,] FALSE FALSE
  [3,]  TRUE  TRUE

Yes, each element in the last row is a duplicate in its own col,
but this doesn't mean that the row as a whole is a duplicate.
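
To make the distinction concrete, here is a minimal sketch using the same 3 x 2 matrix. It compares the column-wise test with duplicated() applied to the matrix itself, which (as far as I know) compares whole rows by default. The names dup_by_col and dup_by_row are just illustrative.

```r
# Same counterexample as above: no row is repeated, but row 3 reuses
# a value from each column.
x <- matrix(c(2, 2, 1, 3, 2, 3), ncol = 2, byrow = TRUE)

# Column-wise test: flags an entry if the same value appeared earlier
# in its own column.
dup_by_col <- apply(x, 2, duplicated)

# Row-wise test: duplicated() on a matrix compares entire rows.
dup_by_row <- duplicated(x)

print(dup_by_col)
print(dup_by_row)

# Dropping rows by the row-wise test keeps all three rows here.
kept <- x[!dup_by_row, , drop = FALSE]
```

Only the row-wise test answers the question "is this row a repeat of an earlier row?".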

Cheers,
H.


 
 
 Regards
 Petr
 
 
 
 
 On 8 Mar 2007 at 10:14, stacey thompson wrote:
 
 Date sent: Thu, 8 Mar 2007 10:14:37 -0500
 From:      stacey thompson [EMAIL PROTECTED]
 To:        r-help@stat.math.ethz.ch
 Subject:   [R] Removing duplicated rows within a matrix,
            with missing data as wildcards
 
  I'd like to remove duplicated rows within a matrix, with missing data
  being treated as wildcards.
  
  For example
  
  > x <- matrix((1:3), 5, 3)
  > x[4,2] = NA
  > x[3,3] = NA
  > x
  
       [,1] [,2] [,3]
  [1,]    1    3    2
  [2,]    2    1    3
  [3,]    3    2   NA
  [4,]    1   NA    2
  [5,]    2    1    3
  
  I would like to obtain
  
       [,1] [,2] [,3]
  [1,]    1    3    2
  [2,]    2    1    3
  [3,]    3    2   NA
  
  From the R-help archives, I learned about unique(x) and
  duplicated(x).
  However, unique(x) returns
  
  > unique(x)
  
       [,1] [,2] [,3]
  [1,]    1    3    2
  [2,]    2    1    3
  [3,]    3    2   NA
  [4,]    1   NA    2
  
  and duplicated(x) gives
  
  > duplicated(x)
  
  [1] FALSE FALSE FALSE FALSE  TRUE
  
  I have tried various na.action settings, but with unique(x) I get errors at
  best.
  
  e.g.
  > unique(x, na.omit(x))
  
  Error: argument 'incomparables != FALSE' is not used (yet)
  
  How might I tackle this?
  
  Thanks,
  
  -stacey
  
  -- 
  -stacey lee thompson-
  Stagiaire post-doctorale
  Institut de recherche en biologie végétale
  Université de Montréal
  4101 Sherbrooke Est
  Montréal, Québec H1X 2B2 Canada
  [EMAIL PROTECTED]
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
  http://www.R-project.org/posting-guide.html and provide commented,
  minimal, self-contained, reproducible code.
 
 Petr Pikal
 [EMAIL PROTECTED]
 


[R] pdf device bounding box?

2007-03-09 Thread Michael Toews
I apologize if I don't fully understand your question, but the pdf 
device has a MediaBox, which is equivalent to the BoundingBox in an EPS 
file. The PDFs from R are defined nicely using height/width dimensions, 
and work well when embedded with pdflatex, etc. For example:

pdf("test.pdf", height=3, width=3)
plot(1:10)
dev.off()  # close the device so the file is complete

and view the (partially binary) output in your shell:
less -N test.pdf

on line 117 of this file, I see /MediaBox [0 0 216 216] which is a 3in 
by 3in box measured in PostScript points.

I don't understand how you are mixing this in with the epstopdf command. 
If you want to make both a PDF and EPS, my best advice is to do both 
directly from R (see ?postscript for EPS file generation .. the same 
example as above will have %%BoundingBox: 0 0 216 216 on line 10), and 
your output  for both formats should be clean, simple, and good enough 
for publishers and everyone else to use.
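
As a rough sketch of doing both directly from R, the following writes the same 3in by 3in plot to a PDF and to an EPS file and then pulls out the MediaBox and BoundingBox lines. The temporary file names are arbitrary, and compress = FALSE is my own addition (it assumes a reasonably recent R) so that the PDF stays readable as text.

```r
# Temporary output files; the names are arbitrary.
pdf_file <- tempfile(fileext = ".pdf")
eps_file <- tempfile(fileext = ".eps")

# PDF: 3in x 3in page, uncompressed so the page dictionary is plain text.
pdf(pdf_file, height = 3, width = 3, compress = FALSE)
plot(1:10)
invisible(dev.off())

# EPS: these three settings are what ?postscript describes for EPS output.
postscript(eps_file, height = 3, width = 3,
           horizontal = FALSE, onefile = FALSE, paper = "special")
plot(1:10)
invisible(dev.off())

# Look for the box definitions in the generated files.
pdf_lines <- readLines(pdf_file, warn = FALSE)
eps_lines <- readLines(eps_file, warn = FALSE)
media_box    <- grep("/MediaBox", pdf_lines, value = TRUE,
                     fixed = TRUE, useBytes = TRUE)
bounding_box <- grep("^%%BoundingBox", eps_lines, value = TRUE)

print(media_box)
print(bounding_box)
```

Both boxes are measured in PostScript points (72 per inch), so a 3in side shows up as 216.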

Just one caution: if you have a Windows computer and R < 2.5.1 (which is 
most of us), make sure you write EPS files before opening a PDF 
device (PR#9517).



Re: [R] help with zicounts

2007-03-09 Thread Achim Zeileis
Jaap:

 I have simulated data from a zero-inflated Poisson model, and would like
 to use a package like zicounts to test my code of fitting the model.
 My question is: can I use zicounts directly with the following simulated
 data?

I guess you can use zicounts, but personally I'm more familiar with 
zeroinfl() from package pscl (because I have written this function :)). 
With that you can easily do:

 beta.true <- 1.0
 gamma.true <- 1.0
 n <- 1000
 x <- matrix(rnorm(n), n, 1)
 pi <- expit(x*beta.true) # expit() = inverse logit; not in base R (plogis() is equivalent)
 mu <- exp(x*gamma.true)
 y <- numeric(n) # blank vector
 z <- (runif(n) < pi) # logical: T with prob p_i, F otherwise
 y[z] <- rpois(sum(z), mu[z]) # draw y_i ~ Poisson(mu_i) where z_i = T
 y[!z] <- 0 # set y_i = 0 where z_i = F

library(pscl)
zeroinfl(y ~ 0 + x | 0 + x)

which by default fits a ZIP (with log link and logit inflation).
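
For anyone who wants to run this end to end, here is a self-contained sketch of the same simulation using only base R. I have swapped expit() for plogis(), renamed pi to p so the built-in constant is not masked, and picked an arbitrary seed. The fitting step only runs if pscl happens to be installed.

```r
set.seed(42)                  # arbitrary seed, only for reproducibility
n          <- 1000
beta.true  <- 1.0             # logit-scale slope for P(z_i = TRUE)
gamma.true <- 1.0             # log-scale slope for the Poisson mean

x  <- rnorm(n)
p  <- plogis(x * beta.true)   # inverse logit, plays the role of expit()
mu <- exp(x * gamma.true)

z <- runif(n) < p             # TRUE: y_i is drawn from the Poisson part
y <- numeric(n)               # starts as all zeros
y[z] <- rpois(sum(z), mu[z])  # rows with z_i = FALSE stay at 0

zero_share <- mean(y == 0)    # structural zeros plus Poisson zeros

if (requireNamespace("pscl", quietly = TRUE)) {
  fit <- pscl::zeroinfl(y ~ 0 + x | 0 + x)
  # As I understand it, the zero part models the probability of an
  # excess zero, i.e. 1 - p here, so its slope should come out near
  # -beta.true rather than +beta.true.
  print(coef(fit))
}
```

The share of zeros is always at least the share of structural zeros, since Poisson draws can also be zero.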

hth,
Z



[R] GLMM in lme4 and Tweedie dist.

2007-03-09 Thread [EMAIL PROTECTED]
Hi there,
I've been wanting to fit a GLMM and I'm not completely sure I'm doing 
things right. As I said in a previous message my response variable is 
continuous with many zeros, so I was having a hard time finding an 
appropriate error distribution. I read some previous help mails given to 
other people advising them to use the Tweedie distribution. I'm still 
not sure if this would be appropriate for my data set, for I'm a 
beginner and really don't follow all the details. So I ran a GLMM using 
this distribution. I ran it for several models to do later model 
selection with AIC. I used the following script, where the file 
GLMM_tweedie (line 2) has a list of all the models I want to run, each 
one in the form [ x=lmer(GGgiv ~ Rank_1 + Rank_diff + DAI + 
Gen_dy*Rank_diff + Gen_dy*DAI + Gen_dy + (1| D_1) + (1| D_2), family = 
tweedie(var.power=1,link.power=0), offset=log(Dt), data=data) ]

  data=read.csv(file="GLMM_data.csv")
  models <- read.table("GLMM_tweedie.txt", sep="\t")
  data$Ggrec_Dtlog = log(data$Ggrec_Dt+1)
  models <- as.vector(models[,1])
  totres=c()
  for (i in 1:79) {
    model=models[i]
    res=eval(parse(text=model))  # each model string assigns the fit to x
    res=AIC(logLik(x))
    res=as.vector(res)
    totres=rbind(totres,res)
  }

The output would then be just a list of all the AIC of each model. For 1 
of the models (the one in the [] above) I'm getting the following error 
message, which I don't know what it means:

CHOLMOD warning: matrix not positive definite
Error in objective(.par, ...) : Cholmod error `matrix not positive 
definite' at file:../Supernodal/t_cholmod_super_numeric.c, line 614
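
A loop like the one above stops at the first model that throws an error. As a hedged sketch of one way around that (not the poster's setup: it uses glm() on the built-in mtcars data instead of lmer() with a Tweedie family, and the formulas are made up), each fit can be wrapped in tryCatch() so a failing model yields NA instead of aborting the run:

```r
# Hypothetical candidate models; the last one is deliberately broken.
model_formulas <- c(
  "mpg ~ wt",
  "mpg ~ wt + hp",
  "mpg ~ wt * hp",
  "mpg ~ no_such_column"
)

# Fit one model and return its AIC, or NA if the fit fails.
fit_aic <- function(f) {
  tryCatch(
    AIC(glm(as.formula(f), family = gaussian(), data = mtcars)),
    error = function(e) NA_real_
  )
}

aic_table <- data.frame(
  model = model_formulas,
  AIC   = vapply(model_formulas, fit_aic, numeric(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

# Sort by AIC; failed fits (NA) end up last.
aic_table <- aic_table[order(aic_table$AIC), ]
print(aic_table)
```

Passing formulas through as.formula() also avoids eval(parse(text = ...)), although that is a matter of taste.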

Could anybody give me some advice on using Tweedie distributions, and 
does anybody have an idea what this error message means?
Thanks a lot in advance,
Cheers,
Cristina.



Re: [R] dendrogram / clusteranalysis plotting

2007-03-09 Thread Gavin Simpson
On Fri, 2007-03-09 at 01:01 +0100, bunny , lautloscrew.com wrote:
 Dear all,
 
 i performed a clusteranalysis - which worked so far...
 i plotted the dendrogram and sooo many branches, a rough sketch would  
 be enough ;)
 
 i tried max.levels therefore which worked, but not for the plot...

(re-)read ?dendrogram. function cut.dendrogram() can prune a tree's
lower branches. You can plot the returned object's $upper component,
which is itself an object of class dendrogram.

There is an example in ?dendrogram of using cut.
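
For reference, a minimal sketch along the lines of that help-page example, using the built-in USArrests data. The cut height of 70 and the choice of 4 groups are only illustrative values for this data set.

```r
hc   <- hclust(dist(USArrests), method = "average")
dend <- as.dendrogram(hc)

# Cut the tree at height 70: $upper is the truncated tree,
# $lower is a list holding the branches below the cut.
pruned <- cut(dend, h = 70)

# Plot only the upper part; skipped when no interactive session is running.
if (interactive()) {
  plot(pruned$upper, main = "Upper part of the tree only")
}

# If the aim is simply a few bigger clusters, cutree() on the hclust
# object returns a group label per observation.
groups <- cutree(hc, k = 4)
print(table(groups))
```

Note that cut() works on the dendrogram object, while cutree() works on the hclust object.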

HTH

G

 
 i used the following
 
 plot(hcd,nodePar =nP, str(hcd,max.level=1))
 
 the output on the terminal was:
 
 --[dendrogram w/ 2 branches and 196 members at h = 2.70]
|--[dendrogram w/ 2 branches and 34 members at h = 1.79] ..
`--[dendrogram w/ 2 branches and 162 members at h = 1.95] ..
 
 which is great !
 
 but i cant get it done for the plot, the plot always shows all the  
 branches...!
 does anybody know how to fix this one ?
 
 thx in advance
 
 -m.
 
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [t] +44 (0)20 7679 0522
ECRC  [f] +44 (0)20 7679 0565
UCL Department of Geography
Pearson Building  [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street
London, UK[w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT  [w] http://www.freshwaters.org.uk/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] how can i group branches of a dendrogram

2007-03-09 Thread Gavin Simpson
On Fri, 2007-03-09 at 02:00 +0100, bunny , lautloscrew.com wrote:
 Hi all,
 
 how can i group branches of a dendrogram ?

Err... you'll need to give us more than that to go on. What do you mean
by group? Draw a marker round broad clusters, or prune them? Or
something else? I just replied with an answer that deals with pruning
back objects of class dendrogram, but if this is not what you mean in
this mail, reply with an example of what you tried and a description of
exactly what you want to do, and maybe someone can help.

G

 
 thx in advance
 


Re: [R] Removing duplicated rows within a matrix, with missing data as wildcards

2007-03-09 Thread hpages
Hi again,

Your problem as you formulated it is not clearly defined.
For example, what do you want to do with this matrix:

  > x <- matrix(c(1, NA, 3, NA, 2, 3), ncol=3, byrow=TRUE)
  > x
       [,1] [,2] [,3]
  [1,]    1   NA    3
  [2,]   NA    2    3

Remove row 1, row 2 or nothing?

Maybe you want to proceed in 2 steps:
  (1) remove strict duplicated rows
  (2) remove rows with at least 1 NA that match a row with no NAs

In this case you would not remove any row from x.

The removeLooseDupRows() function below does (2) only. If you
want (1) and (2), you need to combine it with unique() by doing
either removeLooseDupRows(unique(x)) or unique(removeLooseDupRows(x))
(both should always give the same result).

removeLooseDupRows <- function(x)
{
    if (nrow(x) <= 1)
        return(x)
    ## Sort rows lexicographically by column; order() puts NAs last
    ii <- do.call(order,
                  args=lapply(seq_len(ncol(x)),
                              function(col) x[ , col]))
    dup_index <- logical(nrow(x))
    i0 <- -1   # most recent NA-free row seen in sorted order
    for (k in 1:length(ii)) {
        i <- ii[k]
        if (any(is.na(x[i, ]))) {
            if (i0 == -1)
                next
            ## compare on the non-NA positions only
            if (any(x[i, ] != x[i0, ], na.rm=TRUE))
                next
            dup_index[i] <- TRUE
        } else {
            i0 <- i
        }
    }
    x[!dup_index, ]
}

  > x <- matrix((1:3), 5, 3)
  > x[4,2] = NA
  > x[3,3] = NA
  > x
       [,1] [,2] [,3]
  [1,]    1    3    2
  [2,]    2    1    3
  [3,]    3    2   NA
  [4,]    1   NA    2
  [5,]    2    1    3

  > removeLooseDupRows(x)
       [,1] [,2] [,3]
  [1,]    1    3    2
  [2,]    2    1    3
  [3,]    3    2   NA
  [4,]    2    1    3

  > removeLooseDupRows(unique(x))
       [,1] [,2] [,3]
  [1,]    1    3    2
  [2,]    2    1    3
  [3,]    3    2   NA
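
For comparison, here is a different, brute-force sketch of the same two-step rule. It is not the function above: instead of sorting, it checks every row containing NAs against every NA-free row, which is slower but does not depend on sort order. The name remove_wildcard_dups is hypothetical, and anyNA() assumes a reasonably recent R.

```r
remove_wildcard_dups <- function(x) {
  x <- unique(x)                       # step (1): strict duplicates
  has_na <- apply(x, 1, anyNA)
  # With no NA rows, or no NA-free rows, there is nothing to match against.
  if (!any(has_na) || all(has_na)) return(x)
  complete <- x[!has_na, , drop = FALSE]
  # Step (2): a row with NAs is redundant if some NA-free row agrees
  # with it on every position where it is not NA.
  redundant <- vapply(seq_len(nrow(x)), function(i) {
    if (!has_na[i]) return(FALSE)
    known <- !is.na(x[i, ])
    any(apply(complete, 1, function(r) all(r[known] == x[i, known])))
  }, logical(1))
  x[!redundant, , drop = FALSE]
}

# The example from the original question.
x <- matrix(1:3, 5, 3)
x[4, 2] <- NA
x[3, 3] <- NA
res <- remove_wildcard_dups(x)
print(res)

# The ambiguous case: both rows contain NAs, so neither is removed.
ambiguous <- matrix(c(1, NA, 3, NA, 2, 3), ncol = 3, byrow = TRUE)
print(remove_wildcard_dups(ambiguous))
```

On the question's matrix this keeps the same three rows as removeLooseDupRows(unique(x)).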


Cheers,
H.



[R] Problem with ci.lmer() in package:gmodels

2007-03-09 Thread Michael Kubovy
Dear Friends,

Please note that in the following output CI lower > CI upper:

  > require(lmer)
  > require(gmodels)
  > fm2 <- lmer(Reaction ~ Days + (1|Subject) + (0+Days|Subject),
  +             sleepstudy)
  > ci(fm2)
               Estimate  CI lower   CI upper Std. Error p-value
  (Intercept) 251.66693 266.06895 238.630280   7.056447       0
  Days         10.52773  13.63372   7.389946   1.646900       0



_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400Charlottesville, VA 22904-4400
Parcels:Room 102Gilmer Hall
 McCormick RoadCharlottesville, VA 22903
Office:B011+1-434-982-4729
Lab:B019+1-434-982-4751
Fax:+1-434-982-4766
WWW:http://www.people.virginia.edu/~mk9y/



[R] Is the gmodels package being maintained?

2007-03-09 Thread Michael Kubovy
Dear r-helpers,

I sent  a cc of a recent message about a problem with ci.lmer() in  
the gmodels package to the author (Gregory R Warnes), and the message  
bounced. If the author or someone else is maintaining this package or  
this function, would you kindly supplement the author's name and/or  
address with a current maintainer and/or provide a current email  
address?


[R] dendrogram again

2007-03-09 Thread bunny , lautloscrew.com
Hi all,

ok, i know i can cut a dendrogram, which i did.
all i get is three objects that are dendrograms themselves.

for example:
myd$upper, myd$lower[[1]], myd$lower[[2]]
and so on. of course i can plot them separately now.

but the lower parts still have hundreds of branches. i'd need a 30"
widescreen to see the whole picture.
what i'd like to do is group the lower branches, so that i get a
dendrogram with a few branches, splitting only in the upper levels.
In terms of the cluster analysis, i just want to have a few bigger
clusters.

thx,

m.

P.S.:
could putting parts of a cut dendrogram back together into one be an
idea? is it somehow possible?



Re: [R] Off topic:Spam on R-help increase?

2007-03-09 Thread Horacio Castellini
I thought it wasn't just happening to me... :)  hmm, did the spam filter fail?

 Folks:

 In the past 2 days I have seen a large increase of  spam getting into
 R-help. Are others experiencing this problem? If so, has there been some
 change to the spam filters on the R-servers? If not, is the problem on my
 end?

 Feel free to reply privately.

 Thanks.

 Bert Gunter
 Genentech Nonclinical Statistics
 South San Francisco, CA 94404
 650-467-7374





[R] Installing R on Ubuntu 6.10 via apt-get

2007-03-09 Thread Antonio Olinto
Hi

I'm using Linux Ubuntu 6.10 on a Pentium D 2.8.

Well, following http://cran.r-project.org/bin/linux/ubuntu/README

I wrote in the sources.list

# R
deb http://CRAN.R-project.org/bin/linux/ubuntu edgy/
deb http://www.vps.fmvz.usp.br/CRAN/bin/linux/ubuntu edgy/

But after typing apt-get update I got

Falha ao baixar 
http://www.vps.fmvz.usp.br/CRAN/bin/linux/ubuntu/edgy/Release Unable to 
find expected entry Packages in Meta-index file (malformed Release file?)
Falha ao baixar http://CRAN.R-project.org/bin/linux/ubuntu/edgy/Release 
Unable to find expected entry Packages in Meta-index file (malformed 
Release file?)
W: Conflicting distribution: http://www.vps.fmvz.usp.br edgy/ Release 
(expected edgy but got )
W: Conflicting distribution: http://CRAN.R-project.org edgy/ Release 
(expected edgy but got )

(PS. Falha ao baixar = Fail to download)

The key of Vincent Goulet seems to be OK.

Am I doing something wrong, or is there really a problem with the Release 
file?

Many thanks!

Antonio Olinto



Re: [R] Installing R on Ubuntu 6.10 via apt-get

2007-03-09 Thread Henrique Dallazuanna
Hi Antonio

Take a look at this: http://help.nceas.ucsb.edu/index.php/Installing_R_on_Ubuntu

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



Re: [R] Off topic:Spam on R-help increase?

2007-03-09 Thread Jim Lemon
Horacio Castellini wrote:
 I thought it wasn't just happening to me... :)  hmm, did the spam filter fail?
 
Yes, it is everyone who reads R-news, but I wonder, what is the word for 
spam in Spanish?

Jim



[R] Deconvolution of a spectrum

2007-03-09 Thread Lukasz Komsta

Dear useRs,

I have a curve which is a mixture of Gaussian curves (for example a UV
emission or absorption spectrum). Do you have any suggestions on how to
implement a search for the optimal set of Gaussian peaks to fit the curve?
I know that it is a very complex problem, but maybe it is possible
to do it? My first idea is to use nls() with very large functions
and compare AIC values, but it is very difficult to suggest any starting
points for the algorithm.

Searching Google I have found only descriptions of commercial software
for doing such deconvolution (Origin, PeakFit), without any information
about the algorithms used. No ready-to-use function in any language.

I have tried to use an Mclust workaround for this problem, by generating a
large dataset for which the spectrum is a histogram and feeding it into
Mclust. The results seem to be reasonable, but this is a very ugly and
imprecise method.

Thanks for any help,

Luke



Re: [R] Deconvolution of a spectrum

2007-03-09 Thread Joerg van den Hoff
I would try `nls'. we have used `nls' for fitting magnetic resonance spectra
consisting of =~ 10 gaussian peaks. this works OK, if the input data are
reasonable (not too noisy, peak amplitudes above noise level, peak distance
not unreasonably smaller than peak width, i.e peak overlap such that peaks are
still more or less identifiable visually). 
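
to give an idea of what such a fit looks like, here is a small sketch on simulated data with two well separated gaussian peaks. everything in it is made up for illustration (the helper gauss(), the true parameters, the noise level, the seed), and the start values are simply picked by eye close to the truth.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# One gaussian peak: amplitude a, position m, width (sd) s.
gauss <- function(x, a, m, s) a * exp(-(x - m)^2 / (2 * s^2))

# Simulated "spectrum": two peaks plus a little noise.
x <- seq(0, 10, length.out = 200)
y <- gauss(x, 1.0, 3, 0.5) + gauss(x, 0.6, 6, 0.8) +
     rnorm(length(x), sd = 0.02)

# Fit the sum of two gaussians; the start values are rough guesses.
fit <- nls(
  y ~ gauss(x, a1, m1, s1) + gauss(x, a2, m2, s2),
  start = list(a1 = 0.8, m1 = 2.8, s1 = 0.6,
               a2 = 0.5, m2 = 6.3, s2 = 1.0)
)

est <- coef(fit)
print(round(est, 3))
```

with noisier data or strongly overlapping peaks the same call can fail to converge, which is why the start values matter so much.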

of course you must invest effort in getting the start values (automatically or
manually) right. if your data are good, you might get good start values for the
positions (the means of the gaussians) with an approach that was floating around
the r-help list in 11/2005, which I adopted as follows:


peaks <- function (series, span = 3, what = c("max", "min"), do.pad = TRUE, 
                   add.to.plot = FALSE, ...) 
{
    if ((span <- as.integer(span))%%2 != 1) 
        stop("'span' must be odd")
    if (!is.numeric(series)) 
        stop("`peaks' needs numeric input")
    what <- match.arg(what)
    ## accept a plain vector, a 2-row matrix or a 2-column matrix
    if (is.null(dim(series)) || min(dim(series)) == 1) {
        series <- as.numeric(series)
        x <- seq(along = series)
        y <- series
    }
    else if (nrow(series) == 2) {
        x <- series[1, ]
        y <- series[2, ]
    }
    else if (ncol(series) == 2) {
        x <- series[, 1]
        y <- series[, 2]
    }
    ## with span 1 every point is trivially an extremum
    if (span == 1) 
        return(list(x = x, y = y, pos = rep(TRUE, length(y)), 
            span = span, what = what, do.pad = do.pad))
    ## each row of z is a window of 'span' consecutive values
    if (what == "min") 
        z <- embed(-y, span)
    else z <- embed(y, span)
    s <- span%/%2
    s1 <- s + 1
    ## a peak is a window whose maximum sits in the middle column
    v <- max.col(z, "first") == s1
    if (do.pad) {
        pad <- rep(FALSE, s)
        v <- c(pad, v, pad)
        idx <- v
    }
    else idx <- c(rep(FALSE, s), v)
    val <- list(x = x[idx], y = y[idx], pos = v, span = span, 
        what = what, do.pad = do.pad)
    if (add.to.plot == TRUE) 
        points(val, ...)
    val
}

this looks for local maxima in the vector (y-values) or 2-dim array
(x/y-matrix) `series' in a neighborhood of each point defined by `span'. 
if you first plot your data and then call the above on the data with
'add.to.plot = TRUE', the results of the peak search are added to your plot (and
you can modify this plotting via the `...' argument).

maybe this works for your data to get the peak position estimates (and the
amplitudes in the next step) right. frequently the standard deviation
estimates can be set to some fixed value for any given experiment.

and of course distant parts of your spectrum won't have anything to do with
each other, so you can split up the fitting to help `nls' along a bit.

joerg



Re: [R] Is the gmodels package being maintained?

2007-03-09 Thread Peter Dalgaard
Michael Kubovy wrote:
 Dear r-helpers,

 I sent  a cc of a recent message about a problem with ci.lmer() in  
 the gmodels package to the author (Gregory R Warnes), and the message  
 bounced. If the author or someone else is maintaining this package or  
 this function, would you kindly supplement the author's name and/or  
 address with a current maintainer and/or provide a current email  
 address?
   
Haven't heard that Greg should be out of circulation. You might try the
address from his homepage:

[EMAIL PROTECTED]


-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] dendrogram again

2007-03-09 Thread Gavin Simpson

Again, perhaps I'm missing something, but if I understand you correctly
(again no example I can follow - what is myd and how did you create
it?), you only want to plot the upper part of the dendrogram and not the
lower branches. If so, then this /is/ on ?dendrogram and you /do/ use
cut() to do it ...:

'cut.dendrogram()' returns a list with components '$upper' and
 '$lower', the first is a truncated version of the original tree,
 also of class 'dendrogram', the latter a list with the branches
 obtained from cutting the tree, each a 'dendrogram'.

So to only show the pruned tree, you just plot $upper - it does say that
$upper is a dendrogram and that it is the truncated version of the
original tree - which is what I understand you to be asking for. This
example shows it in action - this is what I mean by a reproducible
example - (I'm using package vegan as I am familiar with this data set):

require(vegan) ## if FALSE, install it
data(varespec)

hc <- hclust(vegdist(varespec, "bray"), method = "ward")
hc <- as.dendrogram(hc)

## this is the full dendrogram - too many nodes, so prune
plot(hc)

## let's take four clusters and prune it back
hc.pruned <- cut(hc, h = 1) # can't specify k, so read the height off the first
                            # plot - cutting at h = 1 gives 4 clusters

# plot only the upper part of the tree showing only the 4 clusters
plot(hc.pruned$upper, center = TRUE)

Is this what you want? If not, using the example I provide above, tell
us exactly what you want to achieve.

HTH

G



[R] R and clinical studies

2007-03-09 Thread Delphine Fontaine
Does anyone know if for clinical studies the FDA would accept  
statistical analyses performed with R ?

Delphine Fontaine



Re: [R] color key of heatmap.2

2007-03-09 Thread James W. MacDonald
XinMeng wrote:
 Hi all:
 The color key of heatmap.2 is as follows if I use redgreen style:
 low level: red
 high level: green
 
 And what I want is:
 low level: green
 high level: red

?colorpanel

Best,

Jim


 
 How can I do it then?
 
 Thanks a lot for your help!
 
 My best!
 


-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623




Re: [R] Is the gmodels package being maintained?

2007-03-09 Thread Michael Kubovy
Hi,

Finding his email address was not immediate, but I finally did, and  
did bring the problem to Greg's attention @ rochester, and the  
message didn't bounce this time.

On Mar 9, 2007, at 7:43 AM, Peter Dalgaard wrote:

 Michael Kubovy wrote:
 Dear r-helpers,

 I sent  a cc of a recent message about a problem with ci.lmer() in
 the gmodels package to the author (Gregory R Warnes), and the message
 bounced. If the author or someone else is maintaining this package or
 this function, would you kindly supplement the author's name and/or
 address with a current maintainer and/or provide a current email
 address?

 Haven't heard that Greg should be out of circulation. You might try
 the address from his homepage:

 [EMAIL PROTECTED]
_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS:     P.O.Box 400400, Charlottesville, VA 22904-4400
Parcels:  Room 102, Gilmer Hall
          McCormick Road, Charlottesville, VA 22903
Office:   B011  +1-434-982-4729
Lab:      B019  +1-434-982-4751
Fax:      +1-434-982-4766
WWW:      http://www.people.virginia.edu/~mk9y/



Re: [R] Removing duplicated rows within a matrix, with missing data as wildcards

2007-03-09 Thread stacey thompson
Hi H.,

Your response has improved the clarity of my thinking.  Kind thanks.
Also, your use of seq_len() prompted me to update from R version 2.3.1
on this machine.

For your matrix

  x <- matrix(c(1, NA, 3, NA, 2, 3), ncol=3, byrow=TRUE)
  x
       [,1] [,2] [,3]
  [1,]    1   NA    3
  [2,]   NA    2    3

I would want to delete either x[1,] or x[2,] but not both.
Practically, your removeLooseDupRows(x)

removeLooseDupRows <- function(x)
{
    if (nrow(x) <= 1)
        return(x)
    ii <- do.call(order,
                  args=lapply(seq_len(ncol(x)),
                              function(col) x[ , col]))
    dup_index <- logical(nrow(x))
    i0 <- -1
    for (k in 1:length(ii)) {
        i <- ii[k]
        if (any(is.na(x[i, ]))) {
            if (i0 == -1)
                next
            if (any(x[i, ] != x[i0, ], na.rm=TRUE))
                next
            dup_index[i] <- TRUE
        } else {
            i0 <- i
        }
    }
    x[!dup_index, ]
}

should leave no such ambiguous cases for my data, as the nrow(x) are
very high with few NA in each x.  For example, a row of (1, 2, 3) is
very likely to exist in my data.

However, to find the row numbers of any remaining ambiguous matches,
should they exist, using example:

 x <- matrix(c(1, NA, 3, NA, 2, 3, 1, 3, 2, 2, 1, 3, 1, NA, 2, 2, 1, 3),
 ncol=3, byrow=TRUE)
 x
     [,1] [,2] [,3]
[1,]    1   NA    3
[2,]   NA    2    3
[3,]    1    3    2
[4,]    2    1    3
[5,]    1   NA    2
[6,]    2    1    3

after your suggested

 removeLooseDupRows(x)
     [,1] [,2] [,3]
[1,]    1   NA    3
[2,]   NA    2    3
[3,]    1    3    2
[4,]    2    1    3
[5,]    2    1    3

 q <- removeLooseDupRows(unique(x))
 q
     [,1] [,2] [,3]
[1,]    1   NA    3
[2,]   NA    2    3
[3,]    1    3    2
[4,]    2    1    3

I could

 # ambiguous matches in matrix form
 apply(q, 1, function(row1) apply(q, 1, function(row2) all(is.na(row1) |
 is.na(row2) | row1==row2)))

      [,1]  [,2]  [,3]  [,4]
[1,]  TRUE  TRUE FALSE FALSE
[2,]  TRUE  TRUE FALSE FALSE
[3,] FALSE FALSE  TRUE FALSE
[4,] FALSE FALSE FALSE  TRUE

 # indices of ambiguous matches
 m <- which(apply(q, 1, function(row1) apply(q, 1, function(row2)
 all(is.na(row1) | is.na(row2) | row1==row2))), arr=T)
 m
     row col
[1,]   1   1
[2,]   2   1
[3,]   1   2
[4,]   2   2
[5,]   3   3
[6,]   4   4

 # put in order and omit duplicates
 m2 <- unique(t(apply(m, 1, sort)))
 m2
     [,1] [,2]
[1,]    1    1
[2,]    1    2
[3,]    2    2
[4,]    3    3
[5,]    4    4

 # show the ambiguous matches (note the empty column index, so that rows are selected)
 m2[m2[,1]!=m2[,2], , drop=FALSE]
     [,1] [,2]
[1,]    1    2

...and proceed from there.

This solution came from another helpful R-help respondent to my
poorly-defined problem.
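
The pairwise wildcard comparison used above can also be wrapped in small functions. This is a minimal sketch rather than anyone's posted method: loose_match, ambiguous_pairs and m_example are hypothetical names, and the matrix is the four-row result shown earlier.

```r
# Two rows "loosely match" when every position is equal or NA in at least one row.
loose_match <- function(a, b) all(is.na(a) | is.na(b) | a == b)

# Return each unordered pair of distinct rows that loosely match
# (hypothetical helper; assumes the matrix has at least two rows).
ambiguous_pairs <- function(m) {
  pairs <- t(combn(nrow(m), 2))
  keep <- apply(pairs, 1, function(p) loose_match(m[p[1], ], m[p[2], ]))
  pairs[keep, , drop = FALSE]
}

m_example <- matrix(c( 1, NA, 3,
                      NA,  2, 3,
                       1,  3, 2,
                       2,  1, 3), ncol = 3, byrow = TRUE)
res <- ambiguous_pairs(m_example)
res
```

Working on combn() pairs avoids comparing a row with itself, so no post-filtering of the diagonal is needed.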

Appreciative thanks to everyone for your instructive help.

Cheers,
stacey

-- 
-stacey lee thompson-
Stagiaire post-doctorale
Institut de recherche en biologie végétale
Université de Montréal
4101 Sherbrooke Est
Montréal, Québec H1X 2B2 Canada
[EMAIL PROTECTED]



[R] autoload libraries at startup

2007-03-09 Thread toby909
Hi All

I was wondering if there is a way I can specify in R that it should load 
libraries automatically at startup, so that I do not have to manually issue the 
command.

Thanks Toby



Re: [R] autoload libraries at startup

2007-03-09 Thread john seers \(IFR\)

Hi 

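I do not know if this is the best way, but have a look at .Rprofile, a
text file that R looks for in the current working directory and then in
your home directory, and executes at startup (a site-wide Rprofile.site
in the etc directory of the R installation is read as well).
You could put library() commands in that.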

See ?Startup for more information.
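
As an illustration only, a profile file along these lines might do it. The package names are placeholders and startup_packages is a made-up variable; .First() is the hook described in ?Startup.

```r
# Sketch of lines that could go in ~/.Rprofile; the package names are placeholders.
startup_packages <- c("stats", "utils")

# .First() is run by R at the end of startup if it is defined in the profile.
.First <- function() {
  for (pkg in startup_packages) {
    # character.only = TRUE lets library() take the name from a variable
    suppressPackageStartupMessages(library(pkg, character.only = TRUE))
  }
}
```

Plain library() calls at the top level of the file would also work; wrapping them in .First() just delays the loading until the rest of the startup has finished.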

Regards


JS



 
---

John Seers
Institute of Food Research
Norwich Research Park
Colney
Norwich
NR4 7UA
 

tel +44 (0)1603 251497
fax +44 (0)1603 507723
e-mail [EMAIL PROTECTED] 
e-disclaimer at http://www.ifr.ac.uk/edisclaimer/ 
 
Web sites:

www.ifr.ac.uk   
www.foodandhealthnetwork.com




[R] understanding print.summary.lm and perhaps print/show in general

2007-03-09 Thread Paul Bailey
I'm trying to understand how R prints summary.lm objects and
trying to change it slightly for a summary function that
calculates standard errors using an alternative method.

I've found that I can modify a summary.lm object and then it
prints the modified way but I want to change a few things in
the print method that I think I might just be able to do. One
is that I want the coefficients table to print a different
header (other than Std. Error). I've tried changing the
column name of the summary$coef matrix and this works for
calls to printCoefmat but it still prints out Std. Error
when I pass  the summary.lm to the command line by itself. I
don't understand this behavior. When I do this (enter an
object on the command line by itself), does it then call the
print / show method associated with that object's class, in
this case, summary.lm? Below is some sample code to reproduce
the behavior I don't understand and a comment regarding the
result I don't understand.

Cheers,
Paul

#
lma <- lm(dist ~ speed, data=cars)
suma <- summary(lma)
colnames(suma$coef) <- c(LETTERS[1:4])
printCoefmat(suma$coef) # prints what I expect
suma
# the above is the print behavior my question regards:
# why does the coefficients matrix have in its header
# the usual Estimate Std. Error t value Pr(>|t|)?
# I expect A B C D as above in the call to printCoefmat



Re: [R] understanding print.summary.lm and perhaps print/show in general

2007-03-09 Thread Christos Hatzis
Paul,

Usually summary methods perform some computations if needed and then change
the class of the original object so that a print method can be called for
the new summary object.

In this case, this is done at the end of the summary.lm method:

...
if (!is.null(z$na.action)) 
    ans$na.action <- z$na.action
class(ans) <- "summary.lm"
^^
ans
}

So then print.summary.lm does all the job displaying the summary.lm object.
To see that function do

getAnywhere(print.summary.lm)

You can then modify that function as needed.

-Christos

Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com
 




[R] convert pixels into axis coordinates in R

2007-03-09 Thread P. Stencel
Dear R users,

I've two questions:

1) Does anybody have a clue how to convert pixels from a JPEG graphic
(e.g. something like a square of 100x100 px) into axis coordinate
values in R?

2) Is there any possibility to extend the locator function in such a way
that locator() outputs all coordinates from a plot at once, without
clicking on the graph?

Thanks for any hints.

Regards,
P. Stencel





Re: [R] Removing duplicated rows within a matrix, with missing data as wildcards

2007-03-09 Thread Dimitris Rizopoulos
you could also try something like the following:

x <- matrix(c(1, NA, 3, NA, 2, 3, 1, 3, 2, 2, 1, 3, 1, NA, 2, 2, 1,
3), ncol=3, byrow=TRUE)

wildcardVals <- 1:3 # possible wildcard values
ind <- complete.cases(x)
nc <- ncol(x)
nr <- nrow(x[ind, ])
nwld <- length(wildcardVals)
posb <- apply(x[!ind, , drop = FALSE], 1, function(y){
    out <- matrix(y, nwld, nc, by = TRUE)
    out[, is.na(y)] <- wildcardVals
    t(out)
})
posb <- matrix(c(posb), ncol = nc, by = TRUE)
keep.ind <- duplicated(rbind(x[ind, ], posb))
keep.ind[-(1:nr)] <- apply(matrix(keep.ind[-(1:nr)], nc = nwld, by = TRUE),
    1, function(x) if(any(x)) rep(TRUE, length(x)) else x)
out <- rbind(x[ind, ], matrix(rep(x[!ind, ], each = nwld), nc = nc))
unique(out[!keep.ind, ])


I hope it works ok.

Best,
Dimitris


Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
 http://www.student.kuleuven.be/~m0390867/dimitris.htm



[R] Right truncation data

2007-03-09 Thread SUBIRANA CACHINERO, ISAAC
Hi,
 
Does anybody know how to perform a Cox model analysis for right
truncated data? 
All my data are right truncated, since I only have patients who entered
a hospital for a particular disease, and I would like to model the
age at hospital entrance as the response variable.
 
 
Thanks in advance.
 
 
Isaac Subirana. [EMAIL PROTECTED]





Re: [R] Deconvolution of a spectrum

2007-03-09 Thread Earl F. Glynn
Lukasz Komsta [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 I have a curve which is a mixture of Gaussian curves (for example UV
 emission or absorption spectrum). Do you have any suggestions how to
 implement searching for optimal set of Gaussian peaks to fit the curve?
 I know that it is very complex problem, but maybe it is a possibility
 to do it? My first idea is to use nls() with very large functions
 and compare AIC values, but it is very difficult to suggest any starting
 points for the algorithm.

Perhaps these notes will be helpful if you don't have too much noise in your 
data:
http://research.stowers-institute.org/efg/R/Statistics/MixturesOfDistributions/index.htm
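
To make the nls() idea in the question concrete, here is a small sketch on simulated data. Everything in it is an assumption for illustration: the two-peak shape, the noise level, the starting values (read off by eye from where the peaks sit) and the names two_peaks and fit. Real spectra with overlapping peaks will be much less forgiving about starting values.

```r
set.seed(1)
x <- seq(0, 10, length.out = 200)

# Sum of two Gaussian-shaped peaks: amplitude a, centre m, width s for each.
two_peaks <- function(x, a1, m1, s1, a2, m2, s2) {
  a1 * exp(-(x - m1)^2 / (2 * s1^2)) + a2 * exp(-(x - m2)^2 / (2 * s2^2))
}

# Simulated "spectrum" with a little noise.
y <- two_peaks(x, 2, 3, 0.7, 1, 6.5, 1) + rnorm(length(x), sd = 0.02)

# Starting values chosen near the visible peak positions and heights.
fit <- nls(y ~ two_peaks(x, a1, m1, s1, a2, m2, s2),
           start = list(a1 = 1.5, m1 = 2.8, s1 = 1,
                        a2 = 1.2, m2 = 7,   s2 = 0.8))
round(coef(fit), 2)
AIC(fit)   # could be compared with fits using a different number of peaks
```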

efg

Earl F. Glynn
Stowers Institute for Medical Research



Re: [R] color key of heatmap.2

2007-03-09 Thread Scionforbai
mycolors <- rev(redgreen(length))

where length is the number of colours you want.
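
A sketch of the same reversal idea using only base R, in case gplots is not at hand. colorRampPalette() comes from grDevices; n_colours, low_to_high and high_to_low are names made up for this example, and the black mid-point is just one possible choice.

```r
# Build a green-to-red ramp with base R (grDevices), so no extra package is needed.
n_colours <- 75  # hypothetical number of colour steps
low_to_high <- colorRampPalette(c("green", "black", "red"))(n_colours)

# Reversing a palette swaps which end is "low" and which is "high".
high_to_low <- rev(low_to_high)

# With gplots installed, the vector could be passed as heatmap.2(x, col = low_to_high).
```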



[R] Matrix conversion question

2007-03-09 Thread Johannes Graumann
Hello,

Please help - I'm blanking on this ...

I have a matrix like this:

 [,1] [,2]
[1,]12
[2,]13
[3,]23

and would like to have a list of vectors, where a vector contains the
entries in a matrix row ...

Can somebody nudge me to the place I need to go?

Thanks, Joh



Re: [R] R and clinical studies

2007-03-09 Thread Soukup, Mat
Delphine,

Please see the following message posted a week ago:
http://comments.gmane.org/gmane.comp.lang.r.general/80175.

HTH,

-Mat 



Re: [R] Matrix conversion question

2007-03-09 Thread Christos Hatzis
Try

split(x, row(x)) 
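
Applied to the matrix from the question, the call might look like the sketch below. The variable name rows is made up; row(x) gives each cell its row index, so split() groups the cells row by row.

```r
x <- matrix(c(1, 2,
              1, 3,
              2, 3), ncol = 2, byrow = TRUE)

# One list element per matrix row, named "1", "2", "3".
rows <- split(x, row(x))
rows[["2"]]   # the second row as a plain vector
```

An equivalent without split() would be lapply(seq_len(nrow(x)), function(i) x[i, ]), which returns an unnamed list.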

-Christos



Re: [R] understanding print.summary.lm and perhaps print/show in general

2007-03-09 Thread Petr Klasterecky
Another solution is to look into the code of summary.lm a few lines 
above where the (dim)names are assigned. Based on this, you may try

lma <- lm(dist ~ speed, data=cars)
suma <- summary(lma)
colnames(suma$coef) <- c(LETTERS[1:4])
printCoefmat(suma$coef) # prints what I expect
suma

dimnames(suma$coefficients) <- list(rownames(suma$coefficients),
c(LETTERS[1:4]))
suma

You might also find reading the chapter on generic functions in the 
R-lang (R language definition) manual useful.
Petr
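
A short sketch of what seems to be going on with the abbreviated name, offered as an illustration rather than a full account of the print method: reading suma$coef partially matches the coefficients component, but assigning through $ needs the exact name and so creates a separate component called coef. The copy named modified is introduced here only for the demonstration.

```r
lma <- lm(dist ~ speed, data = cars)
suma <- summary(lma)

# Extraction with `$` partially matches: suma$coef reads suma$coefficients.
same_on_read <- identical(suma$coef, suma$coefficients)

# Assignment with `$<-` does not: this adds a NEW element named "coef"
# and leaves the "coefficients" element, which the print method uses, untouched.
modified <- suma
colnames(modified$coef) <- LETTERS[1:4]
names(modified)                      # now contains both "coefficients" and "coef"
colnames(modified$coefficients)      # still the original header

# Using the full name changes the header that is printed.
colnames(suma$coefficients) <- LETTERS[1:4]
```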


-- 
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic



[R] How to create a list that grows automatically

2007-03-09 Thread Young-Jin Lee
Dear R users

I would like to know if there is a way to create a list or an array (or
anything) which grows automatically as more elements are put into it. What I
want to find is something equivalent to an ArrayList object of Java
language. In Java, I can do the following thing:

// Java code
ArrayList myArray = new ArrayList();
myArray.add(object1);
myArray.add(object2);

// End of java code

Thanks in advance.

Young-Jin Lee



[R] lpSolve space problem in R 2.4.1 on Windows XP

2007-03-09 Thread Talbot Katz
Hi.

I am trying to use the linear optimizer from package lpSolve in R 2.4.1 on 
Windows XP (Version 5.1).  The problem I am trying to solve has 2843 
variables (2841 integer, 2 continuous) and 8524 constraints, and I have 2 Gb 
of memory.  After I load the input data into R, I have at most 1.5 Gb of 
memory available.  If I start the lp with significantly less memory 
available (say 1 Gb), I get an error message from R:

Error: cannot allocate vector of size 189459 Kb

If I close all my other windows and try to maximize the available memory to 
the full 1.5 Gb, I can watch the memory get filled up until only about 400 
Mb is left, at which point I get a Windows error message:

R for Windows GUI front-end has encountered a problem and needs to close.  
We are sorry for the inconvenience.

This behavior persists even when I relax the integer constraints, and 
eliminate the 2841 constraints that restrict the integer variables to values 
<= 1, so I'm just running a standard lp with 2843 variables and 5683 
constraints.

I have been able to get the full MIP formulation to work correctly on some 
very small problems (~10 variables and 25 constraints).

Here is the code for a working example:

library(lpSolve)
(v1=rev(1:8))
[1] 8 7 6 5 4 3 2 1
(csv1=cumsum(as.numeric(v1)))
[1]  8 15 21 26 30 33 35 36
(lencsv1=length(csv1))
[1] 8
(Nm1=lencsv1-1)
[1] 7
(Np1=lencsv1+1)
[1] 9
ngp=3
f.obj=c(1,1,rep(0,Nm1))
f.int=3:Np1
bin.con=cbind(rep(0,Nm1),rep(0,Nm1),diag(Nm1))
bin.dir=rep("<=",Nm1)
bin.rhs=rep(1,Nm1)
gp.con=c(0,0,rep(1,Nm1))
gp.dir="="
(gp.rhs=ngp-1)
[1] 2
ub.con=cbind(rep(-1,rep(Nm1)),rep(0,Nm1),!upper.tri(matrix(nrow=Nm1,ncol=Nm1)))
ub.dir=rep("<=",Nm1)
(ub.rhs=csv1[1:Nm1]*ngp/csv1[lencsv1])
[1] 0.667 1.250 1.750 2.167 2.500 2.750 2.917
lb.con=cbind(rep(0,Nm1),rep(1,rep(Nm1)),!upper.tri(matrix(nrow=Nm1,ncol=Nm1)))
lb.dir=rep(">=",Nm1)
lb.rhs=ub.rhs
f.con=rbind(bin.con,gp.con,ub.con,lb.con)
f.dir=c(bin.dir,gp.dir,ub.dir,lb.dir)
f.rhs=c(bin.rhs,gp.rhs,ub.rhs,lb.rhs)
lglp=lp("min",f.obj,f.con,f.dir,f.rhs,int.vec=f.int)
lglp$objval
[1] 0.917
lglp$solution
[1] 0.000 0.917 0.000 1.000 0.000 1.000 0.000
[8] 0.000 0.000


What this is doing is taking the points of v1 and dividing them into 
contiguous groups (the variable ngp is the number of groups) such that the 
sums of the v1 values are as close as possible to equal within the three 
groups.  So, for v1 = c(8,7,6,5,4,3,2,1), the groups c(8,7), c(6,5), 
c(4,3,2,1), with sums 15,11,10 is the best such split, and the solution 
vector shows that the splitting occurs after the second and fourth elements.
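
To read the grouping off the solution vector without rerunning the optimizer, something like the following sketch could be used. The split indicators are copied from elements 3 to 9 of the solution printed above; split_after, group_id and group_sums are names invented for this illustration.

```r
v1 <- 8:1

# Elements 3..9 of the reported solution: a 1 means "split after this element".
split_after <- c(0, 1, 0, 1, 0, 0, 0)

# A new group starts right after each split point.
group_id <- cumsum(c(1, split_after))   # 1 1 2 2 3 3 3 3

# Sum of the v1 values within each contiguous group.
group_sums <- tapply(v1, group_id, sum)
group_sums
```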


Anyway, I am wondering...  Are 3000 variables and 8500 constraints usually 
too much for lpSolve to handle in 1.5 Gb of memory?  Is there a possible bug 
(in R or in Windows) that leads to the Windows error when the memory falls 
below 400 Mb?  Is there a problem with my formulation that makes it unstable 
even after the integer constraints are removed?

Thanks!


--  TMK  --
212-460-5430 home
917-656-5351 cell



[R] Extracting the p of F statistics from lm

2007-03-09 Thread Cressoni, Massimo \(NIH/NHLBI\) [F]
I need to extract the p value from an ANOVA done with an lm model

fitting <- lm(var ~ group)
Sfitting <- summary(fitting)

Sfitting[10][1] gives the F value and the degrees of freedom but I am not able
to get the p value.
The function df should give a p value given an F but I am not
able to make it work.

I found only something about aov in the R help and I am not able
to make it work
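
One possible route, sketched under the assumption that the overall F test of the model is what is wanted: the fstatistic component of the summary holds the F value and both degrees of freedom, and pf() (the F distribution function, whereas df() is its density) turns them into an upper-tail probability. The model below uses the built-in cars data as a stand-in, and fstat and p_value are made-up names.

```r
fitting <- lm(dist ~ speed, data = cars)   # stand-in model; `cars` ships with R

# Named vector with elements "value", "numdf" and "dendf".
fstat <- summary(fitting)$fstatistic

# Upper-tail probability of the F distribution gives the p value.
p_value <- pf(fstat["value"], fstat["numdf"], fstat["dendf"],
              lower.tail = FALSE)
unname(p_value)
```

For a model with a single term, the same number appears in the "Pr(>F)" column of anova(fitting).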

Massimo Cressoni



Re: [R] Matrix conversion question

2007-03-09 Thread Johannes Graumann
Christos Hatzis wrote:

 Try
 
 split(x, row(x))

H! THE ELEGANCE! Thanks a lot!

Joh
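
For readers who missed the start of the thread, here is a tiny illustration of what that one-liner does. The matrix is made up for the example; split() treats x as a vector and groups its elements by row index.

```r
x <- matrix(1:6, nrow = 2)

# row(x) has the same shape as x and holds each cell's row number,
# so splitting on it yields one list element per matrix row
rows <- split(x, row(x))

rows[["1"]]   # the first row of x as a plain vector
rows[["2"]]   # the second row
```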



Re: [R] R GUI in Ubuntu?

2007-03-09 Thread Andy Weller
OK, so I did:
sudo R CMD javareconf

followed by the following in R as root:
install.packages("JGR",dep = TRUE)

which I think went OK because if I do:
library()

then JGR is listed. From the terminal:
Packages in library '/usr/local/lib/R/site-library':

JavaGD  Java Graphics Device
JGR JGR - Java Gui for R
rJava   Low-level R to Java interface

BUT, if I then run:
JGR()

then I get:
Error: could not find function "JGR"

I am confused...?!?

Thanks in advance, Andy

Dirk Eddelbuettel wrote:
 On Thu, Mar 08, 2007 at 07:05:15PM +0100, Andy Weller wrote:
 Dear all,

 I am very new to R and find the terminal-based UI a little daunting. 
 (That's probably the wrong thing to say!) Having searched the Packages 
 it seems that I can have either a Gnome-based or Java-based GUI for my 
 Ubuntu machine. However, I can get neither to work.

 Having run R as root, I then run the following command:
 install.packages("gnomeGUI", dependencies=TRUE)

 The output of which is:
 checking for gnomeConf.sh file in /usr/local/lib... not found
 configure: error: conditional HAVE_ORBIT was never defined.
 Usually this means the macro was only invoked conditionally.
 ERROR: configuration failed for package 'gnomeGUI'
 * Removing '/usr/local/lib/R/site-library/gnomeGUI'

 I have checked to see if I have all dependencies installed - it seems as 
 though I have. No luck! So I try the Java-based GUI with:
 install.packages("JGR",dep=TRUE)
 library(JGR)
 JGR()

 No luck. So, out of R I try:
 sudo R CMD javareconf
 
 I think you are close. Do the JGR install _after_ the javareconf as it
 needs the correct values.
 
 Also make sure you use the Sun Java packages you get for Ubuntu.
 
 Hope this helps, Dirk
 
 Then in R, if I check the library with:
 library(JGR)

 I get:
 Error: .onLoad failed in 'loadNamespace' for 'rJava'
 Error: package 'rJava' could not be loaded

 HMMmmm - still no joy! I guess I am missing something very basic here?!

 Thanks in advance, Andy



Re: [R] Extracting the p of F statistics from lm

2007-03-09 Thread Giovanni Petris

 I need to extract the p value from an ANOVA done with an lm model:
 
 fitting <- lm(var ~ group)
 Sfitting <- summary(fitting)
 
 Sfitting[10][1] gives the F value and the degrees of freedom, but I am not 
 able to get the p value. 
 The function df should give a p value given an F, but I am not 
 able to make it work.

The function pf should.
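
To spell that out: df is the density of the F distribution, while pf is its distribution function, so the upper tail of pf gives the p value. A minimal sketch on simulated data (the data and the names fstat and p_value are made up here; only lm, summary and pf come from the thread):

```r
set.seed(1)
group <- gl(3, 10)                       # three groups of ten observations
var   <- rnorm(30) + as.numeric(group)   # toy response

fitting  <- lm(var ~ group)
Sfitting <- summary(fitting)

# Named vector with elements "value", "numdf" and "dendf"
fstat <- Sfitting$fstatistic

# Upper-tail probability of the observed F statistic
p_value <- unname(pf(fstat["value"], fstat["numdf"], fstat["dendf"],
                     lower.tail = FALSE))
p_value
```

With a single factor on the right-hand side this should agree with the Pr(>F) entry that anova(fitting) reports for group.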

 
 

-- 

Giovanni Petris  [EMAIL PROTECTED]
Department of Mathematical Sciences
University of Arkansas - Fayetteville, AR 72701
Ph: (479) 575-6324, 575-8630 (fax)
http://definetti.uark.edu/~gpetris/



Re: [R] lpSolve space problem in R 2.4.1 on Windows XP

2007-03-09 Thread Uwe Ligges
If R is closed that way (i.e. crashes), it is a bug by definition: 
either in R or (more probably) in the package. Could you please contact 
the package maintainer to sort things out?

Thanks,
Uwe Ligges





Talbot Katz wrote:
 Hi.
 
 I am trying to use the linear optimizer from package lpSolve in R 2.4.1 on 
 Windows XP (Version 5.1).  The problem I am trying to solve has 2843 
 variables (2841 integer, 2 continuous) and 8524 constraints, and I have 2 Gb 
 of memory.  After I load the input data into R, I have at most 1.5 Gb of 
 memory available.  If I start the lp with significantly less memory 
 available (say 1 Gb), I get an error message from R:
 
 Error: cannot allocate vector of size 189459 Kb
 
 If I close all my other windows and try to maximize the available memory to 
 the full 1.5 Gb, I can watch the memory get filled up until only about 400 
 Mb is left, at which point I get a Windows error message:
 
 R for Windows GUI front-end has encountered a problem and needs to close.  
 We are sorry for the inconvenience.
 
 This behavior persists even when I relax the integer constraints, and 
 eliminate the 2841 constraints that restrict the integer variables to values 
 <= 1, so I'm just running a standard lp with 2843 variables and 5683 
 constraints.
 
 I have been able to get the full MIP formulation to work correctly on some 
 very small problems (~10 variables and 25 constraints).
 
 Here is the code for a working example:
 
 library(lpSolve)
 (v1=rev(1:8))
 [1] 8 7 6 5 4 3 2 1
 (csv1=cumsum(as.numeric(v1)))
 [1]  8 15 21 26 30 33 35 36
 (lencsv1=length(csv1))
 [1] 8
 (Nm1=lencsv1-1)
 [1] 7
 (Np1=lencsv1+1)
 [1] 9
 ngp=3
 f.obj=c(1,1,rep(0,Nm1))
 f.int=3:Np1
 bin.con=cbind(rep(0,Nm1),rep(0,Nm1),diag(Nm1))
 bin.dir=rep("<=",Nm1)
 bin.rhs=rep(1,Nm1)
 gp.con=c(0,0,rep(1,Nm1))
 gp.dir="="
 (gp.rhs=ngp-1)
 [1] 2
 ub.con=cbind(rep(-1,rep(Nm1)),rep(0,Nm1),!upper.tri(matrix(nrow=Nm1,ncol=Nm1)))
 ub.dir=rep("<=",Nm1)
 (ub.rhs=csv1[1:Nm1]*ngp/csv1[lencsv1])
 [1] 0.667 1.250 1.750 2.167 2.500 2.750 2.917
 lb.con=cbind(rep(0,Nm1),rep(1,rep(Nm1)),!upper.tri(matrix(nrow=Nm1,ncol=Nm1)))
 lb.dir=rep(">=",Nm1)
 lb.rhs=ub.rhs
 f.con=rbind(bin.con,gp.con,ub.con,lb.con)
 f.dir=c(bin.dir,gp.dir,ub.dir,lb.dir)
 f.rhs=c(bin.rhs,gp.rhs,ub.rhs,lb.rhs)
 lglp=lp("min",f.obj,f.con,f.dir,f.rhs,int.vec=f.int)
 lglp$objval
 [1] 0.917
 lglp$solution
 [1] 0.000 0.917 0.000 1.000 0.000 1.000 0.000
 [8] 0.000 0.000
 
 What this is doing is taking the points of v1 and dividing them into 
 contiguous groups (the variable ngp is the number of groups) such that the 
 sums of the v1 values are as close as possible to equal within the three 
 groups.  So, for v1 = c(8,7,6,5,4,3,2,1), the groups c(8,7), c(6,5), 
 c(4,3,2,1), with sums 15,11,10 is the best such split, and the solution 
 vector shows that the splitting occurs after the second and fourth elements.
 
 
 Anyway, I am wondering...  Are 3000 variables and 8500 constraints usually 
 too much for lpSolve to handle in 1.5 Gb of memory?  Is there a possible bug 
 (in R or in Windows) that leads to the Windows error when the memory falls 
 below 400 Mb?  Is there a problem with my formulation that makes it unstable 
 even after the integer constraints are removed?
 
 Thanks!
 
 
 --  TMK  --
 212-460-5430  home
 917-656-5351  cell
 


[R] time demean model matrix

2007-03-09 Thread Doran, Harold
Suppose I have longitudinal data and want to use the econometric strategy of 
de-meaning a model matrix by time. For sake of illustration 'mat' is a model 
matrix for 3 individuals each with 3 observations where ``1'' denotes that 
individual i was in group j at time t or ``0'' otherwise.
 
mat <- matrix(c(1,1,0,0,0,0,0,0,1,0,0,0,1,1,1,0,0,0,0,0,1,0,0,0,1,1,0), ncol=3)
mat <- data.frame(mat, id=gl(3,3))
 
I can conceive of two ways of de-meaning: either use an explicit loop or use 
mapply, both of which are below.
 
# put this in a loop over each column to create the de-meaned X matrix
mat2 <- matrix(0, 9,3)
for(i in 1:3){
 mat2[,i] <- mat[,i] - ave(mat[,i], mat$id)
}
# Or use mapply as follows (subtracting the per-id means from each column)
mat[,1:3] - mapply(ave, mat[,1:3], MoreArgs=list(mat$id))
 
Both work, but they require that the model matrix is explicitly created and then 
used in the regression. For example, assume I am using the star data in the 
mlmRev package
 
data(star, package='mlmRev')
 
I would first need to explicitly create the model matrix for the fixed effects 
as follows and then use the strategy above to de-mean this matrix.
 
mat <- model.matrix(lm(math ~ -1 + sch, star))
 
Of course in R, this is rather inefficient as one generally only needs to have 
a factor for any independent variables and the model matrix is created for you 
when using lm(). So, my question is whether there is a more efficient way of 
creating the time de-meaned model matrix? Or, is the solution above the kind of 
strategy that must be used for this situation?
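
As a point of comparison, the loop can be folded into a single apply() call. This is just a sketch on the toy matrix from above, not a claim that it avoids building the model matrix; demean, X, id and Xd are names invented for the illustration.

```r
# Toy model matrix: 3 individuals, 3 observations each
X <- matrix(c(1,1,0,0,0,0,0,0,1,
              0,0,0,1,1,1,0,0,0,
              0,0,1,0,0,0,1,1,0), ncol = 3)
id <- gl(3, 3)

# Subtract each individual's mean from every column of X
demean <- function(X, id) {
  apply(X, 2, function(col) col - ave(col, id))
}

Xd <- demean(X, id)

# Per-individual column sums of the de-meaned matrix should all be zero
rowsum(Xd, id)
```

For a real model one could pass model.matrix(~ -1 + sch, star) straight to such a helper, which at least skips the intermediate lm() fit.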
 
Harold
 
 



Re: [R] R GUI in Ubuntu?

2007-03-09 Thread Andy Weller
I should also add that:
library(JGR)

gives me the following output:
Loading required package: rJava
Error in dyn.load(x, as.logical(local), as.logical(now)) :
 unable to load shared library 
'/usr/local/lib/R/site-library/rJava/libs/rJava.so':
   /usr/local/lib/R/site-library/rJava/libs/rJava.so: undefined symbol: 
JNI_GetCreatedJavaVMs
Error: .onLoad failed in 'loadNamespace' for 'rJava'
Error: package 'rJava' could not be loaded

I have Sun's Java installed and thought rJava installed without problems...

Thanks, Andy



Re: [R] How to create a list that grows automatically

2007-03-09 Thread Alberto Monteiro
Young-Jin Lee asked:
 
 I would like to know if there is a way to create a list or an array (or
 anything) which grows automatically as more elements are put into 
 it. 

???

I think this is the default behaviour of R arrays:

x <- vector(length=0)  # create a vector of zero length
x[1] <- 2
x[10] <- 3
x[length(x) + 1] <- 4
x # 2 NA NA ... 3 4


 What I want to find is something equivalent to an ArrayList 
 object of Java language. In Java, I can do the following thing:
 
 // Java code
 ArrayList myArray = new ArrayList();
 myArray.add(object1);
 myArray.add(object2);
 
 // End of java code
 
myArray <- vector(length=0)
myArray <- c(myArray, object1)
myArray <- c(myArray, object2)
myArray # array with 2 strings

Alberto Monteiro



Re: [R] Off topic:Spam on R-help increase?

2007-03-09 Thread Martin Maechler
 DB == Douglas Bates [EMAIL PROTECTED]
 on Tue, 6 Mar 2007 11:57:28 -0600 writes:

DB On 3/6/07, Bert Gunter [EMAIL PROTECTED] wrote:
 In the past 2 days I have seen a large increase of  spam getting into
 R-help. Are others experiencing this problem? If so, has there been some
 change to the spam filters on the R-servers? If not, is the problem on my
 end?

DB There has indeed been an increase in the amount of spam making it
DB through to the list.  We apologize for the inconvenience.  Regrettably
DB we will not be able to do much about it until the beginning of next
DB week.

DB Martin Maechler is on vacation at present and I am administering the
DB lists until he returns.  Most of the time this works even though the
DB mail servers are in Zurich Switzerland and I am in Madison, WI, USA.
DB However, in the last two days we have had a surge in spam and quite a
DB bit of it is getting through the filters.

DB The filters are catching some of the spam.  I think the main
DB difference in the last two days has been that the level of spam to the
DB lists has increased but it could be that something has happened to the
DB filters too.

I've been back today, well relaxed and tanned from the nice
vacation; thanks to all of you for taking such an interest in it:-) ;-) 

With a work back-log of almost 4 weeks, I hadn't dared to look
into my R-lists inbox of > 2400 messages until about an hour ago.

Fortunately it's not the spammers that would have become
smarter (well they have or their hired geeks, but already a few
months ago, not just now).
The main problem has ``just'' been disk-server, then network and
file mount problems on the mail server that were unfortunately
not seen at first by our IT staff. 
As a consequence, there had also been enormous (> 24 hours)
delays in mail delivery, maybe less visible on the mailing list
side of it.
As far as I can see/guess now, the spam problem should have
lasted only about one to two days --- too long of course for
you, but at least not till I had returned to work.

Yes indeed, we are sorry for this, but no, we cannot promise it
won't happen again :-\

Martin

DB All the lists except R-help only allow postings from subscribers so
DB there should be very little spam on the other lists.

DB This subscriber-only policy can be difficult for people like me who
DB receive email at one address but send it from another.  Either the
DB sender must remember to use the account that is registered for the
DB list or the list administrator must manually approve the posting.
DB Even worse, such a policy dissuades new useRs from posting because
DB they get a response that their message has been held pending manual
DB approval by the administrator.  Sometimes they react by reposting the
DB message, then re-reposting, then ...

DB We have avoided instituting such a policy on R-help because of the
DB level of administrative work that will be involved and our desire not
DB to dissuade new useRs from posting to the list.

DB However, if this keeps up we may need to reconsider.

DB I would ask for the list subscribers to bear with us until Martin
DB returns and can check on whether something has gone wrong with the
DB filters.



[R] Duplicate rows of matrix

2007-03-09 Thread Bruno C\.
Hello my problem is the following:

I have a matrix A and a vector B which has as many elements as A has rows.

I need to build a matrix C which contains B[i] copies of the row A[i,], for 
each row of A.

If, for example, A is

     [,1] [,2]
[1,]  8.0  9.4
[2,]  4.2  1.1

and B is (3,1), then C will be:

     [,1] [,2]
[1,]  8.0  9.4
[2,]  8.0  9.4
[3,]  8.0  9.4
[4,]  4.2  1.1


I have some working code which goes through all the rows of A and, for each row, 
does a rbind(C, A[i,]) B[i] times. 
But this is quite time consuming, given that each rbind builds a new matrix. 
Is there any faster way?
I can think of some minor improvements, like building a matrix C of zeros 
with as many columns as A and as many rows as the sum of the elements of B, 
and then filling it.

But I was looking for an already implemented function/package. Is there 
any?

Thanx





[R] GLM: order of terms in model

2007-03-09 Thread Christian Landry
Dear R-helpers,

I have been analysing data using a GLM. My model is as follows:

mod <- glm(V ~ T + as.factor(A) + N, family=gaussian)

and I am using

anova(mod, test="F")

to get the analysis of deviance table and the fraction of deviance 
explained by each term.

T and A dominate with respect to their deviance, with T having a 
larger effect than A (about twice).

However, if I reverse T and A in the model, I get that A now explains 
more deviance than T.

My questions are:
1) What is this due to?
2) Is there any way around this? How do I find which model is best, 
and/or can I use another method that won't be sensitive to the 
order of the terms?

Thanks,

Christian

Reply to: [EMAIL PROTECTED]



Re: [R] Duplicate rows of matrix

2007-03-09 Thread Christos Hatzis
Try this:

a <- matrix(c(8, 4.2, 9.4, 1.1),2) 
b <- c(3,1)

a[rep(1:nrow(a), b), ] 

-Christos
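
A slightly more defensive variant of the same indexing idea, as a sketch: seq_len() copes with a zero-row matrix and drop = FALSE keeps the result a matrix even if only one row survives. The names A, B and C follow the original question.

```r
A <- matrix(c(8, 4.2, 9.4, 1.1), nrow = 2)
B <- c(3, 1)

# Repeat row index i exactly B[i] times, then pull those rows in one step
idx <- rep(seq_len(nrow(A)), times = B)
C   <- A[idx, , drop = FALSE]

idx   # which row of A each row of C comes from
C
```

Because the whole result is allocated once, this avoids the repeated copying that makes the rbind loop slow.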

Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com
 




Re: [R] How to create a list that grows automatically

2007-03-09 Thread bogdan romocea
This is a bad idea as it can greatly slow things down (the details
were discussed several times on this list). What you want to do is
define from the start the length of your vector/list, then grow it (by
a large margin) only if it becomes full.
lst <- vector(mode="list", length=100000)  #assuming 100k nodes are enough
#populate the list, then remove the unused nodes if you care to
lst <- lst[sapply(lst, function(x) {!is.null(x)})]
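
The "grow by a large margin only when full" part can be wrapped in a small helper. What follows is a sketch only; make_growable and its add/get functions are invented for this example and are not part of base R. It doubles the capacity whenever the buffer is full, roughly what Java's ArrayList does internally.

```r
make_growable <- function(capacity = 16) {
  items <- vector("list", capacity)   # pre-allocated storage
  n <- 0                              # number of slots actually used

  add <- function(x) {
    if (n == length(items)) {
      # Buffer is full: double it in one step instead of growing by one
      items <<- c(items, vector("list", length(items)))
    }
    n <<- n + 1
    items[[n]] <<- x   # note: storing NULL this way would drop the slot
    invisible(n)
  }

  get <- function() items[seq_len(n)]  # only the filled part

  list(add = add, get = get)
}

al <- make_growable(2)
for (i in 1:5) al$add(i * 10)
length(al$get())
```

The caller never sees the unused slots, so the sapply() clean-up step is not needed with this layout.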


 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Young-Jin Lee
 Sent: Friday, March 09, 2007 11:08 AM
 To: r-help
 Subject: [R] How to create a list that grows automatically

 Dear R users

 I would like to know if there is a way to create a list or an
 array (or
 anything) which grows automatically as more elements are put
 into it. What I
 want to find is something equivalent to an ArrayList object of Java
 language. In Java, I can do the following thing:

 // Java code
 ArrayList myArray = new ArrayList();
 myArray.add(object1);
 myArray.add(object2);
 
 // End of java code

 Thanks in advance.

 Young-Jin Lee



Re: [R] Duplicate rows of matrix

2007-03-09 Thread Sebastian P. Luque
On Fri,  9 Mar 2007 18:17:04 +0100,
Bruno C\. [EMAIL PROTECTED] wrote:

 Hello my problem is the following: I have a matrix A and a vector B
 which contains as many rows as A.

 I need to build a matrix C which contains B[i]-times the row A[i,] and
 this for each line of A.

How about:


C <- A[rep(seq(nrow(A)), B), ]


Completely untested, because you didn't provide example code.


-- 
Seb



Re: [R] lpSolve space problem in R 2.4.1 on Windows XP

2007-03-09 Thread Talbot Katz
Hello Sam Buttrey.

Uwe Ligges from the r-help list asked me to forward this message to the 
maintainer of the lpSolve package, because R 2.4.1 is crashing when I run 
lp.  I saw your name listed in the lpSolve help file.  If you need more 
detail, please let me know.  Thanks!

--  TMK  --
212-460-5430home
917-656-5351cell





Re: [R] GLM: order of terms in model

2007-03-09 Thread Thomas Lumley

This is a FAQ

7.18 Why does the output from anova() depend on the order of factors in 
the model?
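
In short, anova() on a single fit tests terms sequentially, each one adjusted only for the terms listed before it, so correlated predictors trade deviance when the order changes. A small simulated illustration (all data and names below are made up; Tv stands in for T to avoid masking the TRUE shorthand), with drop1() shown as one order-independent alternative:

```r
set.seed(42)
n  <- 100
Tv <- rnorm(n)
A  <- cut(Tv + rnorm(n), 3)               # factor deliberately correlated with Tv
V  <- 2 * Tv + as.numeric(A) + rnorm(n)

m1 <- glm(V ~ Tv + A, family = gaussian)  # Tv entered first
m2 <- glm(V ~ A + Tv, family = gaussian)  # Tv entered last

# Sequential deviance attributed to Tv depends on its position
d1 <- anova(m1, test = "F")["Tv", "Deviance"]
d2 <- anova(m2, test = "F")["Tv", "Deviance"]
c(first = d1, last = d2)

# drop1() tests each term given all the others, so order does not matter
drop1(m1, test = "F")
drop1(m2, test = "F")
```

Which of the two kinds of test is appropriate depends on the question being asked; the sketch only shows why the tables differ.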

-thomas



Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle



Re: [R] Reformulated matrices dimensions limitation problem

2007-03-09 Thread Maciej Radziejewski
Have a look at the help page for memory.size and memory.limit. The help says
you can use these functions on Windows and another approach with Unix. Once
you know the available memory you can calculate the total matrix size that
fits in it (knowing that a real number takes 8 bytes). I would recommend
using up to 70-80% of the available memory for your matrix.
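
The arithmetic can be sketched as follows. The 1.5 Gb figure is a hypothetical placeholder; on a real system it would come from whatever memory query the platform offers. The names avail_bytes, budget, max_cells and max_side are made up for the example.

```r
avail_bytes <- 1.5 * 2^30        # hypothetical: 1.5 Gb currently free
budget      <- 0.75 * avail_bytes  # stay within the 70-80% guideline

# A double takes 8 bytes, so this many matrix cells fit in the budget
max_cells <- floor(budget / 8)

# Largest square numeric matrix under that budget
max_side <- floor(sqrt(max_cells))
max_side

# Sanity check of the 8-bytes-per-number rule on a small vector
sz <- as.numeric(object.size(numeric(1e6)))
sz / 1e6   # bytes per element, slightly above 8 because of the object header
```

Keep in mind that many operations copy their arguments, so the working set during a computation can be a multiple of the matrix size itself.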

Maciej.



[R] use nnet

2007-03-09 Thread Aimin Yan
I want to adjust weight decay and number of hidden units for nnet by 
a loop like
for (decay)
{
  for (number of units)
  {
    for (#run)
    {
      model <- nnet()
      test.error <-
    }
  }
}

for example:
I set decay=0.1, size=3, maxit=200, for this set I run 10 times, and 
calculate test error

after that I want to get a matrix like this

decay  size  maxit  #run  test_error
0.1    3     200    1     1.2
0.1    3     200    2     1.1
0.1    3     200    3     1.0
0.1    3     200    4     3.4
0.1    3     200    5     ..
0.1    3     200    6     ..
0.1    3     200    7     ..
0.1    3     200    8     ..
0.1    3     200    9     ..
0.1    3     200    10    ..
0.2    3     200    1     1.2
0.2    3     200    2     1.1
0.2    3     200    3     1.0
0.2    3     200    4     3.4
0.2    3     200    5     ..
0.2    3     200    6     ..
0.2    3     200    7     ..
0.2    3     200    8     ..
0.2    3     200    9     ..
0.2    3     200    10    ..

I am not sure if this is the correct way to do this.
Has anyone tuned these parameters like this before?
thanks,

Aimin



Re: [R] use nnet

2007-03-09 Thread Wensui Liu
AM,
I have a piece of junk code on my blog. Here it is.
#################################################
# USE CROSS-VALIDATION TO DO A GRID-SEARCH FOR  #
# THE OPTIMAL SETTINGS (WEIGHT DECAY AND NUMBER #
# OF HIDDEN UNITS) OF NEURAL NETS               #
#################################################

library(nnet);
library(MASS);
data(Boston);
X <- I(as.matrix(Boston[-14]));
# STANDARDIZE PREDICTORS
st.X <- scale(X);
Y <- I(as.matrix(Boston[14]));
boston <- data.frame(X = st.X, Y);

# DIVIDE DATA INTO TESTING AND TRAINING SETS
set.seed(2005);
test.rows <- sample(1:nrow(boston), 100);
test.set <- boston[test.rows, ];
train.set <- boston[-test.rows, ];

# INITIATE A NULL TABLE
sse.table <- NULL;

# SEARCH FOR OPTIMAL WEIGHT DECAY
# RANGE OF WEIGHT DECAYS SUGGESTED BY B. RIPLEY
for (w in c(0.0001, 0.001, 0.01))
{
  # SEARCH FOR OPTIMAL NUMBER OF HIDDEN UNITS
  for (n in 1:10)
  {
    # INITIATE A NULL VECTOR
    sse <- NULL;
    # FOR EACH SETTING, RUN NEURAL NET MULTIPLE TIMES
    for (i in 1:10)
    {
      # INITIATE THE RANDOM STATE FOR EACH NET
      set.seed(i);
      # TRAIN NEURAL NETS
      net <- nnet(Y~X, size = n, data = train.set, rang = 0.1,
                  linout = TRUE, maxit = 1, decay = w,
                  skip = FALSE, trace = FALSE);
      # CALCULATE SSE FOR TESTING SET
      test.sse <- sum((test.set$Y - predict(net, test.set))^2);
      # APPEND EACH SSE TO A VECTOR
      if (i == 1) sse <- test.sse else sse <- rbind(sse, test.sse);
    }
    # APPEND AVERAGED SSE WITH RELATED PARAMETERS TO A TABLE
    sse.table <- rbind(sse.table, c(WT = w, UNIT = n, SSE = mean(sse)));
  }
}
# PRINT OUT THE RESULT
print(sse.table);

http://statcompute.spaces.live.com/Blog/cns!39C8032DBD1321B7!290.entry
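
The original question asked for one row per run rather than an average per setting. A minimal sketch of that variant follows. It is not part of the posted code: the simulated data set `dat` and the names `grid` and `results` are made up for illustration, and the decay, size and maxit values are the ones from the question.

```r
library(nnet)

# Simulated regression data standing in for a real train/test split (assumption)
set.seed(1)
dat <- data.frame(x1 = rnorm(120), x2 = rnorm(120))
dat$y <- dat$x1 - 2 * dat$x2 + rnorm(120, sd = 0.1)
train.set <- dat[1:80, ]
test.set  <- dat[81:120, ]

# One row per (decay, size, run) combination; run varies fastest
grid <- expand.grid(run = 1:10, size = 3, decay = c(0.1, 0.2), maxit = 200)
grid$test_error <- NA_real_

for (k in seq_len(nrow(grid))) {
  set.seed(grid$run[k])          # a different random start for each run
  net <- nnet(y ~ x1 + x2, data = train.set, size = grid$size[k],
              decay = grid$decay[k], maxit = grid$maxit[k],
              linout = TRUE, trace = FALSE)
  # sum of squared errors on the held-out rows
  grid$test_error[k] <- sum((test.set$y - predict(net, test.set))^2)
}

# Reorder the columns to match the layout asked for in the question
results <- grid[, c("decay", "size", "maxit", "run", "test_error")]
head(results)
```

Keeping every run in the table makes it possible to look at the spread across random starts as well as the mean.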


-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



Re: [R] use nnet

2007-03-09 Thread Wensui Liu
AM,
Sorry, please ignore the top box in the code. It is not actually
cross-validation but just a simple split-sample validation.
Sorry for the confusion.
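
For contrast with a single split, a k-fold cross-validation assigns every row to exactly one held-out fold. A minimal sketch of the fold bookkeeping only, assuming k = 5 and a placeholder row count n; the model fitting is left as a comment:

```r
k <- 5
n <- 20                                    # number of rows in the data (placeholder)
set.seed(2005)
fold <- sample(rep(1:k, length.out = n))   # random fold label for each row

for (f in 1:k) {
  test.rows  <- which(fold == f)           # held-out fold
  train.rows <- which(fold != f)           # remaining k - 1 folds
  # fit on train.rows and evaluate on test.rows here
}
table(fold)
```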

-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



[R] Applying some equations over all unique combinations of 4 variables

2007-03-09 Thread John Kane
# I have a data set that looks like this.  It is a bit more complicated,
# actually, with three factor levels, but these calculations need to be
# done on one factor at a time.
# I then have a set of different rates that are applied to it.

# dataset
cata <- c(1, 1, 6, 1, 1, 2)
catb <- c(1, 2, 3, 4, 5, 6)
doga <- c(3, 5, 3, 6, 4, 0)

data1 <- data.frame(cata, catb, doga)
rm(cata, catb, doga)
data1

# start rates
# names for lists
fnams <- c("af", "pf", "cf", "mf")
mnams <- c("am", "pm", "cm", "mm")

# Current layout of the rate data frames
alphahill <- list(af <- c("a1", "a2", "a3"), pf <- c("d1", "d2", "d3"),
                  cf <- c("f1", "f2"), mf <- c("h1", "h2"))
names(alphahill) <- fnams

betahill <- list(am <- c("b1", "b2", "b3"), pm <- c("e1", "e2", "e3"),
                 cm <- c("g1", "g2"), mm <- c("j1", "j2"))
names(betahill) <- mnams

hilltop <- list(af <- data.frame(a1 <- 1:4, a2 <- 2:5, a3 <- 3:6),
                pf <- data.frame(d1 <- 4:1, d2 <- 5:2, d3 <- 6:3),
                cf <- data.frame(f1 <- 1:4, f2 <- 3:6),
                mf <- data.frame(h1 <- 1:4, h2 <- 2:5))

hilldown <- list(am <- data.frame(b1 <- 4:1, b2 <- 5:2, b3 <- 6:3),
                 pm <- data.frame(e1 <- 5:1, e2 <- 1:5, e3 <- 6:2),
                 cm <- data.frame(g1 <- 5:1, g2 <- 1:5),
                 mm <- data.frame(j1 <- 1:4, j2 <- 5:2))
names(hilltop) <- fnams
names(hilldown) <- mnams
for (i in 1:4) {
  names(hilltop[[i]]) <- alphahill[[i]]
  names(hilldown[[i]]) <- betahill[[i]]
}

rm(a1, a2, a3, b1, b2, b3, d1, d2, d3, e1, e2, e3, f1, f2, g1, g2, h1, h2,
   j1, j2, fnams, mnams, af, am, cf, cm, mf, mm, pf, pm)
# Now that's out of the way

# Assuming I am reading this problem correctly, I should have
# 648 possible combinations for each row of data, that is,
# unique combinations where I need
#    (af*am) * (pf*pm) * (cf*cm) * mf * mm
# ie (3*3)   * (3*3)   * (2*2)   * (2*2)
# based on the idea that there are 9 unique combinations for af & am
# and so on.

# af   am
# 1   a1   b1
# 2   a2   b1
# 3   a3   b1
# 4   a1   b2
# 5   a2   b2
# 6   a3   b2
# 7   a1   b3
# 8   a2   b3
# 9   a3   b3

# I have a set of equations of the form:

# P1 <- af*cata + pf*catb^cf + mf*doga
# S1 <- am*cata + pm*catb^cm + mm*doga

# Is there any way that I can do something like this and keep track of
# which condition is which, since I need to be able to sum the P1s and P2
# for each combination (or a subset of them)?  I suspect it may be a fairly
# straightforward apply problem, but I am having a real problem with it.

# I am only likely to need to report perhaps 15 combinations, but at the
# moment I don't see any easy way to do them, and doing all possible
# outcomes and extracting the required ones looks like a better and safer
# approach if it can be done.  It will also save a lot of time if we
# suddenly need a few new comparisons.

# Any help would be greatly appreciated.
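
One way to enumerate the combinations while keeping track of which level belongs to which rate is expand.grid(). The sketch below only builds the table of combinations; the names `rates` and `combos` are made up for illustration, and how each row maps to the P1 and S1 equations is left open. Note that the product of the level counts as written above, (3*3) * (3*3) * (2*2) * (2*2), comes to 1296 rather than 648.

```r
# Level names for each rate, as in alphahill and betahill above
rates <- list(af = c("a1", "a2", "a3"), am = c("b1", "b2", "b3"),
              pf = c("d1", "d2", "d3"), pm = c("e1", "e2", "e3"),
              cf = c("f1", "f2"),       cm = c("g1", "g2"),
              mf = c("h1", "h2"),       mm = c("j1", "j2"))

# One row per combination; each column records which level was used
combos <- expand.grid(rates, stringsAsFactors = FALSE)
nrow(combos)
combos[1:3, ]

# A subset of interest can then be picked out by ordinary indexing
subset(combos, af == "a1" & am == "b1" & cf == "f1")[1:3, ]
```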



Re: [R] How to create a list that grows automatically

2007-03-09 Thread hadley wickham
> I would like to know if there is a way to create a list or an array (or
> anything) which grows automatically as more elements are put into it. What I
> want to find is something equivalent to an ArrayList object of Java
> language. In Java, I can do the following thing:
>
> // Java code
> ArrayList myArray = new ArrayList();
> myArray.add(object1);
> myArray.add(object2);
>
> // End of java code

As others have mentioned, you can do this with lists in R.

However, there is an important difference between ArrayLists in Java
and Lists in R.  In Java, when an ArrayList grows past its bound, it
doesn't allocate just enough space, it allocates a lot more, so the
next time you allocate past the end of the array, there's space
already reserved.  This gives (IIRC) amortised O(n) behaviour.  R
doesn't do this however, so has to copy the entire array every time
giving O(n^2) behaviour.
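
When the final length is known (or can be bounded) in advance, the usual workaround is to allocate the list once and fill it in place. A minimal sketch comparing the two styles; the names `grown` and `filled` are made up for illustration, and how much copying the append version actually does depends on the R version:

```r
n <- 10000

# Growing one element at a time: may re-copy the list on each append
grown <- list()
for (i in seq_len(n)) grown[[length(grown) + 1]] <- i

# Preallocating: the list is created once and filled in place
filled <- vector("list", n)
for (i in seq_len(n)) filled[[i]] <- i

identical(grown, filled)
```

Both loops build the same list; only the allocation pattern differs.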

Hadley



Re: [R] Applying some equations over all unique combinations of 4 variables

2007-03-09 Thread John Kane
I just realised after posting that I have two vectors of
the wrong length.  The corrected program is:

# dataset
cata <- c(1, 1, 6, 1, 1, 2)
catb <- c(1, 2, 3, 4, 5, 6)
doga <- c(3, 5, 3, 6, 4, 0)

data1 <- data.frame(cata, catb, doga)
rm(cata, catb, doga)
data1

# start rates
# names for lists
fnams <- c("af", "pf", "cf", "mf")
mnams <- c("am", "pm", "cm", "mm")

# Current layout of the rate data frames
alphahill <- list(af <- c("a1", "a2", "a3"), pf <- c("d1", "d2", "d3"),
                  cf <- c("f1", "f2"), mf <- c("h1", "h2"))
names(alphahill) <- fnams

betahill <- list(am <- c("b1", "b2", "b3"), pm <- c("e1", "e2", "e3"),
                 cm <- c("g1", "g2"), mm <- c("j1", "j2"))
names(betahill) <- mnams

hilltop <- list(af <- data.frame(a1 <- 1:4, a2 <- 2:5, a3 <- 3:6),
                pf <- data.frame(d1 <- 4:1, d2 <- 5:2, d3 <- 6:3),
                cf <- data.frame(f1 <- 1:4, f2 <- 3:6),
                mf <- data.frame(h1 <- 1:4, h2 <- 3:6))

hilldown <- list(am <- data.frame(b1 <- 4:1, b2 <- 5:2, b3 <- 6:3),
                 pm <- data.frame(e1 <- 4:1, e2 <- 1:4, e3 <- 6:3),
                 cm <- data.frame(g1 <- 4:1, g2 <- 1:4),
                 mm <- data.frame(j1 <- 1:4, j2 <- 4:1))
names(hilltop) <- fnams
names(hilldown) <- mnams
for (i in 1:4) {
  names(hilltop[[i]]) <- alphahill[[i]]
  names(hilldown[[i]]) <- betahill[[i]]
}

rm(a1, a2, a3, b1, b2, b3, d1, d2, d3, e1, e2, e3, f1, f2, g1, g2, h1, h2,
   j1, j2, fnams, mnams, af, am, cf, cm, mf, mm, pf, pm)


[R] Reg. strings and numeric data in matrix.

2007-03-09 Thread Mallika Veeramalai

Hi All,

Sorry for this basic question, as I am new to R. I would like to know
whether it is possible to have a matrix with some columns holding numeric
data and others holding character (string) data.  How do I read this type
of data from a flat file?

Thanks very much,
mallika 


Mallika Veeramalai, Ph.D.,
Postdoctoral Associate,
Bioinformatics & Systems Biology,
Burnham Institute for Medical Research,
La Jolla,  CA 92037, USA.
phone : +1 858 646 3100 ext: 3627
Fax   : +1 858 795 5249
Web   : http://bioinformatics.burnham.org/~mallika/
Email : [EMAIL PROTECTED] (or) [EMAIL PROTECTED]



[R] About cex=: how to improve resolution?

2007-03-09 Thread Luca Quaglia
Hi,

I need to plot a graph with a fixed circle and with a
series of points of different sizes. Here is a
simplified example:

angle<-pi/180*c(0:360)
x<-seq(0,2,by=0.2)
y<-seq(0,2,by=0.2)
z<-seq(0,1,by=0.1)
par(pty="s")
plot(-2:2,-2:2,type="n")
lines(cos(angle),sin(angle))
points(x,y,cex=z)

The size of the points compared to the circle (of
radius 1) is important and bears a meaning. 

But instead of having 11 points with increasing size,
I only obtain points of the same size when
cex=0.1/0.2/0.3/0.4 or cex=0.5/0.6/0.7 or
cex=0.8/0.9/1.0.

Please, does anyone know if there is a way of
improving the resolution of cex= *without* changing
the size of the circle of radius 1 and keeping the
same axis?

Thanks in advance!!!

Luca



[R] Extracting text from a character string

2007-03-09 Thread Shawn Way
I have a set of character strings like below:

> data3[1]
[1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"

I am trying to extract the text "03-27-2002" and convert this into a date for
the same record.  I keep looking at the grep function, however I cannot quite
get it to work.

grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)

Any hints?
   
  Shawn Way

 



[R] MCMC logit

2007-03-09 Thread Anamika Chaudhuri
Hi,
I have a dataset with a binary outcome Y (0/1) and 4 covariates (X1, X2, X3, X4).
I am trying to use MCMClogit to fit a logistic regression using MCMC. I am
getting an error where it doesn't identify the covariates, although the data
are read in correctly. The dataset is a sample of the actual dataset. Below is
my code:

> # retrieve data
> # considering four covariates
> d.df=as.data.frame(read.table("c:/tina/phd/thesis/data/modified_data1.1.txt",header=T,sep=","))
> y=d.df[,ncol(d.df)]
> x=d.df[,1:4]
> c.df=cbind(y,x)
> #x=cbind(1,x)
> p <- ncol(c.df)
>
> # marginal log-prior of beta[]
> logpriorfun <- function(beta, mu, gshape, grate)
+ {
+ logprior = -p*log(2) + log(gamma(p+gshape)) - log(gamma(gshape))
+ + gshape*log(grate) - (p+gshape)* log(grate+sum(abs(beta)))
+ return(logprior)
+ }
> require(MCMCpack)
Loading required package: MCMCpack
Loading required package: coda
Loading required package: lattice
Loading required package: MASS
##
## Markov Chain Monte Carlo Package (MCMCpack)
## Copyright (C) 2003-2007 Andrew D. Martin and Kevin M. Quinn
##
## Support provided by the U.S. National Science Foundation
## (Grants SES-0350646 and SES-0350613)
##
[1] TRUE
Warning message:
package 'MASS' was built under R version 2.4.1 
> a0 = 0.5
> b0 = 1
> mu0 = 0
> beta.init=list(c(0, rep(0.1,4)), c(0, rep(-0.1,4)), c(0, rep(0, 4)))
> burnin.cycles = 1000
> mcmc.cycles = 25000
> # three chains
> post.list <- lapply(beta.init, function(vec)
+ {
+ posterior <- MCMClogit(y~x1+x2+x3+x4, data=c.df, burnin=burnin.cycles, mcmc=mcmc.cycles,
+ thin=5, tune=0.5, beta.start=vec, user.prior.density=logpriorfun, logfun=TRUE,
+ mu=mu0, gshape=a0, grate=b0)
+ return(posterior)
+ })
Error in eval(expr, envir, enclos) : object "x1" not found

Any suggestions will be greatly appreciated.
Thanks,
Anamika
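
A formula can only refer to columns by the names they actually carry in the data frame, so one thing worth checking is names(c.df). The sketch below is a guess at the situation, not a diagnosis: the data frame is a hypothetical stand-in for the file, with covariate columns that are not called x1 to x4.

```r
# Hypothetical stand-in for the data file: covariates are not named x1..x4
set.seed(1)
d.df <- data.frame(V1 = rnorm(10), V2 = rnorm(10), V3 = rnorm(10),
                   V4 = rnorm(10), V5 = rbinom(10, 1, 0.5))
y <- d.df[, ncol(d.df)]
x <- d.df[, 1:4]
c.df <- cbind(y, x)

names(c.df)                               # the names the formula can see
names(c.df) <- c("y", paste0("x", 1:4))   # rename so y ~ x1 + x2 + x3 + x4 resolves
all(c("x1", "x2", "x3", "x4") %in% names(c.df))
```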

 


Re: [R] Reg. strings and numeric data in matrix.

2007-03-09 Thread Ben Bolker
Mallika Veeramalai mallikav at burnham.org writes:

> I would like to know,
> is it possible to consider a matrix with some columns having numeric data
> and some other's with characters (strings) data?  How do I get this type of
> data from a flat file.

  It's called a data frame.  See the Introduction to R,
and help for read.table and read.csv.  (The character data
will get made into factors unless you use as.is=TRUE
or specify colClasses.)
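
A minimal sketch of the two options just mentioned. The file contents are made up and supplied inline through the `text` argument so the example runs without a file on disk; with a real file, the first argument would be its path.

```r
# Hypothetical flat-file contents
txt <- "id,name,score
1,alpha,2.5
2,beta,3.0
3,gamma,4.5"

# as.is = TRUE keeps strings as character instead of converting to factor
df1 <- read.csv(text = txt, as.is = TRUE)

# colClasses fixes the type of each column explicitly
df2 <- read.csv(text = txt,
                colClasses = c("integer", "character", "numeric"))

sapply(df1, class)
sapply(df2, class)
```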



Re: [R] About cex=: how to improve resolution?

2007-03-09 Thread Greg Snow
If the size of the circle is important, then you may want to use the
symbols function with the circles argument rather than points and cex.
Use the inches argument to set the size (in inches) of the largest
circle, then the other circles will be scaled accordingly.  Or if you
set inches=FALSE, then the circles will be scaled to the x-axis.
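
A minimal sketch of that suggestion applied to the example from the question. It assumes the values in z are meant as radii in x-axis units, which is an interpretation, not something the original post states.

```r
angle <- pi / 180 * (0:360)
x <- seq(0, 2, by = 0.2)
y <- seq(0, 2, by = 0.2)
z <- seq(0, 1, by = 0.1)          # treated as radii in x-axis units (assumption)

par(pty = "s")
plot(-2:2, -2:2, type = "n")
lines(cos(angle), sin(angle))     # reference circle of radius 1

# inches = FALSE: radii are taken in the units of the x axis.
# The first point has radius 0, so it is dropped here.
symbols(x[-1], y[-1], circles = z[-1], inches = FALSE, add = TRUE)
```

With inches = FALSE the last circle, of radius 1, should come out the same size as the reference circle, since the plot region is square and both axes span the same range.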

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 



Re: [R] Reg. strings and numeric data in matrix.

2007-03-09 Thread John Kane

--- Mallika Veeramalai [EMAIL PROTECTED] wrote:

> Hi All,
>
> Sorry for this basic question as I am new to this R.
> I would like to know, is it possible to consider a matrix with some
> columns having numeric data and some other's with characters (strings)
> data?  How do I get this type of data from a flat file.
>
> Thanks very much,
> mallika

If I understand the question, the answer is no. A
matrix must hold a single type of data.

I think that what you want is a data.frame, which allows
mixed types of data.
Try this to see the difference.

a <- c('a','b','c')
b <- c( 1,2,3)

aa <- cbind(a,b)
aa
class(aa)

bb <- data.frame(a,b)
bb
class(bb)



Re: [R] Reg. strings and numeric data in matrix.

2007-03-09 Thread Petr Klasterecky
See
?data.frame
?read.table
and please read (appropriate parts of) the Introduction to R manual.
Petr

-- 
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic



Re: [R] dendrogram again

2007-03-09 Thread bunny , lautloscrew.com
Though your example helped, I am still trying to get exactly what I
want.
Here's how I get my dendrogram, and my example (using "ward"
instead of "average" doesn't make any difference in my case, I believe):

hc=hclust(dist(mymatrix),"average")
hcd=as.dendrogram(hc)
plot(hcd)

The major problem in my case is that my matrix has 196 rows, which
means the dendrogram has almost 200 end nodes.
Of course I know about cut by now ;) ... but somehow the branches that
I get from cut don't help me.

If I look at the full dendrogram I have about 25 nodes. I just want
to label about 5 of them (from the upper end).
Everything that happens below is not of interest to me.  Do you know
how to label some specific nodes?

I'll try max.levels...

Thanks in advance

m.-




On 09.03.2007 at 13:52, Gavin Simpson wrote:

> require(vegan) ## if false install it
> data(varespec)
>
> hc <- hclust(vegdist(varespec, "bray"), method = "ward")
> hc <- as.dendrogram(hc)
>
> ## this is the full dendrogram - too many nodes, so prune
> plot(hc)
>
> ## lets take four clusters and prune it back
> hc.pruned <- cut(hc, h = 1) # can't specify k so read height of first
> # plot - cutting at h = 1 gives 4 clusters
>
> # plot only the upper part of the tree showing only the 4 clusters
> plot(hc.pruned$upper, center = TRUE)




Re: [R] Reg. strings and numeric data in matrix.

2007-03-09 Thread Greg Snow
A matrix can only have 1 type of data, so if you try to include both
strings and numbers in a matrix, the numbers will be converted to
strings.

Another type of data object is a data frame. A data frame works much
like a matrix in many ways, but allows some columns to be numbers and
others to be strings (though usually strings are converted to factors).

You should read (or reread) the manual "An Introduction to R":
section 5 talks about matrices, then section 6 talks about data frames
(and lists).  Section 7 shows how to read data from files into data
frames.  Those 3 sections should answer your questions below.

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 



Re: [R] Extracting text from a character string

2007-03-09 Thread Greg Snow
Try replacing \d with \\d throughout your pattern.  The R parser is
trying to interpret the \ before the grep function ever sees it.  By
backslashing the backslashes, the parser ends up putting a single
backslash in the pattern for grep to see.
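
A minimal sketch of the doubled backslashes in practice, using the string from the question. Note that grep() with value=TRUE returns the whole matching element, not the matched part, so regexpr() with regmatches() is used here to pull out the date text; the variable names are made up for illustration.

```r
x <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

# "\\d" in R source is the two-character string \d that the regex engine sees
pattern <- "\\d\\d-\\d\\d-\\d\\d\\d\\d"
nchar("\\d")

# grep() reports which elements match; regmatches() extracts the matched text
grep(pattern, x, perl = TRUE)
date_text <- regmatches(x, regexpr(pattern, x, perl = TRUE))
date_text
```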

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111
 
 



Re: [R] Extracting text from a character string

2007-03-09 Thread Marc Schwartz
On Fri, 2007-03-09 at 15:23 -0500, Shawn Way wrote:
  I have a set of character strings like below:
  
  data3[1]
 [1] CB01_0171_03-27-2002-(Sample 26609)-(126)
  
  
 I am trying to extract the text 03-27-2002 and convert this into a date 
 for the same record.  I keep looking at the grep function, however I 
 cannot quite get it to work.
  
 grep(\d\d-\d\d-\d\d\d\d,data3[1],perl=TRUE,value=TRUE)
  
 Any hints?


At least two different ways:

Vec - CB01_0171_03-27-2002-(Sample 26609)-(126)


1. Using substr(), if your source vector is a fixed format

# Get the 11th thru the 20th character
 substr(Vec, 11, 20)
[1] 03-27-2002


2. Using sub() for a more generalized approach:

# Use a back reference, returning the value pattern within the 
# parens

 sub(.+([0-9]{2}-[0-9]{2}-[0-9]{4}).+, \\1, Vec)
[1] 03-27-2002


See ?substr, ?sub and ?regex
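
The question also asked for the text to be converted into a date. A minimal sketch of that last step, assuming the date is always in month-day-year order; the names date_txt and date_val are only illustrative, and regmatches() is a base R function in current releases that may be missing from very old ones:

```r
Vec <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

# grep() only reports which elements contain a match; regexpr() locates
# the match and regmatches() returns the matched text itself.
date_txt <- regmatches(Vec, regexpr("[0-9]{2}-[0-9]{2}-[0-9]{4}", Vec))

# Convert the text to a Date, assuming month-day-year order.
date_val <- as.Date(date_txt, format = "%m-%d-%Y")
format(date_val)
```

Note that inside an R string a backslash has to be doubled ("\\d"), which is one likely reason the grep() attempt in the question did not run.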

HTH,

Marc Schwartz



[R] Extracting text from a character string

2007-03-09 Thread Shawn Way
I have a set of character strings like below:
 
 > data3[1]
[1] "CB01_0171_03-27-2002-(Sample 26609)-(126)"
 
 
I am trying to extract the text 03-27-2002 and convert
this into a date for the same record.  I keep looking
at the grep function, however I cannot quite get it to
work.
 
grep("\d\d-\d\d-\d\d\d\d",data3[1],perl=TRUE,value=TRUE)
 
Any hints?
 
Shawn Way



 



Re: [R] Extracting text from a character string

2007-03-09 Thread Gabor Grothendieck
Try this:

library(gsubfn)
x <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"
unlist(strapply(x, "..-..-...."))

The gsubfn home page is at:
http://code.google.com/p/gsubfn/



Re: [R] Extracting text from a character string

2007-03-09 Thread Wensui Liu
actually, I am thinking of strsplit().
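
A minimal sketch of the strsplit() idea, assuming every string has the same underscore-separated layout with the date at the start of the third field (the variable names are illustrative):

```r
x <- "CB01_0171_03-27-2002-(Sample 26609)-(126)"

# Split on the literal underscore; the third field starts with the date.
fields <- strsplit(x, "_", fixed = TRUE)[[1]]

# The date occupies the first 10 characters of that field.
date_txt <- substr(fields[3], 1, 10)
date_txt
```

This relies on the fixed layout just as substr() does, so a regular expression is probably safer if the prefix can vary.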




-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



[R] About cex=: how to improve resolution?

2007-03-09 Thread Luca Quaglia
Hi,

I need to plot a graph with a circle of radius 1 and
with a series of points of different size. The size of
these points compared to the fixed circle is important
and bears a meaning.

Here is a simplified version of the code I'm
using:

x <- seq(0,2,by=0.2)
y <- x
z <- seq(0,1,by=0.1)
angle <- pi/180*c(0:359)
par(pty="s")
plot(-2:2,-2:2,type="n")
lines(cos(angle),sin(angle))
points(x,y,cex=z)

I obtain points of the same size when
cex=0.1/0.2/0.3/0.4 or cex=0.5/0.6/0.7 or
cex=0.8/0.9/1.0.

Please, does anyone know if there is a way of
improving the resolution of cex in order to have 10
points *all* of different size (respecting the above
written different values of cex)? The circle is fixed
of radius 1 and the values of cex are in relation with
that and they shouldn't be modified.

Thanks, Luca



[R] dendrogram - got it , just need to label :)

2007-03-09 Thread bunny , lautloscrew.com
Hi all, Hi Gavin,

thanks for your help. I finally found out what I want to do and how to
fix it.
I just needed some more levels; my cut level was too small...

Two questions remain...

a) Can I somehow scale the twigs after cutting?
b) How can I label the nodes, and how do I specify which ones to label...

thx !!

-m.



Re: [R] Reformulated matrices dimensions limitation problem

2007-03-09 Thread elw


 Have a look at the help page for memory.size and memory.limit. The help 
 says you can use these functions on Windows and another approach with 
 Unix. Once you know the available memory you can calculate the total 
 matrix size that fits in it (knowing that a real number takes 8 bytes). 
 I would recommend using up to 70-80% of the available memory for your 
 matrix.


But then there's the overhead for each R object, which is very 
non-trivial (but not so bad as to be completely depressing...).
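
The arithmetic above can be sketched as follows. This is only an illustration: matrix_bytes() is a hypothetical helper, not an R function, and the overhead reported by object.size() depends on the platform and R version.

```r
# Bytes needed for the data of an nrow x ncol matrix of doubles
# (8 bytes per real number).
matrix_bytes <- function(nrow, ncol) nrow * ncol * 8

matrix_bytes(1000, 1000) / 2^20    # size in MiB of a 1000 x 1000 matrix

# object.size() reports the data plus the per-object overhead.
m <- matrix(0, nrow = 1000, ncol = 1000)
overhead <- as.numeric(object.size(m)) - matrix_bytes(1000, 1000)
overhead
```

For one large matrix the header is small next to the data; the overhead matters far more when the same data are spread over many small objects or copied during computation.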

--e



Re: [R] MCMC logit

2007-03-09 Thread Ravi Varadhan
As the error message clearly indicates, the function MCMClogit is unable to
find the variable x1 (possibly x2,x3, and x4 also) in the data frame c.df.
Check the names of the variables in that data frame and make sure that the
names correspond to the formula specification.
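
A minimal sketch of that check. The data frame below is a made-up stand-in for the file in the question, and its column names (age, dose, wt, sex, outcome) are assumptions chosen only to show the mismatch; model.frame() is used here because it resolves formula variables much as model-fitting functions do.

```r
# Hypothetical stand-in for the data read from the file: four covariates
# followed by the binary outcome, with column names that are NOT x1..x4.
d.df <- data.frame(age = c(31, 45, 52, 60),
                   dose = c(1, 2, 1, 2),
                   wt = c(70, 82, 65, 90),
                   sex = c(0, 1, 1, 0),
                   outcome = c(0, 1, 0, 1))
y <- d.df[, ncol(d.df)]
x <- d.df[, 1:4]
c.df <- cbind(y, x)

names(c.df)    # shows which names the formula is allowed to use

# Either write the formula with these names, or rename the columns:
names(c.df) <- c("y", "x1", "x2", "x3", "x4")
mf <- model.frame(y ~ x1 + x2 + x3 + x4, data = c.df)
```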

Hope this helps,
Ravi.


---

Ravi Varadhan, Ph.D.

Assistant Professor, The Center on Aging and Health

Division of Geriatric Medicine and Gerontology 

Johns Hopkins University

Ph: (410) 502-2619

Fax: (410) 614-9625

Email: [EMAIL PROTECTED]

Webpage:  http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html

 





-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Anamika Chaudhuri
Sent: Friday, March 09, 2007 3:27 PM
To: r-help@stat.math.ethz.ch
Subject: [R] MCMC logit

Hi, 
I have a dataset with the binary outcome Y (0,1) and 4 covariates
(X1,X2,X3,X4). I am trying to use MCMClogit to fit a logistic regression
using MCMC. I am getting an error where it doesn't identify the covariates,
although the data are being read in correctly. The dataset is a sample of the
actual dataset. Below is my code:
 ###
 
 
> #retrieve data
> # considering four covariates
> d.df=as.data.frame(read.table("c:/tina/phd/thesis/data/modified_data1.1.txt",header=T,sep=","))
> y=d.df[,ncol(d.df)]
> x=d.df[,1:4]
> c.df=cbind(y,x)
> #x=cbind(1,x)
> p <- ncol(c.df)
> 
> # marginal log-prior of beta[]
> logpriorfun <- function(beta, mu, gshape, grate)
+ {
+ logprior = -p*log(2) + log(gamma(p+gshape)) - log(gamma(gshape))
+ + gshape*log(grate) - (p+gshape)* log(grate+sum(abs(beta)))
+ return(logprior)
+ }
> require(MCMCpack)
Loading required package: MCMCpack
Loading required package: coda
Loading required package: lattice
Loading required package: MASS
##
## Markov Chain Monte Carlo Package (MCMCpack)
## Copyright (C) 2003-2007 Andrew D. Martin and Kevin M. Quinn
##
## Support provided by the U.S. National Science Foundation
## (Grants SES-0350646 and SES-0350613)
##
[1] TRUE
Warning message:
package 'MASS' was built under R version 2.4.1 
> a0 = 0.5
> b0 = 1
> mu0 = 0
> beta.init=list(c(0, rep(0.1,4)), c(0, rep(-0.1,4)), c(0, rep(0, 4)))
> burnin.cycles = 1000
> mcmc.cycles = 25000
> # three chains
> post.list <- lapply(beta.init, function(vec)
+ {
+ posterior <- MCMClogit(y~x1+x2+x3+x4, data=c.df, burnin=burnin.cycles, mcmc=mcmc.cycles,
+ thin=5, tune=0.5, beta.start=vec, user.prior.density=logpriorfun, logfun=TRUE,
+ mu=mu0, gshape=a0, grate=b0)
+ return(posterior)
+ })
Error in eval(expr, envir, enclos) : object "x1" not found
 
  Any suggestions will be greatly appreciated.
  Thanks,
Anamika

 


Re: [R] About cex=: how to improve resolution?

2007-03-09 Thread Richard M. Heiberger
replace
points(x,y,cex=z)
with
symbols(x, y, circles=z/10, inches=FALSE, add=TRUE)
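
Put together with the code from the question, the replacement might look like the sketch below. The z/10 scaling is taken from the line above; whether it matches the intended relation between the points and the unit circle is an assumption. With inches=FALSE the radii are measured in x-axis units, so they are not limited by the resolution of cex.

```r
x <- seq(0, 2, by = 0.2)
y <- x
z <- seq(0, 1, by = 0.1)
angle <- pi / 180 * (0:359)
radii <- z / 10     # scaling taken from the reply above

pdf(NULL)           # null device so the sketch runs anywhere; drop to see the plot
par(pty = "s")
plot(-2:2, -2:2, type = "n")
lines(cos(angle), sin(angle))      # the reference circle of radius 1
symbols(x, y, circles = radii, inches = FALSE, add = TRUE)
dev.off()
```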



[R] Using large datasets: can I overload the subscript operator?

2007-03-09 Thread Maciej Radziejewski
Hello,

I do some computations on datasets that come from climate models. These data
are huge arrays, significantly larger than typically available RAM, so they
have to be accessed row-by-row, or rather slice-by slice, depending on the
task. I would like to make an R package to easily access such datasets
within R. The C++ backend is ready and being used under Windows/.Net/Visual
Basic, but I have yet to learn the specifics of R programming to make a good
R interface.

I think it should be possible to make a package (call it "slice") that could
be used like this:

library (slice)
dataset <- load.virtualarray ("dataset_definition.xml")
# Load a portion of the data from disk and extract it
ordinaryvector <- dataset [ , 2, 3]

In the above, "dataset" is an object that holds a definition of a
3-dimensional large dataset, and "ordinaryvector" is an ordinary R vector.
The subscripting operator fetches the necessary data from disk and extracts the
required slice, taking care of caching and other technical details. So, my
questions are:

Has anyone ever made a similar extension, with virtual (lazy) arrays?

Can the subscript operator be overloaded like that in R? (I know it can be in
S, at least for vectors.)

And a tough one: is it possible to make an expression like "[1]" (without
quotes) meaningful in R? At the moment it results in a syntax error. I
would like to make it return an object of a special class that gets
interpreted when subscripting my virtual array as "drop this dimension",
like this:

dataset [, 2, 3, drop = F]  # Return a 3-dimensional array
dataset [, [2], 3, drop = F]  # Return a 2-dimensional array
dataset [, [2], [3], drop = F]  # Return a 1-dimensional array, like dataset [, 2, 3]

Thanks in advance for any help,

Maciej.



Re: [R] Extracting the p of F statistics from lm

2007-03-09 Thread Michael Kubovy
On Mar 9, 2007, at 11:18 AM, Cressoni, Massimo ((NIH/NHLBI)) [F] wrote:

 I need to extract the p value from a ANOVA done with lm model

 fitting <- lm(var ~ group)
 Sfitting <- summary(fitting)

 Sfitting[10][1] gives the F value and the degrees of freedom but I  
 am not able to get the
 p value.

try
Sfitting[4]$coefficients[,4]

I'm not sure that this is the best way, but it works with the example  
for lm()
  summary(lm.D9)[4]$coefficients[,4]
# (Intercept) groupTrt
# 9.547128e-15 2.490232e-01
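
The values above are the p values of the individual coefficients. For the p value of the overall F statistic, one possible sketch is to feed the fstatistic element of the summary to pf(); the data are those of the lm() help example, and the names fs and p_F are illustrative.

```r
# Data from the example in ?lm
ctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)

# fstatistic holds the F value and its two degrees of freedom.
fs <- summary(lm.D9)$fstatistic

# Upper tail of the F distribution gives the p value of the overall test.
p_F <- pf(fs["value"], fs["numdf"], fs["dendf"], lower.tail = FALSE)
unname(p_F)
```

With a single two-level factor the F test and the t test of the slope coincide, so here the result agrees with the groupTrt value printed above.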

_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400Charlottesville, VA 22904-4400
Parcels:Room 102Gilmer Hall
 McCormick RoadCharlottesville, VA 22903
Office:B011+1-434-982-4729
Lab:B019+1-434-982-4751
Fax:+1-434-982-4766
WWW:http://www.people.virginia.edu/~mk9y/



[R] H0 and H1 probabilities in Cohen's Effect Size w for X2 test

2007-03-09 Thread Antti Arppe

Dear all,

I've been delighted to just notice that Cohen's formulas for 
Effect Size 'w' and the associated power have been implemented in 
the 'pwr' package (thanks to Stéphane Champely and others).


There is one aspect, though, that perplexes me. I'm doing some last 
minute post hoc analyses, meaning that my sample size (N=3404) has 
been long fixed, and I'm interested in assessing the ES and Power 
after the fact.


As far as I can deduce from the implementation of the ES.w2 formula or 
Cohen's (1992) own article, it seems to me that the probabilities 
p(H0) and p(H1) would simply be the expected and observed absolute 
frequencies divided by the sample size N, in that the 'true' 
probabilities are the observed proportions and the null probabilities 
the expected ones. If this is correct, then the effect size and the 
power statistics can naturally easily be calculated with the 'pwr' 
package. However, this entails that the noncentrality parameter 
lambda=N*w^2 is equal to the chi-squared statistic X^2.



> observed
    p   h   m    a
X 119  64  36   37
Y 594 323 776 1455

> expected
          p         h         m         a
X  53.62162  29.10458  61.06698  112.2068
Y 659.37838 357.89542 750.93302 1379.7932

> observed.p
           p          h          m          a
X 0.03495887 0.01880141 0.01057579 0.01086957
Y 0.17450059 0.09488837 0.22796710 0.42743831

> expected.p
           p           h          m          a
X 0.01575253 0.008550112 0.01793977 0.03296322
Y 0.19370693 0.105139664 0.22060312 0.40534465

> ES.w2(observed.p)
[1] 0.2406104

> ES.w1(expected.p,observed.p)
[1] 0.2406104

> pwr.chisq.test(w=ES.w1(expected.p,observed.p),N=3404,sig.level=.05,df=3)

     Chi squared power calculation

              w = 0.2406104
              N = 3404
             df = 3
      sig.level = 0.05
          power = 1

 NOTE: N is the number of observations

> lambda <- 3404*ES.w1(observed.p,expected.p)^2
> lambda
[1] 240.9289

> pchisq(qchisq(p=.05,df=3,lower.tail=F),ncp=lambda,df=3,lower=F)
[1] 1

Have I missed or misunderstood something here altogether? Should the 
alternative H0 probabilities be estimated by e.g. some sort of 
fitting? Any pointers, suggestions or assistance would be greatly 
appreciated.
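
The relation lambda = N*w^2 = X^2 can be checked in base R alone. A sketch using the observed counts above, with no pwr functions involved; the object names are illustrative:

```r
observed <- matrix(c(119,  64,  36,   37,
                     594, 323, 776, 1455),
                   nrow = 2, byrow = TRUE,
                   dimnames = list(c("X", "Y"), c("p", "h", "m", "a")))
N <- sum(observed)

# Pearson chi-squared statistic of the table (no continuity correction
# is applied to tables larger than 2 x 2 in any case).
X2 <- unname(chisq.test(observed, correct = FALSE)$statistic)

w <- sqrt(X2 / N)     # effect size w computed from the statistic
lambda <- N * w^2     # noncentrality parameter; equals X2 by construction
c(N = N, X2 = X2, w = w, lambda = lambda)
```

This gives w of about 0.2406, matching the ES.w2 output, and a lambda of about 197 rather than 240.9. The posted lambda was computed with the arguments of ES.w1 in the opposite order from the earlier call; if ES.w1 divides by its first argument, as its documentation suggests, that swap would explain the difference.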


-Antti Arppe
--
==
Antti Arppe - Master of Science (Engineering)
Researcher  doctoral student (Linguistics)
E-mail: [EMAIL PROTECTED]
WWW: http://www.ling.helsinki.fi/~aarppe
--
Work: Department of General Linguistics, University of Helsinki
Work address: P.O. Box 9 (Siltavuorenpenger 20 A)
   00014 University of Helsinki, Finland
Work telephone: +358 9 19129312 (int'l) 09-19129312 (in Finland)
Work telefax: +358 9 19129307 (int'l) 09-19129307 (in Finland)
--
Private address: Fleminginkatu 25 E 91, 00500 Helsinki, Finland
Private telephone: +358 50 5909015 (int'l) 050-5909015 (in Finland)
--


[R] Table Construction from calculations

2007-03-09 Thread Seth Imhoff
Hi-

I am trying to create a table of values by adding  pairs of vectors, but 
am running into some problems.  The problem is best expressed by a 
simple example.

Starting with a data table basis:
  atom   x   y   z
1   Cu 0.0 0.0 0.0
2   Cu 0.5 0.5 0.5

I want to add 0.5 0.5 0.5  (and also the 0 0 0 but it wouldn't change 
the values below so I won't refer to it in the rest of the example) to a 
list of vectors in the form of:
  latpoints
   V1 V2 V3
1   0  0  0
2   0  0  1
3   0  0  2
4   0  0  3
5   0  1  1

so that I end up with a table such as:
V1  V2  V3
0.5 0.5 0.5
0.5 0.5 1.5
0.5 0.5 2.5
0.5 0.5 3.5
0.5 1.5 1.5

I've tried many variations on the following: (not just cat, but most of 
the data/data.table options)

 test = for(i in 1:5) {cat(basis[1,2:4] + latticemultipliers[i,], 
append=TRUE)}

However, I either end up with an error telling me that cat doesn't 
handle type 'list'  or with a table with length of 1 such as:
    x   y   z
2 0.5 1.5 1.5

Which is simply the last value that the loop calculates.

Does anyone know what function handles lists of the form I am using, or 
have a better suggestion on how to get the form that I want.

Thanks in advance,
Seth Imhoff



Re: [R] Table Construction from calculations

2007-03-09 Thread Christos Hatzis
Your data table basis is actually a dataframe, whose first column is
non-numeric.  That's what is causing the problem.
 
Try removing the first column of the dataframe before adding the row to your
matrix:

test <- latpoints + basis[2, -1]
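
If adding the one-row data frame directly gives trouble because the two data frames differ in size, a possible alternative is to turn the row into a plain numeric vector and add it to every row with sweep(). A minimal sketch, with the two objects rebuilt from the tables in the question; the name offset is illustrative:

```r
basis <- data.frame(atom = c("Cu", "Cu"),
                    x = c(0, 0.5), y = c(0, 0.5), z = c(0, 0.5))
latpoints <- data.frame(V1 = c(0, 0, 0, 0, 0),
                        V2 = c(0, 0, 0, 0, 1),
                        V3 = c(0, 1, 2, 3, 1))

# Numeric offset taken from the second row of basis, atom column dropped.
offset <- unlist(basis[2, -1])

# Add the offset to every row (MARGIN = 2 matches it to the columns).
test <- sweep(as.matrix(latpoints), 2, offset, "+")
test
```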

-Christos

Christos Hatzis, Ph.D.
Nuvera Biosciences, Inc.
400 West Cummings Park
Suite 5350
Woburn, MA 01801
Tel: 781-938-3830
www.nuverabio.com



Re: [R] Using large datasets: can I overload the subscript operator?

2007-03-09 Thread Duncan Murdoch
On 3/9/2007 6:47 PM, Maciej Radziejewski wrote:
 Hello,
 
 I do some computations on datasets that come from climate models. These data
 are huge arrays, significantly larger than typically available RAM, so they
 have to be accessed row-by-row, or rather slice-by slice, depending on the
 task. I would like to make an R package to easily access such datasets
 within R. The C++ backend is ready and being used under Windows/.Net/Visual
 Basic, but I have yet to learn the specifics of R programming to make a good
 R interface.
 
 I think it should be possible to make a package (call it slice) that could
 be used like this:
 
 library (slice)
 dataset - load.virtualarray (dataset_definition.xml)
 ordinaryvector - dataset [ , 2, 3] # Load a portion of the data from disk
 and extract it
 
 In the above dataset is an object that holds a definition of a
 3-dimensional large dataset, and ordinaryvector is an ordinary R vector.
 The subscripting operator fetches necessary data from disk and extracts a
 required slice, taking care of caching and other technical details. So, my
 questions are:
 
 Has anyone ever made a similar extension, with virtual (lazy) arrays?

Yes, e.g. the SQLiteDF package.
 
 Can the suscript operator be overloaded like that in R? (I know it can be in
 S, at least for vectors.)

Yes.
 
 And a tough one: is it possible to make an expression like [1] (without
 quoutes) meaningful in R? At the moment it results in a syntax error. I
 would like to make it return an object of a special class that gets
 interpreted when subscripting my virtual array as drop this dimension,
 like this:
 
 dataset [, 2, 3, drop = F]  # Return a 3-dimensional array
 dataset [, [2], 3, drop = F]  # Return a 2-dimensional array
 dataset [, [2], [3], drop = F]  # Return a 1-dimensional array, like dataset
 [, 2, 3]

No, that's not legal S or R syntax.  However, you might be able to 
define a special object D and use syntax like

dataset [, D[2], 3, drop = F]
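
A minimal sketch of both ideas using S3 methods. Everything here is hypothetical: virtualarray() keeps its data in an ordinary in-memory array where a real package would call the disk backend, and the class names "dropmarker" and "dropindex" are invented for the illustration. Each D[...] index is assumed to select a single position.

```r
# Hypothetical constructor; the array stands in for data held on disk.
virtualarray <- function(data) structure(list(data = data), class = "virtualarray")

# D[2] returns a marked index meaning "select 2 and drop this dimension".
D <- structure(list(), class = "dropmarker")
"[.dropmarker" <- function(x, i) structure(list(index = i), class = "dropindex")

# Subscript method for the virtual array (three dimensions only).
"[.virtualarray" <- function(x, i, j, k, ...) {
  d <- dim(x$data)
  if (missing(i)) i <- seq_len(d[1])
  if (missing(j)) j <- seq_len(d[2])
  if (missing(k)) k <- seq_len(d[3])
  idx <- list(i, j, k)
  dropit <- vapply(idx, inherits, logical(1), what = "dropindex")
  idx <- lapply(idx, function(v) if (inherits(v, "dropindex")) v$index else v)
  # A real backend would fetch only this slice from disk.
  out <- x$data[idx[[1]], idx[[2]], idx[[3]], drop = FALSE]
  dim(out) <- dim(out)[!dropit]    # remove only the marked dimensions
  out
}

arr <- array(1:24, dim = c(2, 3, 4))
va <- virtualarray(arr)
dim(va[, 2, 3])          # all three dimensions kept
dim(va[, D[2], 3])       # second dimension dropped
dim(va[, D[2], D[3]])    # second and third dimensions dropped
```

Defining D in the workspace masks the derivative function D() from the stats package for plain lookups, so a different name may be preferable in a real package.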

Duncan Murdoch
 
 Thanks in advance for any help,
 
 Maciej.
 


Re: [R] Table Construction from calculations

2007-03-09 Thread Michael Kubovy
On Mar 9, 2007, at 9:23 PM, Seth Imhoff wrote:

 I am trying to create a table of values by adding  pairs of  
 vectors, but
 am running into some problems.  The problem is best expressed by a
 simple example.

 Starting with a data table basis:
   atom   x   y   z
 1   Cu 0.0 0.0 0.0
 2   Cu 0.5 0.5 0.5

 I want to add 0.5 0.5 0.5  (and also the 0 0 0 but it wouldn't change
 the values below so I won't refer to it in the rest of the example)  
 to a
 list of vectors in the form of:
 latpoints
V1 V2 V3
 1   0  0  0
 2   0  0  1
 3   0  0  2
 4   0  0  3
 5   0  1  1

 so that I end up with a table such as:
 V1  V2  V3
 0.5 0.5 0.5
 0.5 0.5 1.5
 0.5 0.5 2.5
 0.5 0.5 3.5
 0.5 1.5 1.5

 I've tried many variations on the following: (not just cat, but  
 most of
 the data/data.table options)

  test = for(i in 1:5) {cat(basis[1,2:4] + latticemultipliers[i,],
 append=TRUE)}

 However, I either end up with an error telling me that cat doesn't
 handle type 'list'  or with a table with length of 1 such as:
  xyz
 2  0.5 1.5 1.5

Is this what you want?

  > (latpoints <- data.frame(matrix(c(0,0,0,0,0,1,0,0,2,0,0,3,0,1,1),
    nrow = 5, byrow = T)))
   X1 X2 X3
1  0  0  0
2  0  0  1
3  0  0  2
4  0  0  3
5  0  1  1
  > (latpoints <- latpoints + c(0.5, 0.5, 0.5))
   X1  X2  X3
1 0.5 0.5 0.5
2 0.5 0.5 1.5
3 0.5 0.5 2.5
4 0.5 0.5 3.5
5 0.5 1.5 1.5

This is an important feature of R called vectorization (see, e.g.,
cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf or
www.ms.washington.edu/stat390/winter07/R_primer.pdf), which allows you
to avoid writing loops.
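
One caveat, sketched here with a plain matrix and made-up offsets: a shorter vector is recycled down the columns, so adding it directly gives the row-wise result only when all offsets are equal, as they are in the example above. For unequal offsets a double transpose is one way to add a vector to each row.

```r
m <- matrix(0, nrow = 5, ncol = 3)
offs <- c(0.1, 0.2, 0.3)     # illustrative unequal offsets

# Recycling runs down the columns, so this is NOT a row-wise addition.
by_recycling <- m + offs

# Transpose, add (one offset per row of the transpose), transpose back.
by_row <- t(t(m) + offs)

by_recycling[, 1]    # offsets cycle down the first column
by_row[1, ]          # every row receives the full offset vector
```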
_
Professor Michael Kubovy
University of Virginia
Department of Psychology
USPS: P.O.Box 400400Charlottesville, VA 22904-4400
Parcels:Room 102Gilmer Hall
 McCormick RoadCharlottesville, VA 22903
Office:B011+1-434-982-4729
Lab:B019+1-434-982-4751
Fax:+1-434-982-4766
WWW:http://www.people.virginia.edu/~mk9y/



Re: [R] Using large datasets: can I overload the subscript operator?

2007-03-09 Thread Roy Mendelssohn
Look at the netcdf packages.  A lot of output from climate models is  
in netcdf anyway.  It can take all sorts of slices and strides.

-Roy M.



**
The contents of this message do not reflect any position of the U.S.  
Government or NOAA.
**
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division 
Southwest Fisheries Science Center
1352 Lighthouse Avenue
Pacific Grove, CA 93950-2097

e-mail: [EMAIL PROTECTED] (Note new e-mail address)
voice: (831)-648-9029
fax: (831)-648-8440
www: http://www.pfeg.noaa.gov/

Old age and treachery will overcome youth and skill.



[R] long character string problem

2007-03-09 Thread toby909
Hi All

I am having 2 very long character strings (550chars) and I want to put them as 
expressions together with c(). The problem is that I also get these 
double-quotes, as seen below in 'fct'. How can I remove these double-quotes? I 
tried as.name() but it did not work (because of size?). These are creating 
trouble with subsequent programs, which I tested with strings that for some 
reason do not have these double quotes (see very bottom).





  cum1
[1] 
A11*(X11*x1+X21*x2)+1*sqrt(B11*(X11*x1+X21*x2)^2+C11)A12*(X12*x1+X22*x2)+1*sqrt(B12*(X12*x1+X22*x2)^2+C12)A13*(X13*x1+X23*x2)+-1*sqrt(B13*(X13*x1+X23*x2)^2+C13)A14*(X14*x1+X24*x2)+-1*sqrt(B14*(X14*x1+X24*x2)^2+C14)A15*(X15*x1+X25*x2)+1*sqrt(B15*(X15*x1+X25*x2)^2+C15)A16*(X16*x1+X26*x2)+1*sqrt(B16*(X16*x1+X26*x2)^2+C16)A17*(X17*x1+X27*x2)+1*sqrt(B17*(X17*x1+X27*x2)^2+C17)A18*(X18*x1+X28*x2)+1*sqrt(B18*(X18*x1+X28*x2)^2+C18)A19*(X19*x1+X29*x2)+-1*sqrt(B19*(X19*x1+X29*x2)^2+C19)A110*(X110*x1+X210*x2)+1*sqrt(B110*(X110*x1+X210*x2)^2+C110)
  cum2
[1] 
A21*(X11*x1+X21*x2)+1*sqrt(B21*(X11*x1+X21*x2)^2+C21)A22*(X12*x1+X22*x2)+1*sqrt(B22*(X12*x1+X22*x2)^2+C22)A23*(X13*x1+X23*x2)+-1*sqrt(B23*(X13*x1+X23*x2)^2+C23)A24*(X14*x1+X24*x2)+-1*sqrt(B24*(X14*x1+X24*x2)^2+C24)A25*(X15*x1+X25*x2)+1*sqrt(B25*(X15*x1+X25*x2)^2+C25)A26*(X16*x1+X26*x2)+1*sqrt(B26*(X16*x1+X26*x2)^2+C26)A27*(X17*x1+X27*x2)+1*sqrt(B27*(X17*x1+X27*x2)^2+C27)A28*(X18*x1+X28*x2)+1*sqrt(B28*(X18*x1+X28*x2)^2+C28)A29*(X19*x1+X29*x2)+-1*sqrt(B29*(X19*x1+X29*x2)^2+C29)A210*(X110*x1+X210*x2)+1*sqrt(B210*(X110*x1+X210*x2)^2+C210)
  fct = c(as.expression(cum1), as.expression(cum2))
  fct
expression(A11*(X11*x1+X21*x2)+1*sqrt(B11*(X11*x1+X21*x2)^2+C11)A12*(X12*x1+X22*x2)+1*sqrt(B12*(X12*x1+X22*x2)^2+C12)A13*(X13*x1+X23*x2)+-1*sqrt(B13*(X13*x1+X23*x2)^2+C13)A14*(X14*x1+X24*x2)+-1*sqrt(B14*(X14*x1+X24*x2)^2+C14)A15*(X15*x1+X25*x2)+1*sqrt(B15*(X15*x1+X25*x2)^2+C15)A16*(X16*x1+X26*x2)+1*sqrt(B16*(X16*x1+X26*x2)^2+C16)A17*(X17*x1+X27*x2)+1*sqrt(B17*(X17*x1+X27*x2)^2+C17)A18*(X18*x1+X28*x2)+1*sqrt(B18*(X18*x1+X28*x2)^2+C18)A19*(X19*x1+X29*x2)+-1*sqrt(B19*(X19*x1+X29*x2)^2+C19)A110*(X110*x1+X210*x2)+1*sqrt(B110*(X110*x1+X210*x2)^2+C110),
 

 
A21*(X11*x1+X21*x2)+1*sqrt(B21*(X11*x1+X21*x2)^2+C21)A22*(X12*x1+X22*x2)+1*sqrt(B22*(X12*x1+X22*x2)^2+C22)A23*(X13*x1+X23*x2)+-1*sqrt(B23*(X13*x1+X23*x2)^2+C23)A24*(X14*x1+X24*x2)+-1*sqrt(B24*(X14*x1+X24*x2)^2+C24)A25*(X15*x1+X25*x2)+1*sqrt(B25*(X15*x1+X25*x2)^2+C25)A26*(X16*x1+X26*x2)+1*sqrt(B26*(X16*x1+X26*x2)^2+C26)A27*(X17*x1+X27*x2)+1*sqrt(B27*(X17*x1+X27*x2)^2+C27)A28*(X18*x1+X28*x2)+1*sqrt(B28*(X18*x1+X28*x2)^2+C28)A29*(X19*x1+X29*x2)+-1*sqrt(B29*(X19*x1+X29*x2)^2+C29)A210*(X110*x1+X210*x2)+1*sqrt(B210*(X110*x1+X210*x2)^2+C210))
 








  fct = c(expression(2*x1^3-7*x2^2-9), expression(x1^2-x2^3+1))
  fct
expression(2 * x1^3 - 7 * x2^2 - 9, x1^2 - x2^3 + 1)
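
One possible way to get unquoted expressions from character strings is parse(text = ...), since as.expression() applied to a string simply wraps the string. A minimal sketch using the two short formulas above stored as strings; it assumes each string is syntactically valid R on its own:

```r
cum1 <- "2*x1^3-7*x2^2-9"
cum2 <- "x1^2-x2^3+1"

# parse() turns each string into a language object instead of keeping
# it as a quoted character constant.
fct <- parse(text = c(cum1, cum2))

# The elements can now be evaluated like hand-written expressions.
v1 <- eval(fct[[1]], list(x1 = 1, x2 = 1))
v2 <- eval(fct[[2]], list(x1 = 1, x2 = 1))
c(v1, v2)
```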



Re: [R] understanding print.summary.lm and perhaps print/show in general

2007-03-09 Thread Paul Bailey
Petr,

Thanks, you set me on the right path. It turns out that the behavior that 
surprises me is this: changing $coef isn't the same as changing 
$coefficients. The first changes the value, but the second changes the value 
and makes the print/show method change its output from when the summary.lm 
was created. I think the sample code below highlights this behavior nicely.
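
A likely explanation, offered as a sketch rather than a definitive account: reading with $ allows partial matching of element names, while assigning with $ needs the exact name. Assigning through $coef therefore appears to add a new element called "coef" and leave "coefficients", which the print method uses, untouched.

```r
lma <- lm(dist ~ speed, data = cars)
suma <- summary(lma)

# Reading suma$coef partially matches "coefficients", but the assignment
# creates a separate element named "coef".
colnames(suma$coef) <- LETTERS[1:4]

"coef" %in% names(suma)        # a new element now exists
colnames(suma$coef)            # the new element has the new names
colnames(suma$coefficients)    # the original element is unchanged
```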

Why would you want this behavior?

Cheers,
Paul

 R code:
lma <- lm(dist ~ speed, data=cars)
suma <- summary(lma)
colnames(suma$coef) <- c(LETTERS[1:4])
dimnames(suma$coef) # after setting colnames, dimnames of coefficients variable is set
suma # but printing is still the old print
dimnames(suma$coefficients) <- list(names(suma$coefficients), c(LETTERS[1:4]))
dimnames(suma$coef) # no change in dimnames from before
suma # but the summary output is now refreshed!
##
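
The difference described above is consistent with how `$` treats names on a list: reading with `$` allows partial matching, while replacement with `$<-` matches exactly. The sketch below is only an illustration of that rule on the same `cars` model; it is not taken from the thread.

```r
lma  <- lm(dist ~ speed, data = cars)
suma <- summary(lma)

# Reading: no element is called "coef" yet, so `$` partially matches
# "coefficients" and returns that matrix.
identical(suma$coef, suma$coefficients)

# Replacing: `$<-` matches names exactly, so this creates a NEW element
# called "coef" and leaves "coefficients" (which the print method uses)
# untouched.
colnames(suma$coef) <- LETTERS[1:4]

"coef" %in% names(suma)        # a separate element now exists
colnames(suma$coef)            # the new headings
colnames(suma$coefficients)    # still the original headings
```

Assigning through the full name, `suma$coefficients`, avoids the accidental copy.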


Another solution is to look into the code of summary.lm a few lines 
above where the (dim)names are assigned. Based on this, you may try

lma <- lm(dist ~ speed, data=cars)
suma <- summary(lma)
colnames(suma$coef) <- c(LETTERS[1:4])
printCoefmat(suma$coef) # prints what I expect
suma

dimnames(suma$coefficients) <- list(names(suma$coefficients),
c(LETTERS[1:4]))
suma

You might also find reading the chapter on generic functions in the 
R-lang (R language definition) manual useful.
Petr




Re: [R] dendrogram - got it , just need to label :)

2007-03-09 Thread Steven McKinney

Here is one example of labeling nodes,
borrowing code from the help page for
the dendrapply() function.

local({
  edgeLab <<- function(n) {
      if(!is.leaf(n)) {
        a <- attributes(n)
        i <<- i+1
        attr(n, "edgetext") <-
            format(i)
      }
      n
  }
  i <- 0
 })
dL <- dendrapply(as.dendrogram(hclust(dist(iris[, 1:4]), method = "single")),
                 edgeLab)
plot(dL)


This labels the edges above the nodes.

Martin Maechler and Robert Gentleman are developing the
dendrogram objects suite of functions.
As I have had to label nodes in S-PLUS, I'd like to put
in a request for a few more control parameters for
edge/internal node labeling control:

 - Allow the label without the polygon surrounding it.
   The polygon can obliterate too much of the dendrogram
   for larger sample sizes.  Perhaps an edgePar polygon
   plot logical p.plot taking values TRUE (default) and
   FALSE to omit the polygon.
 - Allow the label to appear near the node at the base
   of the edge.  Perhaps an edgePar text location parameter
   t.pos taking values in (0.0, 1.0) where 0.5 is in the
   middle of the edge (the default) and 1.0 is at the base 
   of the edge.

Since clusters are not always identified by 'cutting'
the dendrogram (e.g. in the iris single linkage
dendrogram plot we want to identify internal nodes that
are large runts or whose vertical edge lengths are
considerably longer than average) it is useful to be able
to identify nodes deeper in the tree.  This is aided
by having access to internal node labels/names and being
able to extract internal nodes by those labels/names.


Best

Steven McKinney

Statistician
Molecular Oncology and Breast Cancer Program
British Columbia Cancer Research Centre

email: [EMAIL PROTECTED]

tel: 604-675-8000 x7561

BCCRC
Molecular Oncology
675 West 10th Ave, Floor 4
Vancouver B.C. 
V5Z 1L3
Canada




-Original Message-
From: [EMAIL PROTECTED] on behalf of bunny , lautloscrew.com
Sent: Fri 3/9/2007 2:02 PM
To: R-help@stat.math.ethz.ch
Subject: [R] dendrogram - got it , just need to label :)
 
Hi all, Hi Gavin,

Thanks for your help. I finally found out what I want to do and how to
fix it.
I just needed some more levels; my cut level was too small...

Two questions remain:

a) Can I somehow scale the twigs after cutting?
b) How can I label the nodes, and how do I label which one is which?

Thanks!

-m.



Re: [R] long character string problem

2007-03-09 Thread Stephen Tucker
I think you are looking for
fct <- c(parse(text=cum1), parse(text=cum2))

although you need to include operators before your A coefficients (for
example,

...C11)A12*...
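
A small sketch of the difference, using two short strings (s1 and s2 are illustrative stand-ins for cum1 and cum2, not the poster's data): as.expression() on a character string only wraps the string, whereas parse(text=) turns it into code that can be evaluated. A string with a missing operator fails to parse at all.

```r
s1 <- "2*x1^3 - 7*x2^2 - 9"
s2 <- "x1^2 - x2^3 + 1"

# as.expression() keeps the strings as character constants (hence the quotes).
quoted <- c(as.expression(s1), as.expression(s2))
# parse(text = ) converts each string into an unevaluated call.
parsed <- c(parse(text = s1), parse(text = s2))

is.character(quoted[[1]])   # still a string inside the expression vector
is.call(parsed[[1]])        # a real call object
eval(parsed[[2]], list(x1 = 2, x2 = 1))   # 2^2 - 1^3 + 1

# A term such as "...C11)A12*..." has no operator between ")" and "A12",
# so parsing stops with a syntax error.
bad <- try(parse(text = "sqrt(C11)A12"), silent = TRUE)
inherits(bad, "try-error")
```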




--- [EMAIL PROTECTED] wrote:

 Hi All
 
 I am having 2 very long character strings (550chars) and I want to put them
 as 
 expressions together with c(). The problem is that I also get these 
 double-quotes, as seen below in 'fct'. How can I remove these
 double-quotes? I 
 tried as.name() but it did not work (because of size?). These are creating 
 trouble with subsequent programs, which I tested with strings that for some
 
 reason do not have these double quotes (see very bottom).
 
 
 
 
 
   cum1
 [1] 

A11*(X11*x1+X21*x2)+1*sqrt(B11*(X11*x1+X21*x2)^2+C11)A12*(X12*x1+X22*x2)+1*sqrt(B12*(X12*x1+X22*x2)^2+C12)A13*(X13*x1+X23*x2)+-1*sqrt(B13*(X13*x1+X23*x2)^2+C13)A14*(X14*x1+X24*x2)+-1*sqrt(B14*(X14*x1+X24*x2)^2+C14)A15*(X15*x1+X25*x2)+1*sqrt(B15*(X15*x1+X25*x2)^2+C15)A16*(X16*x1+X26*x2)+1*sqrt(B16*(X16*x1+X26*x2)^2+C16)A17*(X17*x1+X27*x2)+1*sqrt(B17*(X17*x1+X27*x2)^2+C17)A18*(X18*x1+X28*x2)+1*sqrt(B18*(X18*x1+X28*x2)^2+C18)A19*(X19*x1+X29*x2)+-1*sqrt(B19*(X19*x1+X29*x2)^2+C19)A110*(X110*x1+X210*x2)+1*sqrt(B110*(X110*x1+X210*x2)^2+C110)
   cum2
 [1] 

A21*(X11*x1+X21*x2)+1*sqrt(B21*(X11*x1+X21*x2)^2+C21)A22*(X12*x1+X22*x2)+1*sqrt(B22*(X12*x1+X22*x2)^2+C22)A23*(X13*x1+X23*x2)+-1*sqrt(B23*(X13*x1+X23*x2)^2+C23)A24*(X14*x1+X24*x2)+-1*sqrt(B24*(X14*x1+X24*x2)^2+C24)A25*(X15*x1+X25*x2)+1*sqrt(B25*(X15*x1+X25*x2)^2+C25)A26*(X16*x1+X26*x2)+1*sqrt(B26*(X16*x1+X26*x2)^2+C26)A27*(X17*x1+X27*x2)+1*sqrt(B27*(X17*x1+X27*x2)^2+C27)A28*(X18*x1+X28*x2)+1*sqrt(B28*(X18*x1+X28*x2)^2+C28)A29*(X19*x1+X29*x2)+-1*sqrt(B29*(X19*x1+X29*x2)^2+C29)A210*(X110*x1+X210*x2)+1*sqrt(B210*(X110*x1+X210*x2)^2+C210)
   fct = c(as.expression(cum1), as.expression(cum2))
   fct

expression(A11*(X11*x1+X21*x2)+1*sqrt(B11*(X11*x1+X21*x2)^2+C11)A12*(X12*x1+X22*x2)+1*sqrt(B12*(X12*x1+X22*x2)^2+C12)A13*(X13*x1+X23*x2)+-1*sqrt(B13*(X13*x1+X23*x2)^2+C13)A14*(X14*x1+X24*x2)+-1*sqrt(B14*(X14*x1+X24*x2)^2+C14)A15*(X15*x1+X25*x2)+1*sqrt(B15*(X15*x1+X25*x2)^2+C15)A16*(X16*x1+X26*x2)+1*sqrt(B16*(X16*x1+X26*x2)^2+C16)A17*(X17*x1+X27*x2)+1*sqrt(B17*(X17*x1+X27*x2)^2+C17)A18*(X18*x1+X28*x2)+1*sqrt(B18*(X18*x1+X28*x2)^2+C18)A19*(X19*x1+X29*x2)+-1*sqrt(B19*(X19*x1+X29*x2)^2+C19)A110*(X110*x1+X210*x2)+1*sqrt(B110*(X110*x1+X210*x2)^2+C110),
 
 
  

A21*(X11*x1+X21*x2)+1*sqrt(B21*(X11*x1+X21*x2)^2+C21)A22*(X12*x1+X22*x2)+1*sqrt(B22*(X12*x1+X22*x2)^2+C22)A23*(X13*x1+X23*x2)+-1*sqrt(B23*(X13*x1+X23*x2)^2+C23)A24*(X14*x1+X24*x2)+-1*sqrt(B24*(X14*x1+X24*x2)^2+C24)A25*(X15*x1+X25*x2)+1*sqrt(B25*(X15*x1+X25*x2)^2+C25)A26*(X16*x1+X26*x2)+1*sqrt(B26*(X16*x1+X26*x2)^2+C26)A27*(X17*x1+X27*x2)+1*sqrt(B27*(X17*x1+X27*x2)^2+C27)A28*(X18*x1+X28*x2)+1*sqrt(B28*(X18*x1+X28*x2)^2+C28)A29*(X19*x1+X29*x2)+-1*sqrt(B29*(X19*x1+X29*x2)^2+C29)A210*(X110*x1+X210*x2)+1*sqrt(B210*(X110*x1+X210*x2)^2+C210))
  
 
 
 
 
 
 
 
 
   fct = c(expression(2*x1^3-7*x2^2-9), expression(x1^2-x2^3+1))
   fct
 expression(2 * x1^3 - 7 * x2^2 - 9, x1^2 - x2^3 + 1)
 
 




 



Re: [R] long character string problem

2007-03-09 Thread jim holtman
First of all, your expressions are not legal.  For example

sqrt(B11*(X11*x1+X21*x2)^2+C11)A12*(X12*x1+X22*x2)

should be

sqrt(B11*(X11*x1+X21*x2)^2+C11)*A12*(X12*x1+X22*x2)

Here is a rewrite of your first equation that seems to give the correct
results:

> x <- expression(A11*(X11*x1+X21*x2)+1*sqrt(B11*(X11*x1+X21*x2)^2+C11)*A12*
+ (X12*x1+X22*x2)+1*sqrt(B12*(X12*x1+X22*x2)^2+C12)*A13*(X13*x1+X23*x2)+
+ -1*sqrt(B13*(X13*x1+X23*x2)^2+C13)*A14*(X14*x1+X24*x2)+-1*sqrt(B14*(X14*x1+X24*x2)^2+
+ C14)*A15*(X15*x1+X25*x2)+1*sqrt(B15*(X15*x1+X25*x2)^2+C15)*A16*(X16*x1+X26*x2)+
+ 1*sqrt(B16*(X16*x1+X26*x2)^2+C16)*A17*(X17*x1+X27*x2)+1*sqrt(B17*(X17*x1+X27*x2)^2+C17)*A18*
+ (X18*x1+X28*x2)+1*sqrt(B18*(X18*x1+X28*x2)^2+C18)*A19*(X19*x1+X29*x2)+-1*
+ sqrt(B19*(X19*x1+X29*x2)^2+C19)*A110*(X110*x1+X210*x2)+1*sqrt(B110*(X110*x1+X210*x2)^2+C110))

> x
expression(A11 * (X11 * x1 + X21 * x2) + 1 * sqrt(B11 * (X11 *
x1 + X21 * x2)^2 + C11) * A12 * (X12 * x1 + X22 * x2) + 1 *
sqrt(B12 * (X12 * x1 + X22 * x2)^2 + C12) * A13 * (X13 *
x1 + X23 * x2) + -1 * sqrt(B13 * (X13 * x1 + X23 * x2)^2 +
C13) * A14 * (X14 * x1 + X24 * x2) + -1 * sqrt(B14 * (X14 *
x1 + X24 * x2)^2 + C14) * A15 * (X15 * x1 + X25 * x2) + 1 *
sqrt(B15 * (X15 * x1 + X25 * x2)^2 + C15) * A16 * (X16 *
x1 + X26 * x2) + 1 * sqrt(B16 * (X16 * x1 + X26 * x2)^2 +
C16) * A17 * (X17 * x1 + X27 * x2) + 1 * sqrt(B17 * (X17 *
x1 + X27 * x2)^2 + C17) * A18 * (X18 * x1 + X28 * x2) + 1 *
sqrt(B18 * (X18 * x1 + X28 * x2)^2 + C18) * A19 * (X19 *
x1 + X29 * x2) + -1 * sqrt(B19 * (X19 * x1 + X29 * x2)^2 +
C19) * A110 * (X110 * x1 + X210 * x2) + 1 * sqrt(B110 * (X110 *
x1 + X210 * x2)^2 + C110))








-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to 

Re: [R] use nnet

2007-03-09 Thread Aimin Yan
Thank you very much.
I have another question about nnet:
if I set size=0 and skip=TRUE,
then the network has just an input layer and an output layer.
Is this also called a perceptron network?

Thanks,

Aimin Yan


At 12:39 PM 3/9/2007, Wensui Liu wrote:
AM,
Sorry, please ignore the top box in the code. It is not actually a
cross-validation but just a simple split-sample validation.
Sorry for the confusion.

On 3/9/07, Wensui Liu [EMAIL PROTECTED] wrote:
AM,
I have a piece of junk on my blog. Here it is.
#################################################
# USE CROSS-VALIDATION TO DO A GRID-SEARCH FOR  #
# THE OPTIMAL SETTINGS (WEIGHT DECAY AND NUMBER #
# OF HIDDEN UNITS) OF NEURAL NETS               #
#################################################

library(nnet);
library(MASS);
data(Boston);
X <- I(as.matrix(Boston[-14]));
# STANDARDIZE PREDICTORS
st.X <- scale(X);
Y <- I(as.matrix(Boston[14]));
boston <- data.frame(X = st.X, Y);

# DIVIDE DATA INTO TESTING AND TRAINING SETS
set.seed(2005);
test.rows <- sample(1:nrow(boston), 100);
test.set <- boston[test.rows, ];
train.set <- boston[-test.rows, ];

# INITIATE A NULL TABLE
sse.table <- NULL;

# SEARCH FOR OPTIMAL WEIGHT DECAY
# RANGE OF WEIGHT DECAYS SUGGESTED BY B. RIPLEY
for (w in c(0.0001, 0.001, 0.01))
{
   # SEARCH FOR OPTIMAL NUMBER OF HIDDEN UNITS
   for (n in 1:10)
   {
     # INITIATE A NULL VECTOR
     sse <- NULL;
     # FOR EACH SETTING, RUN NEURAL NET MULTIPLE TIMES
     for (i in 1:10)
     {
       # INITIATE THE RANDOM STATE FOR EACH NET
       set.seed(i);
       # TRAIN NEURAL NETS
       net <- nnet(Y~X, size = n, data = train.set, rang = 0.1,
                   linout = TRUE, maxit = 1, decay = w,
                   skip = FALSE, trace = FALSE);
       # CALCULATE SSE FOR TESTING SET
       test.sse <- sum((test.set$Y - predict(net, test.set))^2);
       # APPEND EACH SSE TO A VECTOR
       if (i == 1) sse <- test.sse else sse <- rbind(sse, test.sse);
     }
     # APPEND AVERAGED SSE WITH RELATED PARAMETERS TO A TABLE
     sse.table <- rbind(sse.table, c(WT = w, UNIT = n, SSE = mean(sse)));
   }
}
# PRINT OUT THE RESULT
print(sse.table);

http://statcompute.spaces.live.com/Blog/cns!39C8032DBD1321B7!290.entry
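
The loop above stores one averaged SSE per setting. If one row per run is wanted instead, as in the table asked for in the original question, a grid built with expand.grid() is one way to get it. This is only a sketch under assumptions: the data (three predictors of mpg from the built-in mtcars set), the grid values, and the names grid, train.set and test.set are illustrative and not from the thread.

```r
library(nnet)

# Illustrative data: standardized columns from the built-in mtcars set.
dat <- as.data.frame(scale(mtcars[, c("mpg", "wt", "hp", "disp")]))
set.seed(2005)
test.rows <- sample(nrow(dat), 10)
train.set <- dat[-test.rows, ]
test.set  <- dat[test.rows, ]

# One row per (decay, size, run) combination.
grid <- expand.grid(run = 1:3, size = c(1, 3), decay = c(0.001, 0.01))
grid$maxit <- 200
grid$test_error <- NA_real_

for (r in seq_len(nrow(grid))) {
  set.seed(grid$run[r])   # a different random start for each run
  net <- nnet(mpg ~ wt + hp + disp, data = train.set,
              size = grid$size[r], decay = grid$decay[r],
              maxit = grid$maxit[r], linout = TRUE, trace = FALSE)
  # Sum of squared errors on the held-out rows.
  grid$test_error[r] <- sum((test.set$mpg - predict(net, test.set))^2)
}

grid[, c("decay", "size", "maxit", "run", "test_error")]
```

Averaging afterwards is then a one-liner, for example with aggregate(test_error ~ decay + size, data = grid, FUN = mean).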


On 3/9/07, Aimin Yan [EMAIL PROTECTED] wrote:
  I want to tune the weight decay and the number of hidden units for nnet
  with a loop like
  for(decay)
  {
    for(number of unit)
    {
      for(#run)
      {
        model <- nnet()
        test.error <-
      }
    }
  }
 
  For example:
  I set decay=0.1, size=3, maxit=200; for this setting I run 10 times and
  calculate the test error.
 
  After that I want to get a matrix like this:
 
  decay  size  maxit  #run  test_error
  0.1    3     200    1     1.2
  0.1    3     200    2     1.1
  0.1    3     200    3     1.0
  0.1    3     200    4     3.4
  0.1    3     200    5     ..
  0.1    3     200    6     ..
  0.1    3     200    7     ..
  0.1    3     200    8     ..
  0.1    3     200    9     ..
  0.1    3     200    10    ..
  0.2    3     200    1     1.2
  0.2    3     200    2     1.1
  0.2    3     200    3     1.0
  0.2    3     200    4     3.4
  0.2    3     200    5     ..
  0.2    3     200    6     ..
  0.2    3     200    7     ..
  0.2    3     200    8     ..
  0.2    3     200    9     ..
  0.2    3     200    10    ..
 
  I am not sure if this is the correct way to do this.
  Has anyone tuned these parameters like this before?
  Thanks,
 
  Aimin
 
 


--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)


--
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



[R] read a irregular text file data into dataframe()

2007-03-09 Thread j.joshua thomas
I am using R 2.4.1 to read a text file that has the data structure shown
below.

When I read the file into R using

tData <- read.table("c:\\test.txt")

it gives an error saying the data set has irregular columns.
However, I need to use data of the type below.

Is there any alternative in R?

~

0010 0028 0061 0088
0010 0042 0084
0004 0010 0055
0010 0018 0040 0042
0010 0046 0059
0010 0016 0042 0055
0010 0012 0018 0054
0010 0034 0042 0102
0081
0001 0076 0085
0080 0086
0017 0032 0081
0004 0010 0055
0010 0042 0061 0080
0010 0017 0078 0084
0006 0010 0040 0042
0075 0080
0005 0028 0032
0006 0010 0040 0061
-- 
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student,
University Sains Malaysia



Re: [R] use nnet

2007-03-09 Thread Wensui Liu
no, it is called regression. ^_^.
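
To make the point concrete: with size=0 and skip=TRUE the only weights are one bias and one weight per input, and with linout=TRUE the fit is a least-squares linear model (with the default logistic output it would be a logistic regression instead). The sketch below is illustrative only; the mtcars columns and the names dat, net and fit are assumptions, not part of the thread.

```r
library(nnet)

# Illustrative data: standardized columns from the built-in mtcars set.
dat <- as.data.frame(scale(mtcars[, c("mpg", "wt", "hp")]))

set.seed(1)
# No hidden units; inputs are connected straight to the linear output.
net <- nnet(mpg ~ wt + hp, data = dat, size = 0, skip = TRUE,
            linout = TRUE, trace = FALSE)
fit <- lm(mpg ~ wt + hp, data = dat)

length(net$wts)   # one bias plus one weight per input
net$wts           # compare by eye with the least-squares coefficients
coef(fit)
```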

On 3/9/07, Aimin Yan [EMAIL PROTECTED] wrote:
 thank you very much.
 I have a another question about nnet
 if I set size=0, and skip=TRUE.
 Then this network has just input layer and out layer.
 Is this also called perceptron network?

 thanks,

 Aimin Yan



-- 
WenSui Liu
A lousy statistician who happens to know a little programming
(http://spaces.msn.com/statcompute/blog)



Re: [R] read a irregular text file data into dataframe()

2007-03-09 Thread Stephen Tucker
I don't know of any canned function to do this but you can write your own
function (see contents below) to:

(1) open file connection
(2) read number of fields
(3) create empty matrix with the number of rows and maximum number of columns
of your data
(4) rewind to beginning of file
(5) scan line-by-line and fill the matrix
(6) close the file connection
(7) convert matrix to data frame
(8) use the function type.convert to automatically convert numerical columns
to mode numeric (since scan(), as I've specified it, reads in everything as
mode character, which converts the holding matrix's mode to character from
its default of logical).

the function below will work for your example data set, but to make it more
general, you can add arguments like 'what' to scan(), 'sep' to both
count.fields() and scan(); depending on whether you have column names you can
modify it accordingly as well.

# call function with this line
df <- read.irregular("c:\\test.txt")

# this is the function

read.irregular <- function(filenm) {
  fileID <- file(filenm, open="rt")
  nFields <- count.fields(fileID)
  mat <- matrix(nrow=length(nFields), ncol=max(nFields))
  invisible(seek(fileID, where=0, origin="start", rw="read"))
  for(i in 1:nrow(mat) ) {
    mat[i,1:nFields[i]] <- scan(fileID, what="", nlines=1, quiet=TRUE)
  }
  close(fileID)
  df <- as.data.frame(mat)
  df[] <- lapply(df, type.convert, as.is=TRUE)
  return(df)
}

Hope this helps.

--- j.joshua thomas [EMAIL PROTECTED] wrote:

 I am using R2.4.1 calling a text file contains the following data
 structure:
 
 when i call the file into R using
 
 tData-read.table(c:\\test.txt)
 
 it gave me Error saying, irregular column in the data set
 however i need to use the below type of data
 
 Is there any alternative in R?
 
 ~
 
 0010 0028 0061 0088
 0010 0042 0084
 0004 0010 0055
 0010 0018 0040 0042
 0010 0046 0059
 0010 0016 0042 0055
 0010 0012 0018 0054
 0010 0034 0042 0102
 0081
 0001 0076 0085
 0080 0086
 0017 0032 0081
 0004 0010 0055
 0010 0042 0061 0080
 0010 0017 0078 0084
 0006 0010 0040 0042
 0075 0080
 0005 0028 0032
 0006 0010 0040 0061
 -- 
 Lecturer J. Joshua Thomas
 KDU College Penang Campus
 Research Student,
 University Sains Malaysia
 


Re: [R] read a irregular text file data into dataframe()

2007-03-09 Thread Petr Klasterecky
read.table("c:\\test.txt", fill=TRUE)

Petr
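
A self-contained sketch of what fill=TRUE does, using a few lines of the sample data inlined as text (the names txt and x are illustrative). Adding colClasses="character" is an extra assumption on my part: it keeps the leading zeros, which would otherwise be lost when the columns are read as integers.

```r
# A few lines of the sample data, inlined so no file on disk is needed.
txt <- "0010 0028 0061 0088
0010 0042 0084
0081"

# fill = TRUE pads the short rows; colClasses = "character" keeps "0010" as is.
x <- read.table(text = txt, fill = TRUE, colClasses = "character")
dim(x)
x
```

One caveat: read.table() works out the number of columns from the first five lines of input, so if a longer row appears only later in the file, col.names should be supplied to set the width explicitly.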

j.joshua thomas napsal(a):
 I am using R2.4.1 calling a text file contains the following data structure:
 
 when i call the file into R using
 
 tData-read.table(c:\\test.txt)
 
 it gave me Error saying, irregular column in the data set
 however i need to use the below type of data
 
 Is there any alternative in R?
 
 ~
 
 0010 0028 0061 0088
 0010 0042 0084
 0004 0010 0055
 0010 0018 0040 0042
 0010 0046 0059
 0010 0016 0042 0055
 0010 0012 0018 0054
 0010 0034 0042 0102
 0081
 0001 0076 0085
 0080 0086
 0017 0032 0081
 0004 0010 0055
 0010 0042 0061 0080
 0010 0017 0078 0084
 0006 0010 0040 0042
 0075 0080
 0005 0028 0032
 0006 0010 0040 0061

-- 
Petr Klasterecky
Dept. of Probability and Statistics
Charles University in Prague
Czech Republic



Re: [R] Using large datasets: can I overload the subscript operator?

2007-03-09 Thread Roger Bivand
On Sat, 10 Mar 2007, Maciej Radziejewski wrote:

 Hello,
 

The http://www.met.rdg.ac.uk/cag/rclim/ site may have some useful leads. 
In addition, you'll find ideas in two packages created by Tim Keitt, 
rgdal, and Rdbi+RdbiPgSQL (now on Bioconductor). 

 I do some computations on datasets that come from climate models. These data
 are huge arrays, significantly larger than typically available RAM, so they
 have to be accessed row-by-row, or rather slice-by slice, depending on the
 task. I would like to make an R package to easily access such datasets
 within R. The C++ backend is ready and being used under Windows/.Net/Visual
 Basic, but I have yet to learn the specifics of R programming to make a good
 R interface.

Look at the Matrix package for examples - you may need finalizers to tidy 
up memory allocation - see examples in rgdal. The key thing will be 
thinking through how to implement the R objects as classes, probably not 
simply reflecting the C++ classes. Classes are covered in the Green Book 
(Chambers 1998) and in Venables & Ripley (2000), S Programming.

 
 I think it should be possible to make a package (call it slice) that could
 be used like this:
 
 library (slice)
 dataset <- load.virtualarray ("dataset_definition.xml")
 ordinaryvector <- dataset [ , 2, 3] # Load a portion of the data from disk and extract it
 
 In the above, "dataset" is an object that holds a definition of a
 3-dimensional large dataset, and "ordinaryvector" is an ordinary R vector.
 The subscripting operator fetches necessary data from disk and extracts a
 required slice, taking care of caching and other technical details. So, my
 questions are:
 
 Has anyone ever made a similar extension, with virtual (lazy) arrays?
 
 Can the subscript operator be overloaded like that in R? (I know it can be in
 S, at least for vectors.)
 

Yes, there are many examples, see the Matrix package for some that use 
new-style classes (in language issues like this, R is S, the differences 
are in scoping).
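
As a minimal illustration of overloading the subscript operator, here is an S3 sketch (the Matrix package uses new-style S4 classes instead). Everything in it is hypothetical: the class name virtualarray, the constructor, and the fetch callback stand in for whatever a real backend would provide, and an in-memory array plays the role of the data on disk.

```r
# Hypothetical constructor: stores the dimensions and a function that
# materializes a requested block.
virtualarray <- function(dims, fetch) {
  structure(list(dims = dims, fetch = fetch), class = "virtualarray")
}

# S3 method for `[`: a missing index means "the whole extent of that dimension".
`[.virtualarray` <- function(x, i, j, k, drop = TRUE) {
  obj <- unclass(x)   # avoid re-dispatching on `[` while looking inside
  if (missing(i)) i <- seq_len(obj$dims[1])
  if (missing(j)) j <- seq_len(obj$dims[2])
  if (missing(k)) k <- seq_len(obj$dims[3])
  out <- obj$fetch(i, j, k)   # only this block is brought into memory
  if (drop) drop(out) else out
}

# Stand-in backend: an ordinary array instead of a file on disk.
backing <- array(seq_len(24), dim = c(4, 3, 2))
va <- virtualarray(dim(backing),
                   function(i, j, k) backing[i, j, k, drop = FALSE])

v  <- va[, 2, 2]                  # ordinary vector
a3 <- va[, 2, 2, drop = FALSE]    # keeps all three dimensions
```

A real implementation would also need to validate the indices and handle negative or logical subscripts, which this sketch ignores.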

 And a tough one: is it possible to make an expression like "[1]" (without
 the quotes) meaningful in R? At the moment it results in a syntax error. I
 would like to make it return an object of a special class that gets
 interpreted when subscripting my virtual array as "drop this dimension",
 like this:

Most likely not in this context, because [ in this context will not be
what you want. But if your [.dataset method is careful about examining
its arguments, you ought to be able to get the result you want. You'll
likely learn a good deal from looking for example at the code in the
Matrix package.

 
 dataset [, 2, 3, drop = F]  # Return a 3-dimensional array
 dataset [, [2], 3, drop = F]  # Return a 2-dimensional array
 dataset [, [2], [3], drop = F]  # Return a 1-dimensional array, like dataset [, 2, 3]
 
 Thanks in advance for any help,
 
 Maciej.
 
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]
