Re: [R] normality testing with nortest

2006-05-22 Thread Raymond Wan

Hi Rolf,

On Sun, 21 May 2006, Rolf Turner wrote:
 I don't know from the nortest package, but it should ***always***
 be the case that you test hypotheses
...
 If the nortest package does it differently (and I don't really see
 how it possibly could!) then it is confusingly designed.  I rather
 suspect that its design is just fine, and that it does what it should
 do.

Thank you for your reply.  As I had expected, it was my knowledge 
of the test which was inadequate and it had nothing to do with the 
package.  No doubt it works fine!  Your explanation sounds very reasonable 
to me...thanks again!

Ray

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] odd feature

2006-05-22 Thread Martin Maechler
 Gabor == Gabor Grothendieck [EMAIL PROTECTED]
 on Sun, 21 May 2006 09:47:07 -0400 writes:

Gabor If you know that test is a scalar

Gabor result <- if (test) a else b

Gabor will do it.  

Yes, indeed!
IMO,  ifelse(test, a, b) is much OVERused, whereas
  if(test) a else b  is much UNDERused.

From some e-mail postings, and even some documents (even printed
books?), I get the impression that too many people think that
 ifelse(.,.,.)  is to be used as an expression / function and
  if(.) . else .  only for program flow control.
This leads to quite suboptimal code, and I personally use
if(.) . else .  __as an expression__ much more frequently than ifelse(.,.,.)
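
A small illustration of the difference (my own example, not from the original posts):

```r
x <- 1:6
test <- TRUE

## if(test) a else b is itself an expression: it returns one branch
## whole, so the full vector comes back.
r1 <- if (test) x else -x      # 1 2 3 4 5 6

## ifelse() is vectorized over its *first* argument: a length-1 test
## gives a length-1 result, taken from the first element of x.
r2 <- ifelse(test, x, -x)      # 1
```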
 
Martin Maechler, ETH Zurich.

Gabor Here is another approach:

Gabor as.vector(test * ts(a) + (!test) * ts(b))



Gabor On 5/21/06, ivo welch [EMAIL PROTECTED] wrote:
 Dear R wizards:
 
 I just got stung by the ifelse() feature.
 
  a <- 10:15
  b <- 20:300
  test <- 1
  ifelse(test,a,b)
 [1] 10
 
 I had not realized that this was the default behavior---I had expected
 10:15.  mea culpa.  however, I wonder whether it would make sense to
 replace ifelse with a different semantic, where if test is a single
 scalar, it means what a stupid user like me would imagine.
 
 Aside, I like the flexibility of R, but I am not thrilled by all the
 recycling rules.  I mean either a scalar or a vector of
 equal/appropriate dimension.  I never want a recycle of a smaller
 vector.  (I do often use a recycle of a scalar.)
 
 regards,
 
 /iaw
 
 




Re: [R] normality testing with nortest

2006-05-22 Thread Uwe Ligges
Rolf Turner wrote:

 I don't know from the nortest package, but it should ***always***
 be the case that you test hypotheses
 
   H_0:  The data have a normal distribution.
 vs.
   H_a:  The data do not have a normal distribution.
 
 So if you get a p-value < 0.05 you can say that
 
   ***there is evidence***
 
 (at the 0.05 significance level) that the data are not from a
 normal distribution.
 
 If the nortest package does it differently (and I don't really see
 how it possibly could!) then it is confusingly designed.  I rather
 suspect that its design is just fine, and that it does what it should
 do.



I suspect so as well.
If you think something is wrong, please contact the package maintainer 
(CCing; he's not reading R-help posts).

Uwe Ligges




   cheers,
 
   Rolf Turner
   [EMAIL PROTECTED]
 
 Original message:
 
 
   I have a question regarding normality testing with the nortest 
package.  I have to admit, my question is so general that it might be better 
suited to a newsgroup such as sci.math.  However, just in case it is 
implemented differently, I was hoping someone who has used it could help me 
out.

   I just want to double check that for all of the tests, the null 
hypothesis is whether or not the input distribution *differs* from the 
normal distribution.  So, if you get a p-value less than (say) 0.05, you 
can reject the null hypothesis at a 95% confidence level and say that the 
input distribution is *not* normal.

  So these tests check/confirm whether a distribution is not 
normal...as opposed to confirming that it is normal.  Does this sound 
about right?

Ray
 
 



Re: [R] How can you buy R?

2006-05-22 Thread Berwin A Turlach
G'day Deepayan,

 DS == Deepayan Sarkar [EMAIL PROTECTED] writes:

DS let me first summarize this sub-discussion so far: [...]
Sounds like a perfect summary. :)

DS As far as I can tell (and please correct me if I'm wrong),
DS your contention is that by linking a GPL component P with a
DS non-GPL component Q, a user may lose the rights granted to him
DS by the GPL to the GPL-d part P.
I don't think that I said this explicitly, but I can see how what I
said can be interpreted in such a way.  The point is rather that at
the moment component P and Q are linked (and I perhaps carelessly
assumed that the user was doing this) a product is produced that
should be completely under the GPL.  Obviously it is not.  Hence, the
status of this linked product, and whether it can be used by anybody,
is an open question.  And the answer is probably given by the
copyright laws (and others?) of the country in which the linking happens.


DS Let's assume this is true. All that means is that the user has
DS lost his rights to copy, modify and redistribute P. He does
DS NOT lose the rights to use P.
I agree with you on this.  Probably I was too terse in my writing and
produced misunderstandings.  I never intended to say something about
the rights that the user has with regards to P alone.  My comments
were directed towards the linked product P+Q.  In particular, it is
not clear to me whether one can execute such a product without
violating copyright laws. 

Thus, the last sentence of mine that you quoted:

 My understanding was that in that moment a product was
 created that would have to be wholly under the GPL, so the
 user was violating the GPL and lost the right to use your
 package.

should perhaps be better formulated as:

 My understanding was that in that moment a product was
 created that would have to be wholly under the GPL, so the
 person who did the linking was violating the GPL and it is
 not clear whether anyone is allowed to use the linked product.

 A simple google search would have confirmed to you that the
 linux kernel is developed under the GPL.  [...]
DS Linux is under GPL2, and not GPL2 or later. [...]
Oh, I wasn't aware that they did not use the typical(?) or later
phrase.  Thanks for pointing this out and I note that we both agree
that the linux kernel is definitely not under LGPL.

DS In any case, this is the complete opposite of the situation we
DS were originally discussing: [...]
 [...]  So I have to wonder to what you are referring to as the
 situation we were originally discussing.

DS I was referring to your question (quoted above) about use of
DS GPL'd code in S-PLUS, which is what I was replying to. As I
DS was saying, that situation is the opposite of the one in your
DS example.
O.k., sorry, I used a different scale with the time point of origin at
Spencer's e-mail and my answer to that mail.  Now I am with you.

Agreed, the situation is the opposite, but that was the example
discussed in gnu.misc.discuss.  From an abstract point of view the
situations are the same.  You make someone else link a GPL product
with a non-GPL product creating a derived work, the derived work would
have to be under the GPL but is not.  Hence, the derived work has a
legal status that is in limbo and it is not clear whether anyone has
the right to use it.

The discussions on gnu.misc.discuss were centred on cases where people
provided non-GPL binaries, asked their users to download GPL software
from elsewhere, compile and link everything together and then use the
combined product.  

As you say it is the exact opposite (and hence mirror image) from the
situation that I was worried about, where I provide GPL software and
ask others to compile and link it with non-GPL binaries and then use
the combined product.

If one scenario is not on, I don't see how the other one could be
acceptable either.  Except that in the first scenario there is a clear
intent of circumventing the GPL.  But I was not sure whether such kind
of intent makes any difference.  Thus, to avoid all these problems I
decided to rather use the LGPL since that licence definitely seemed to
allow both.  

Hope this clarifies some of my comments.

Cheers,

Berwin



Re: [R] nls fitting

2006-05-22 Thread Joerg van den Hoff
Lorenzo Isella wrote:
 Dear All,
 I may look ridiculous, but I am puzzled at the behavior of the nls with
 a fitting I am currently dealing with.
 My data are:
 
x N
 1   346.4102 145.428256
 2   447.2136 169.530634
 3   570.0877 144.081627
 4   721.1103 106.363316
 5   894.4272 130.390552
 6  1264.9111  36.727069
 7  1788.8544  52.848587
 8  2449.4897  25.128742
 9  3464.1016   7.531766
 10 4472.1360   8.827367
 11 6123.7244   6.600603
 12 8660.2540   4.083339
 
 I would like to fit N as a function of x according to a function
 depending on 9 parameters (A1,A2,A3,mu1,mu2,mu3,myvar1,myvar2,myvar3),
 namely
 N ~
 (log(10)*A1/sqrt(2*pi)/log(myvar1)*exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1))
 
 +log(10)*A2/sqrt(2*pi)/log(myvar2)*exp(-((log(x/mu2))^2)/2/log(myvar2)/log(myvar2))
 
 +log(10)*A3/sqrt(2*pi)/log(myvar3)*exp(-((log(x/mu3))^2)/2/log(myvar3)/log(myvar3)))
 
 (i.e. N is to be seen as a sum of three bells whose parameters I need
 to determine).
 
 
 So I tried:
 out <- nls(N ~ 
 (log(10)*A1/sqrt(2*pi)/log(myvar1)*exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1))
 
 +log(10)*A2/sqrt(2*pi)/log(myvar2)*exp(-((log(x/mu2))^2)/2/log(myvar2)/log(myvar2))
 
 +log(10)*A3/sqrt(2*pi)/log(myvar3)*exp(-((log(x/mu3))^2)/2/log(myvar3)/log(myvar3)))
  ,start=list(A1 = 85,
 A2=23,A3=4,mu1=430,mu2=1670,mu3=4900,myvar1=1.59,myvar2=1.5,myvar3=1.5  )
 ,algorithm = "port"
 ,control=list(maxiter=2,tol=1)
 ,lower=c(A1=0.1,A2=0.1,A3=0.1,mu1=0.1,mu2=0.1,mu3=0.1,myvar1=0.1,myvar2=0.1,myvar3=0.1)
 )
 
 getting the error message:
 Error in nls(N ~ (log(10) * A1/sqrt(2 * pi)/log(myvar1) *
 exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1)) +  :
 Convergence failure: singular convergence (7)
 
 
 I tried to adjust tol and maxiter, but unsuccessfully.
 If I try fitting N with only two bells, then nls works:
 
 out <- nls(N ~ 
 (log(10)*A1/sqrt(2*pi)/log(myvar1)*exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1))
 
 +log(10)*A2/sqrt(2*pi)/log(myvar2)*exp(-((log(x/mu2))^2)/2/log(myvar2)/log(myvar2))
 )
  ,start=list(A1 = 85, A2=23,mu1=430,mu2=1670,myvar1=1.59,myvar2=1.5  )
 ,algorithm = "port"
 ,control=list(maxiter=2,tol=1)
 ,lower=c(A1=0.1,A2=0.1,mu1=0.1,mu2=0.1,myvar1=0.1,myvar2=0.1)
 )
 
  out
 Nonlinear regression model
   model:  N ~ (log(10) * A1/sqrt(2 * pi)/log(myvar1) *
 exp(-((log(x/mu1))^2)/2/log(myvar1)/log(myvar1)) +  log(10) *
 A2/sqrt(2 * pi)/log(myvar2) *
 exp(-((log(x/mu2))^2)/2/log(myvar2)/log(myvar2)))
data:  parent.frame()
 A1 A2mu1mu2 myvar1 myvar2
  84.920085  40.889968 409.656404 933.081936   1.811560   2.389215
  residual sum-of-squares:  2394.876
 
 Any idea about how to get nls working with the whole model?
 I had better luck with the nls.lm package, but it does not allow me to
 introduce any constraints on my fitting parameters.
 I was also suggested to try other packages like optim to do the same
 fitting, but I am a bit unsure about how to set up the problem.
 Any suggestions? BTW, I am working with R Version 2.2.1
 
 Lorenzo
 


apart from the fact that fitting 9 parameters to 12 points quite 
generally will not yield satisfactory results (at least the estimates 
will have huge uncertainties), total failure in your case seems obvious 
if you plot your data: it's not even obvious where the three peaks 
(means of the gaussians) should be: all below x=2000, or is there one 
peak at about x=4500 while one of the 'peaks' below x=2000 is spurious? 
if you can't decide, nls has problems too. moreover: how should 
reliable estimates of the standard deviations (widths) of the gaussians 
result if the peaks essentially consist of exactly one point each?

in short: I believe you either need many more data points or you must 
put in substantial a priori information (e.g. fix either the means or 
the standard deviations of the gaussians).
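
The point can be sketched on toy data (everything below is made up for illustration; it is not the original dataset): with enough points covering the peak, the same log-normal-bell form with only three free parameters fits cleanly with the "port" algorithm.

```r
set.seed(42)

## One bell of the same functional form as each term in the model above.
bell <- function(x, A, mu, s)
  log(10) * A / sqrt(2 * pi) / log(s) *
    exp(-(log(x / mu))^2 / (2 * log(s)^2))

## Simulated data: 60 points covering the peak, small noise.
x <- seq(200, 9000, length.out = 60)
N <- bell(x, A = 80, mu = 500, s = 1.8) + rnorm(60, sd = 0.5)

fit <- nls(N ~ bell(x, A, mu, s),
           start = list(A = 60, mu = 400, s = 1.5),
           algorithm = "port",
           lower = c(A = 0.1, mu = 0.1, s = 1.05))  # s > 1 keeps log(s) > 0
coef(fit)   # close to the true A = 80, mu = 500, s = 1.8
```

With 9 free parameters and essentially one data point per peak, as in the original problem, no choice of tol or maxiter can rescue the fit; reducing the number of free parameters (or adding data) is the fix.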



Re: [R] normality testing with nortest

2006-05-22 Thread Raymond Wan

On Mon, 22 May 2006, Uwe Ligges wrote:
 Rolf Turner wrote:
 If the nortest package does it differently (and I don't really see
 how it possibly could!) then it is confusingly designed.  I rather
 suspect that its design is just fine, and that it does what it should
 do.
 I suspect so as well.
 If you think something is wrong, please contact the package maintainer 
 (CCing; he's not reading R-help posts).

Ah, ok -- but in this case, it was clearly my misunderstanding, 
which is one reason why I never thought of writing to the package 
maintainer.  I have one of the books that the nortest documentation cites, 
but I was clearly reading it backwards or upside-down or something, as I 
missed several crucial points.

One thing that threw me off (and this is not really specific to 
nortest, which seems to be correct, but just my understanding) is that the 
p-value seems quite unstable.  For example:

 ad.test(rnorm(100,mean=5,sd=3))
...
A = 0.2382, p-value = 0.7767

 ad.test(rnorm(100,mean=5,sd=3))
...
A = 0.1846, p-value = 0.9059

 ad.test(rnorm(100,mean=5,sd=3))
...
A = 0.5138, p-value = 0.1887

I mistakenly had thought the p-values would be more stable since I 
am artificially creating a random normal distribution.  Is this expected 
for a normality test or is this an issue with how rnorm is producing 
random numbers?  I guess if I run it many times, I would find that I would 
get many large values for the p-value?

Ray



Re: [R] normality testing with nortest

2006-05-22 Thread Uwe Ligges
Raymond Wan wrote:

 
 On Mon, 22 May 2006, Uwe Ligges wrote:
 
 Rolf Turner wrote:

 If the nortest package does it differently (and I don't really see
 how it possibly could!) then it is confusingly designed.  I rather
 suspect that its design is just fine, and that it does what it should
 do.

 I suspect so as well.
 If you think something is wrong, please contact the package maintainer 
 (CCing; he's not reading R-help posts).
 
 
  Ah, ok -- but in this case, it was clearly my misunderstanding, which 
  is one reason why I never thought of writing to the package maintainer.  
  I have one of the books that the nortest documentation cites, but I was 
  clearly reading it backwards or upside-down or something, as I missed 
  several crucial points.
 
  One thing that threw me off (and this is not really specific to 
  nortest, which seems to be correct, but just my understanding) is that the 
  p-value seems quite unstable.  For example:
 
 ad.test(rnorm(100,mean=5,sd=3))
 
 ...
 A = 0.2382, p-value = 0.7767
 
 ad.test(rnorm(100,mean=5,sd=3))
 
 ...
 A = 0.1846, p-value = 0.9059
 
 ad.test(rnorm(100,mean=5,sd=3))
 
 ...
 A = 0.5138, p-value = 0.1887
 
 I mistakenly had thought the p-values would be more stable since I 
 am artificially creating a random normal distribution.  Is this expected 
 for a normality test or is this an issue with how rnorm is producing 
 random numbers?  I guess if I run it many times, I would find that I 
 would get many large values for the p-value?

Well, as many large values as small values, and 5% significant differences 
at the 5% level.

The following looks alright:

hist(replicate(1000,  ad.test(rnorm(100,mean=5,sd=3))$p.value))

 Ray




[R] How to reference R in papers?

2006-05-22 Thread j.joshua thomas
Dear List,

How do I reference R 2.3.0 in research papers?

JJ
---

-- 
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student,
University Sains Malaysia

[[alternative HTML version deleted]]



Re: [R] How to reference R in papers?

2006-05-22 Thread Christian Ritz
Hi JJ,

try the following function in R:

citation()
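
A short sketch of what that provides (toBibtex() is in the utils package; the nortest line is only an illustration and requires that package to be installed):

```r
citation()              # the recommended reference for R itself
toBibtex(citation())    # the same reference as a BibTeX entry

## Contributed packages have their own entries, e.g.:
## citation("nortest")
```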


Christian



Re: [R] Unreadable labels -- in x11() and derivative

2006-05-22 Thread Martin Maechler
 Maciej == Maciej Bliziński [EMAIL PROTECTED]
 on Sun, 21 May 2006 14:48:13 +0200 writes:

Maciej Playing around with examples from MASS4, I found a font problem in the
Maciej mosaicplot in R-2.3.0. It doesn't happen in other plots. Running this
Maciej example from MASS4, page 326...

Maciej library(MASS)
Maciej caith1 <- as.matrix(caith)
Maciej names(dimnames(caith1)) <- c("eyes", "hair")
Maciej mosaicplot(caith1, color = TRUE)

Maciej ...I get an image as attached. The column and row labels are unreadable.
Maciej It is true for both x11() and png() devices.

.

Maciej Looking at the demo(graphics), I can also recognize some unreadable
Maciej labels. Do you think it's only the matter of the X-Window system?

Maciej Installed fonts? 

Maciej Any hints appreciated.

Yes, it's a matter of the X font server of your X-window
system.  As  help(png)  clearly notes, png() itself also uses
the x11() internals.

For me, the labels are all perfectly readable for the above example,
even here (at work) where the font server still fails to provide
correct font sizes more often than I'd like -- e.g. when drawing graphs
with the Rgraphviz package {from Bioconductor}.

Martin Maechler, ETH Zurich



[R] How to contribute improved documentation {was .. lm.predict ..}

2006-05-22 Thread Martin Maechler
 Larry == Larry Howe [EMAIL PROTECTED]
 on Sat, 20 May 2006 15:10:09 -0400 writes:

  An optional data frame in which to look for variables with which to
  predict. means almost nothing to me. Specifically it doesn't translate
  to me, as an engineer and software developer, but not a statistician,
  that the names have to match, and furthermore if they don't match, R
  fails silently.
 
  Larry Howe
 
 I have had offline conversations not related to this thread by others
 who were also confused by this so I agree it would be worthwhile
 to clarify it.

Larry Thank you! If there is anything I can do to help get
Larry this documented / clarified, please let me know.

BTW, this has now become a topic that belongs a bit more on the
R-devel than the R-help mailing list, but I keep it here for
reference and in the hope it will be closed for now.

Maybe because of your question (or completely incidentally?)
yesterday, we had the topic
  Suggesting changes to HELP files?
on R-devel, see
   https://stat.ethz.ch/pipermail/r-devel/2006-May/037785.html
and its follow-up postings, notably Duncan Murdoch's
   https://stat.ethz.ch/pipermail/r-devel/2006-May/037787.html
I had written part of the following independently, and I would
recommend using it in conjunction with Duncan's.

Yes, you can help to get this documentation improved.
Ideally, and with the highest probability of success, you provide a
patch *) to the current documentation *source*, which for help pages
always means a file named *.Rd from the R sources - available online,
as a whole (see CRAN -> source), or - in this case more conveniently -
the complete file tree at https://svn.R-project.org/R/trunk/ ; the file
in question is https://svn.r-project.org/R/trunk/src/library/stats/man/predict.lm.Rd

Since providing a patch *) is hard for average R users,
for documentation changes we would also be content to just receive
the new version of the text you changed from the corresponding
*.Rd file.

What we do like much less is an edited version of some
*representation* of the help, e.g., any version you get by using
help() or browsing any forms of the manuals.

Martin Maechler, ETH Zurich


*) Providing a patch means:
- edit the file to be changed and keep the original;
- run 'diff -u oldfile newfile' in your shell (which must have
  'diff' in its path).
- E-mail (with a mail program that does NOT mangle lines and
  does not send HTML ..) the output of the above 'diff ..' to R-devel as a
  proposal for improving R.



[R] Matrix in 3D

2006-05-22 Thread robert-mcfadden
Dear R Users,
Is it possible to add another (third) index to a matrix (as in MATLAB)? For 
some analyses, e.g. finite mixture models, this is necessary. Simple example:

i <- 3
matrix[, , i] <- matrixA[, , i] %*% matrixB[, , i]

I would appreciate any help
Rob



[R] random generation of a factor

2006-05-22 Thread Christian Bieli
Hi everybody

Does anybody know a function that generates a factor of length N with K 
levels given the proportions c(p1, p2,... ,pk)? It would not be too hard 
to write it myself, but if there's a preexisting one...
I searched the documentation and the help archive, but could not find 
anything helpful.
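
Lacking a ready-made function, sample() gets there in one line; rfactor below is my own hypothetical wrapper, not an existing function:

```r
## Generate a factor of length n with the given levels drawn in the
## given proportions (prob is passed on to sample(), which normalizes it).
rfactor <- function(n, levels, prob)
  factor(sample(levels, size = n, replace = TRUE, prob = prob),
         levels = levels)

set.seed(1)
f <- rfactor(1000, levels = c("low", "mid", "high"), prob = c(0.5, 0.3, 0.2))
table(f) / length(f)   # roughly 0.50, 0.30, 0.20
```

If the proportions must be hit exactly rather than in expectation, build the vector with rep() and shuffle it with sample() instead.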

  version
 _ 
platform i386-pc-mingw32
arch i386  
os   mingw32   
system   i386, mingw32 
status 
major2 
minor2.0   
year 2005  
month10
day  06
svn rev  35749 
language R

Thanks a lot.
Christian



Re: [R] Matrix in 3D

2006-05-22 Thread Uwe Ligges
[EMAIL PROTECTED] wrote:

 Dear R Users,
 Is it possible to add another (third) index to matrix (as in MATLAB). For 
 some analysis e.g. finite mixture models is necessary. Simple example
 
 i <- 3
 matrix[, , i] <- matrixA[, , i] %*% matrixB[, , i]


See ?array
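
For example, a minimal sketch of what ?array describes (my own illustration, not from the original reply):

```r
set.seed(1)

## A 3D array is a matrix with a third index: dim = c(rows, cols, slices).
A <- array(rnorm(12), dim = c(2, 2, 3))
B <- array(rnorm(12), dim = c(2, 2, 3))
C <- array(0,         dim = c(2, 2, 3))

## Slice-wise matrix products, as in the MATLAB-style example above:
for (i in 1:3)
  C[, , i] <- A[, , i] %*% B[, , i]

dim(C)   # 2 2 3; each C[, , i] is an ordinary 2 x 2 matrix
```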

Uwe Ligges


 I would appreciate any help
 Rob
 


[R] how to vectorize a matrix

2006-05-22 Thread Lorenzo Bencivelli
The help documentation suggests using the command "stack". Either it does
not work or I did not understand how to use it.

I need to vectorize the columns of a matrix in a certain order; any
suggestion (or explanation of the command "stack")?

Thanks in advance. L

 

*

I believe in human reason,

and in the liberty and justice that spring from reason. (Sciascia)

*

 


[[alternative HTML version deleted]]


[R] SUSE 10.1 64 bit binary

2006-05-22 Thread Daniel Brewer
Hi,

I was just wondering if anyone had created a 64 bit binary for SUSE 10.1
 x86_64 yet.  Unfortunately, I upgraded before looking whether a package
was available and now cannot find a suitable package on CRAN.  If no one
has produced one yet, does anyone have any idea when one will be available?

Many Thanks

Daniel Brewer



Re: [R] Matrix in 3D

2006-05-22 Thread Robin Hankin
Hi

there are dozens of examples of this kind of thing
in the R-and-octave.txt file,  in the contributed docs section
of CRAN.  See the multidimensional arrays section, near the
bottom, for reproduction of all of matlab/octave's array
handling capabilities.

best wishes


Robin



On 22 May 2006, at 09:56, Uwe Ligges wrote:

 [EMAIL PROTECTED] wrote:

 Dear R Users,
 Is it possible to add another (third) index to matrix (as in  
 MATLAB). For some analysis e.g. finite mixture models is  
 necessary. Simple example

 i <- 3
 matrix[, , i] <- matrixA[, , i] %*% matrixB[, , i]


 See ?array

 Uwe Ligges


 I would appreciate any help
 Rob


--
Robin Hankin
Uncertainty Analyst
National Oceanography Centre, Southampton
European Way, Southampton SO14 3ZH, UK
  tel  023-8059-7743



Re: [R] how to vectorize a matrix

2006-05-22 Thread Christian Ritz
Hi Lorenzo,

maybe the following example is of use?


a - matrix(1:25,5,5)

stack(as.data.frame(a[, c(1,3,5,2,4)])) 


Note that 'stack' takes a data frame or list as first argument (not a
matrix). Therefore the matrix is first converted to a data frame using
'as.data.frame'.
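
As a side note (my own addition, not from the original reply): if only the stacked values are needed, and not the grouping factor that stack() adds, as.vector() on the reordered matrix gives the same numbers directly:

```r
a <- matrix(1:25, 5, 5)

## stack() returns a data frame: a 'values' column plus an 'ind' factor
## naming the original column of each value.
s <- stack(as.data.frame(a[, c(1, 3, 5, 2, 4)]))

## Concatenating the reordered columns directly gives just the values:
v <- as.vector(a[, c(1, 3, 5, 2, 4)])
identical(v, s$values)   # TRUE: same numbers in the same order
```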

Christian



Re: [R] normality testing with nortest

2006-05-22 Thread Peter Dalgaard
Uwe Ligges [EMAIL PROTECTED] writes:


  A = 0.1846, p-value = 0.9059
  
  ad.test(rnorm(100,mean=5,sd=3))
  
  ...
  A = 0.5138, p-value = 0.1887
  
  I mistakenly had thought the p-values would be more stable since I 
  am artificially creating a random normal distribution.  Is this expected 
  for a normality test or is this an issue with how rnorm is producing 
  random numbers?  I guess if I run it many times, I would find that I 
  would get many large values for the p-value?
 
 Well, as many large values as small values, 5% significant differences 
 for the 5% level
 
 The following looks alright:
 
 hist(replicate(1000,  ad.test(rnorm(100,mean=5,sd=3))$p.value))

We see this misunderstanding worryingly often. Worrying because it
reveals that a fundamental aspect of statistical inference has not
been grasped: that p-values are designed to be (approximately)
uniformly distributed and fall below any given level with the stated
probability, when the null hypothesis is true.

There is no mechanism to give you fewer significant or more stable
p-values, and a p-value close to one is no better an indication of a
true null hypothesis than one of 0.5 or 0.25.
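
This is easy to check empirically. A small sketch (using shapiro.test() from the base stats package instead of nortest's ad.test(), so no extra package is needed):

```r
set.seed(1)

## 2000 normality tests on data that really are normal: the null holds
## every time, so the p-values should look Uniform(0, 1).
p <- replicate(2000, shapiro.test(rnorm(100, mean = 5, sd = 3))$p.value)

mean(p < 0.05)  # close to 0.05: rejections at the stated level
hist(p)         # roughly flat histogram
```

Exactly as stated above: about 5% of the p-values fall below 0.05, and a value near 1 is no stronger evidence for the null than one near 0.3.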

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] determination of number of entries in list elements

2006-05-22 Thread botto
Hi,

Thank you all a lot for the quick replies.

regards

Benjamin



Re: [R] SUSE 10.1 64 bit binary

2006-05-22 Thread Peter Dalgaard
Daniel Brewer [EMAIL PROTECTED] writes:

 Hi,
 
 I was just wondering if anyone had created a 64 bit binary for SUSE 10.1
  x86_64 yet.  Unfortunately, I upgraded before looking whether a package
 was available and now cannot find a suitable package on CRAN.  If no one
 has produced one yet, does anyone have any idea when one will be available?
 
 Many Thanks
 
 Daniel Brewer

I would suspect that the one for 10.0 works for 10.1 as well. With
2.3.1 scheduled for June 1, there's little point in asking for an RPM
now. 

Otherwise, you can fairly easily build one from sources + Detlef's
.spec file. You may need a handful of build tools, though.

(Incidentally, Detlef: The Source: entry in the SPEC is wrong.
cvs.r-project.org is not accessible and hasn't been so for quite a
while. 

ftp://cran.r-project.org/pub/R/src/base/R-2/R-%version.tar.gz

works, and is the same used for RedHat/Fedora.)

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [R] Matrix in 3D

2006-05-22 Thread robert-mcfadden
Thank you very much you Robin and Uwe. 


From: Robin Hankin [EMAIL PROTECTED]
To: Uwe Ligges [EMAIL PROTECTED]
Date: 22 May 2006 11:04
Subject: Re: [R] Matrix in 3D

 Hi
 
 there are dozens of examples of this kind of thing
 in the R-and-octave.txt file,  in the contributed docs section
 of CRAN.  See the multidimensional arrays section, near the
 bottom, for reproduction of all of matlab/octave's array
 handling capabilities.
 
 best wishes
 
 
 Robin
 
 
 
 On 22 May 2006, at 09:56, Uwe Ligges wrote:
 
  [EMAIL PROTECTED] wrote:
 
  Dear R Users,
  Is it possible to add another (third) index to matrix (as in  
  MATLAB). For some analysis e.g. finite mixture models is  
  necessary. Simple example
 
  i <- 3
  matrix[, , i] <- matrixA[, , i] %*% matrixB[, , i]
 
 
  See ?array
 
  Uwe Ligges
 
 
  I would appreciate any help
  Rob
 
 
 --
 Robin Hankin
 Uncertainty Analyst
 National Oceanography Centre, Southampton
 European Way, Southampton SO14 3ZH, UK
   tel  023-8059-7743
 
 
 
 




Re: [R] Can lmer() fit a multilevel model embedded in a regression?

2006-05-22 Thread Doran, Harold
So, in the hierarchical notation, does the model look like this (for the linear 
predictor):

DV = constant + food_1(B_1) + food_2(B_2) + ... + food_82(B_82) + sex(B_83) + 
age(B_84)
food_1 = gamma_00 + gamma_01(folic) + r_01 
food_2 = gamma_10 + gamma_11(folic) + r_02
...
food_82 = gamma_20 + gamma_21(folic) + r_82

where r_qq ~ N(0, Psi) and Psi is an 82-dimensional covariance matrix.

I usually need to see this in model form as it helps me translate this into 
lmer syntax if it can be estimated. From what I see, this would be estimating 
82(82+1)/2 = 3403 parameters in the covariance matrix. 

What I'm stuck on is that below you say it would be hopeless to estimate the 82 
predictors using ML. But, if I understand the model correctly, the multilevel 
regression still resolves the predictors (fixed effects) using ML once 
estimates of the variances are obtained. So, I feel I might still be missing 
something.



-Original Message-
From:   Andrew Gelman [mailto:[EMAIL PROTECTED]
Sent:   Sun 5/21/2006 7:35 PM
To: Doran, Harold
Cc: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
Subject:Re: [R] Can lmer() fit a multilevel model embedded in a 
regression?

Harold,

I get confused by the terms fixed and random.  Our first-level model 
(in the simplified version we're discussing here) has 800 data points 
(the persons in the study) and 84 predictors:  sex, age, and 82 
coefficients for foods.  The second-level model has 82 data points (the 
foods) and two predictors:  a constant term and folic acid concentration.

It would be hopeless to estimate the 82 food coefficients via maximum 
likelihood, so the idea is to do a multilevel model, with a regression 
of these coefficients on the constant term and folic acid.  The 
group-level model has a residual variance.  If the group-level residual 
variance is 0, it's equivalent to ignoring food, and just using total 
folic acid as an individual predictor.  If the group-level residual 
variance is infinity, it's equivalent to estimating the original 
regression (with 84 predictors) using least squares.

The difficulty is that the foods aren't groups in the usual sense, 
since persons are not nested within foods; rather, each person eats many 
foods, and this is reflected in the X matrix.

Andrew

Doran, Harold wrote:

 OK, I'm piecing this together a bit, sorry I'm not familiar with the 
 article you cite. Let me try and fully understand the issue if you 
 don't mind. Are you estimating each of the 82 foods as fixed effects? 
 If so, in the example below this implies 84 total fixed effects (1 for 
 each food type in the X matrix and then sex and age).

 I'm assuming that food type is nested within one of the 82 folic acid 
 concentrations and then folic acid is treated as a random effect.

 Is this accurate?


 -Original Message-
 From:   Andrew Gelman [mailto:[EMAIL PROTECTED]
 Sent:   Sun 5/21/2006 9:17 AM
 To: Doran, Harold
 Cc: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
 Subject:Re: [R] Can lmer() fit a multilevel model embedded in 
 a regression?

 Harold,

 I'm confused now.  Just for concreteness, suppose we have 800 people, 82
 food items, and one predictor (folic, the folic acid concentration) at
 the food-item level.  Then DV will be a vector of length 800, foods is
 an 800 x 82 matrix, sex is a vector of length 800, age is a vector of
 length 800, and folic is a vector of length 82.  The vector of folic
 acid concentrations in individual diets is then just foods%*%folic,
 which I can call folic_indiv.
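For concreteness, the dimensions in this setup work out as follows (simulated placeholder data, not the Witte et al. study):

```r
n.people <- 800
n.foods <- 82
foods <- matrix(rpois(n.people * n.foods, 1), n.people, n.foods)  # 800 x 82
folic <- runif(n.foods)                                           # length 82
folic_indiv <- foods %*% folic  # 800 x 1: folic acid in each person's diet
dim(folic_indiv)  # 800 1
```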

 How would I fit the model in lmer(), then?  There's some bit of
 understanding that I'm still missing.

 Thanks.
 Andrew


 Doran, Harold wrote:

  Prof Gelman:
 
  I believe the answer is yes. It sounds as though persons are partially
  crossed within food items?
 
  Assuming a logit link, the syntax might follow along the lines of
 
   fm1 <- lmer(DV ~ foods + sex + age + (1 | food_item), data, family = 
   binomial(link = "logit"), method = "Laplace", control = list(usePQL = 
   FALSE))
 
  Maybe this gets you partly there.
 
  Harold
 
 
 
  -Original Message-
  From:   [EMAIL PROTECTED] on behalf of Andrew Gelman
  Sent:   Sat 5/20/2006 5:49 AM
  To: r-help@stat.math.ethz.ch
  Cc: [EMAIL PROTECTED]
  Subject:[R] Can lmer() fit a multilevel model embedded in a
  regression?
 
  I would like to fit a hierarchical regression model from Witte et al.
  (1994; see reference below).  It's a logistic regression of a health
  outcome on quantities of food intake; the linear predictor has the form,
  X*beta + W*gamma,
  where X is a matrix of consumption of 82 foods (i.e., the rows of X
  represent people in the study, the columns represent different foods,
  and X_ij is the amount of food j eaten by person i); and W is a matrix
  of some other predictors (sex, age, ...).
 
  The second stage of the model is a regression of X on some food-level
  predictors.
 
  Is it possible to fit this model in (the current 

Re: [R] how to vectorize a matrix

2006-05-22 Thread Martin Maechler
 Christian == Christian Ritz [EMAIL PROTECTED]
 on Mon, 22 May 2006 11:17:32 +0200 writes:

Christian Hi Lorenzo,
Christian maybe the following example is of use?


Christian a <- matrix(1:25, 5, 5)

Christian stack(as.data.frame(a[, c(1,3,5,2,4)])) 


Christian Note that 'stack' takes a data frame or list as first argument 
(not a
Christian matrix). Therefore the matrix is first converted to a data frame 
using
Christian 'as.data.frame'.

I do wonder though if not simply

  c(a[, c(1,3,5,2,4)])
 [1]  1  2  3  4  5 11 12 13 14 15 21 22 23 24 25  6  7  8  9 10 16 17 18 19 20

is enough for Lorenzo, in this case ??
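Both routes give the same reordered vector; c() simply drops the dim attribute of the subsetted matrix, while stack() goes via a data frame:

```r
a <- matrix(1:25, 5, 5)
v1 <- stack(as.data.frame(a[, c(1, 3, 5, 2, 4)]))$values  # via data frame
v2 <- c(a[, c(1, 3, 5, 2, 4)])                            # dim attribute dropped
identical(v1, v2)  # TRUE
```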

Martin

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] normality testing with nortest

2006-05-22 Thread Rolf Turner
[EMAIL PROTECTED] wrote:

   One thing that threw me off (and this is not really specific to 
 nortest, which seems to be correct, but just my understanding): the 
 p-value seems quite unstable.  For example:



It is worth noting that if the null hypothesis is true, then
the p-value is

uniformly distributed on [0,1].

This should be kept in mind when assessing the ``instability''
of p-values.
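A quick simulation makes the uniformity visible; base R's shapiro.test() stands in here for nortest's ad.test(), which behaves the same way in this respect:

```r
set.seed(1)  # for reproducibility
pvals <- replicate(1000, shapiro.test(rnorm(50))$p.value)
hist(pvals)         # roughly flat under the null hypothesis
mean(pvals < 0.05)  # close to 0.05, by design of the test
```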

cheers,

Rolf Turner
[EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] random generation of a factor

2006-05-22 Thread Rau, Roland
Hi,


 
 Does anybody know a function that generates a factor of 
 length N with K 
 levels given the proportions c(p1, p2,... ,pk)? It would not 

## does this code piece help you?

mydata <- c("yesterday", "today", "tomorrow")
myproportions <- c(0.3, 0.5, 0.2)

n <- 20
sample(x = mydata, size = n, replace = TRUE, prob = myproportions)
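If an actual factor is wanted, with all K levels kept even when some level happens not to be drawn, one might wrap the sample in factor(); rfactor is just an illustrative name:

```r
rfactor <- function(n, levels, prob) {
  factor(sample(levels, size = n, replace = TRUE, prob = prob),
         levels = levels)  # retain all K levels, drawn or not
}
f <- rfactor(20, c("yesterday", "today", "tomorrow"), c(0.3, 0.5, 0.2))
table(f)  # counts for all three levels, roughly in proportion
```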

cat("Best,\nRoland\n")

--
This mail has been sent through the MPI for Demographic Rese...{{dropped}}

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] SUSE 10.1 64 bit binary

2006-05-22 Thread Detlef Steuer
Hi, 

I just uploaded one to CRAN. You can use my repository

http://fawn.hsu-hh.de/~steuer/SL-10.1

already.

Detlef

On Mon, 22 May 2006 09:46:53 +0100
Daniel Brewer [EMAIL PROTECTED] wrote:

 Hi,
 
 I was just wondering if anyone had created a 64 bit binary for SUSE 10.1
  x86_64 yet.  Unfortunately, I upgraded before looking whether a package
 was available and now cannot find a suitable package on CRAN.  If noone
 has produced one yet, does anyone have any idea when one will be available?
 
 Many Thanks
 
 Daniel Brewer
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] SUSE 10.1 64 bit binary

2006-05-22 Thread Detlef Steuer
Thx, I'll fix it!

Detlef

On 22 May 2006 11:43:33 +0200
Peter Dalgaard [EMAIL PROTECTED] wrote:

 (Incidentally, Detlef: The Source: entry in the SPEC is wrong.
 cvs.r-project.org is not accessible and hasn't been so for quite a
 while.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

Re: [R] odd feature

2006-05-22 Thread Duncan Murdoch
On 5/22/2006 3:26 AM, Martin Maechler wrote:
 Gabor == Gabor Grothendieck [EMAIL PROTECTED]
 on Sun, 21 May 2006 09:47:07 -0400 writes:
 
 Gabor If you know that test is a scalar
 
 Gabor result <- if (test) a else b
 
 Gabor will do it.  
 
 Yes, indeed!
 IMO,  ifelse(test, a, b) is much overused where as  
   if(test) a else b  is much UNDER used.
 
From some e-mail postings, and even some documents (even printed
 books?), I get the impression that too many people think that
  ifelse(.,.,.)  is to be used as expression / function and 
   if(.) . else . only for program flow control.
 This leads to quite suboptimal code, and I personally use
 if(.) . else .  __as expression__ much more frequently than ifelse(.,.,.)

For overuse of ifelse(), do you mean cases where test is length 1, so 
if() would work?  Or are you thinking of something else?

I'd also be interested in what you mean by quite suboptimal code.  Are 
you thinking of things like

  if (test)
     temp <- a
  else
     temp <- b
  result <- f(temp)

versus

  result <- f( if (test) a else b )

?

I would generally use the former, because it's easier to get the 
formatting right, and I find it easier to read.  It's suboptimal in 
speed and memory use because of creating the temp variable, but in most 
cases I think that would be such a small difference that the small 
increase in readability is worthwhile.
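For contrast, the vectorized case where ifelse() really is the right tool, alongside if() used as an expression:

```r
test <- c(TRUE, FALSE, TRUE)
ifelse(test, 1:3, 11:13)  # elementwise selection: 1 12 3
x <- if (length(test) > 1) "vector" else "scalar"  # if() as an expression
x  # "vector"
```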

Duncan Murdoch

  
 Martin Maechler, ETH Zurich.
 
 Gabor Here is another approach:
 
 Gabor as.vector(test * ts(a) + (!test) * ts(b))
 
 
 
 Gabor On 5/21/06, ivo welch [EMAIL PROTECTED] wrote:
  Dear R wizards:
  
  I just got stung by the ifelse() feature.
  
    a <- 10:15
    b <- 20:300
    test <- 1
    ifelse(test, a, b)
  [1] 10
  
  I had not realized that this was the default behavior---I had expected
  10:15.  mea culpa.  however, I wonder whether it would make sense to
  replace ifelse with a different semantic, where if test is a single
  scalar, it means what a stupid user like me would imagine.
  
  Aside, I like the flexibility of R, but I am not thrilled by all the
  recycling rules.  I either mean I want a scalar or a vector of
  equal/appropriate dimension.  I never want a recycle of a smaller
  vector.  (I do often use a recycle of a scalar.)
  
  regards,
  
  /iaw
  
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
  
 
 Gabor __
 Gabor R-help@stat.math.ethz.ch mailing list
 Gabor https://stat.ethz.ch/mailman/listinfo/r-help
 Gabor PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] [Rd] Citation for R (was How to reference R in papers?)

2006-05-22 Thread White, Charles E WRAIR-Wash DC
The following link will take you to a long running discussion on your
topic:
http://tolstoy.newcastle.edu.au/R/help/06/02/21659.html

Chuck

Charles E. White, MS, Senior Biostatistician
Walter Reed Army Institute of Research
503 Robert Grant Ave., Room 1w102
Silver Spring, MD 20910-1557
301 319-9781

Personal/Professional Site:
http://users.starpower.net/cwhite571/professional/ 

-Original Message-
Date: Mon, 22 May 2006 16:17:28 +0800
From: j.joshua thomas [EMAIL PROTECTED]
Subject: [R] How to reference R in papers?

Dear List,

How do i reference R 2.3.0 in research papers?

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How can you buy R?

2006-05-22 Thread Deepayan Sarkar
On 5/22/06, Berwin A Turlach [EMAIL PROTECTED] wrote:

[...]

 Thus, the last sentence of mine that you quoted:

  My understanding was that in that moment a product was
  created that would have to be wholly under the GPL, so the
   user was violating the GPL and lost the right to use your
  package.

 Should perhaps better be formulated as:

  My understanding was that in that moment a product was
  created that would have to be wholly under the GPL, so the
  person who did the linking was violating the GPL and it is
  not clear whether anyone is allowed to use the linked product.

I think you are still missing the point. The act of creating a
derivative work is NOT governed by the GPL, so it cannot possibly by
itself violate the GPL. The question of violation only applies when
the creator of this derivative work wishes to _distribute_ it. This is
like me writing a book that no one else ever reads; it doesn't matter
if I have plagiarized huge parts of it. This point is not as academic
as you might think. It is well known that Google uses a customized
version of Linux for their servers; however, they do not distribute
this customized version, and hence are under no obligation to provide
the changes (and they do not, in fact). This is NOT a violation of the
GPL.

 DS I was referring to your question (quoted above) about use of
 DS GPL'd code in S-PLUS, which is what I was replying to. As I
 DS was saying, that situation is the opposite of the one in your
 DS example.
 O.k., sorry, I used a different scale with the time point of origin at
 Spencer's e-mail and my answer to that mail.  Now I am with you.

 Agreed, the situation is the opposite, but that was the example
 discussed in gnu.misc.discuss.  From an abstract point of view the
 situations are the same.  You make someone else link a GPL product
 with a non-GPL product creating a derived work, the derived work would
 have to be under the GPL but is not.  Hence, the derived work has a
 legal status that is in limbo and it is not clear whether anyone has
  the right to use it.

  The discussions on gnu.misc.discuss were centred on cases where people
 provided non-GPL binaries, asked their users to download GPL software
 from elsewhere, compile and link everything together and then use the
 combined product.

 As you say it is the exact opposite (and hence mirror image) from the
 situation that I was worried about, where I provide GPL software and
 ask others to compile and link it with non-GPL binaries and then use
 the combined product.

 If one scenario is not on, I don't see how the other one could be
 acceptable either.  Except that in the first scenario there is a clear
  intent of circumventing the GPL.  But I was not sure whether such kind
 of intent makes any difference.  Thus, to avoid all these problems I
 decided to rather use the LGPL since that licence definitely seemed to
 allow both.

That's your choice, but the situations are not symmetric, and quite
deliberately so. The FSF's plan was not to produce a completely
independent and fully functional 'GNU system' at once (which would be
unrealistic), but rather produce replacements of UNIX tools one by
one. It was entirely necessary to allow these new versions to operate
within the older, proprietary system. In fact, GCC was not the first
piece of software released under the GPL, and until then the only way
to use GPL software was to compile it using a non-free compiler.
This is enabled by means of exceptions to the GPL, as described in

http://www.gnu.org/licenses/gpl-faq.html#GPLIncompatibleLibs

(which I have already referred to once before).

Deepayan

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] How can you buy R?

2006-05-22 Thread Duncan Murdoch
On 5/22/2006 3:55 AM, Berwin A Turlach wrote:
 G'day Deepayan,
 
 DS == Deepayan Sarkar [EMAIL PROTECTED] writes:
 
 DS let me first summarize this sub-discussion so far: [...]
  Sounds like a perfect summary. :)
 
 DS As far as I can tell (and please correct me if I'm wrong),
 DS your contention is that by linking a GPL component P with a
 DS non-GPL component Q, a user may lose the rights granted to him
 DS by the GPL to the GPL-d part P.
 I don't think that I said this explicitly, but I can see how what I
 said can be interpreted in such a way.  The point is rather that at
 the moment component P and Q are linked (and I perhaps carelessly
 assumed that the user was doing this) a product is produced that
 should be completely under the GPL.  Obviously it is not.  Hence, the
 status of this linked product, and whether it can be used by anybody,
 is an open question.  And the answer is probably given by the
  copyright laws (and others?) of the country in which the linking happens.  
 
 
 DS Let's assume this is true. All that means is that the user has
 DS lost his rights to copy, modify and redistribute P. He does
 DS NOT lose the rights to use P.
 I agree with you on this.  Probably I was to terse in my writing and
 produced misunderstandings.  I never intended to say something about
 the rights that the user has with regards to P alone.  My comments
 were directed towards the linked product P+Q.  In particular, it is
 not clear to me whether one can execute such a product without
 violating copyright laws. 

The GPL is quite explicit on this:  as Deepayan said, it confers rights 
to copy, modify and redistribute P.  Activities other than copying, 
distribution and modification are not covered by this License; they are 
outside its scope.

This probably varies from country to country, but I think the assumption 
is that if you have a legally acquired copy of a program, you have a 
right to execute it as you like.  (The American DMCA and laws in other 
countries that implement the WIPO anti-circumvention rules limit you in 
specific ways, but they probably don't apply to the situation we're 
talking about.)

Now, I suppose you might argue that executing P+Q makes a copy of it in 
memory, but I think countries that have modernized their copyright laws 
recognize that this is something you have a right to do with a legally 
acquired copy.  You don't need the GPL to give you permission to do 
this.  That's certainly true in the US and Canada.  Your country may vary.

Duncan Murdoch

 
 Thus, the last sentence of mine that you quoted:
 
  My understanding was that in that moment a product was
  created that would have to be wholly under the GPL, so the
   user was violating the GPL and lost the right to use your
  package.
 
 Should perhaps better be formulated as:
 
  My understanding was that in that moment a product was
  created that would have to be wholly under the GPL, so the
  person who did the linking was violating the GPL and it is
  not clear whether anyone is allowed to use the linked product.
 
  A simple google search would have confirmed to you that the
  linux kernel is developed under the GPL.  [...]
 DS Linux is under GPL2, and not GPL2 or later. [...]
 Oh, I wasn't aware that they did not use the typical(?) or later
 phrase.  Thanks for pointing this out and I note that we both agree
 that the linux kernel is definitely not under LGPL.
 
 DS In any case, this is the complete opposite of the situation we
 DS were originally discussing: [...]
  [...]  So I have to wonder to what you are referring to as the
  situation we were originally discussing.
 
 DS I was referring to your question (quoted above) about use of
 DS GPL'd code in S-PLUS, which is what I was replying to. As I
 DS was saying, that situation is the opposite of the one in your
 DS example.
 O.k., sorry, I used a different scale with the time point of origin at
 Spencer's e-mail and my answer to that mail.  Now I am with you.
 
 Agreed, the situation is the opposite, but that was the example
 discussed in gnu.misc.discuss.  From an abstract point of view the
 situations are the same.  You make someone else link a GPL product
 with a non-GPL product creating a derived work, the derived work would
 have to be under the GPL but is not.  Hence, the derived work has a
 legal status that is in limbo and it is not clear whether anyone has
  the right to use it.  
 
  The discussions on gnu.misc.discuss were centred on cases where people
 provided non-GPL binaries, asked their users to download GPL software
 from elsewhere, compile and link everything together and then use the
 combined product.  
 
 As you say it is the exact opposite (and hence mirror image) from the
 situation that I was worried about, where I provide GPL software and
 ask others to compile and link it with non-GPL binaries and then 

[R] what are supressors doing in hierarchical partitioning??

2006-05-22 Thread jones christian
Hello,
After fitting a model with glm (glm(A ~ a + b + c + d, binomial)) I want to have a 
closer look at the goodness of fit for each of the 4 independent variables by 
using hierarchical partitioning. The results (see below) indicate that three 
variables (MEDIANH_HE + CLOJUL + K95) act as a suppressor because their dependent 
contribution is negative. For two variables the total contribution is near 0 
(CLOJUL + N5_97). Surprisingly, all variables are highly significant in the glm 
model. How can that be? And do the results from hierarchical partitioning mean 
that I had better not use this model for prediction??? (PS: I went to a lot of 
effort fitting this model.)
many thanks in advance
Christian




> hier.part(NACHTIGALL.x, A2HP_N5, family = "binomial", gof = "logLik", barplot = TRUE)
$gfs
 [1] -131.61028 -106.38458 -131.60991 -124.60603 -131.51891 -101.70663
 [7] -102.38016 -105.11808 -124.55582 -131.51761 -124.31356  -98.96125
[13]  -98.59341 -100.65798 -124.29683  -95.51063

$IJ
                   I         J        Total
MEDIANH_HE 26.895004 -1.669303 2.522570e+01
CLOJUL  2.511079 -2.510707 3.721918e-04
K95 5.245935  1.758320 7.004255e+00
N5_97   1.447637 -1.356262 9.137552e-02

$I.perc
   I
MEDIANH_HE 74.502107
CLOJUL  6.955965
K9514.531814
N5_97   4.010114



--

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] normality testing with nortest

2006-05-22 Thread Duncan Murdoch
On 5/22/2006 5:25 AM, Peter Dalgaard wrote:
 Uwe Ligges [EMAIL PROTECTED] writes:
 
 
 A = 0.1846, p-value = 0.9059

 ad.test(rnorm(100,mean=5,sd=3))
 ...
 A = 0.5138, p-value = 0.1887

 I mistakenly had thought the p-values would be more stable since I 
 am artificially creating a random normal distribution.  Is this expected 
 for a normality test or is this an issue with how rnorm is producing 
 random numbers?  I guess if I run it many times, I would find that I 
 would get many large values for the p-value?
 Well, as many large values as small values, 5% significant differences 
 for the 5% level

 The following looks alright:

 hist(replicate(1000,  ad.test(rnorm(100,mean=5,sd=3))$p.value))
 
 We see this misunderstanding worryingly often. Worrying because it
 reveals that a fundamental aspect of statistical inference has not
 been grasped: that p-values are designed to be (approximately)
 uniformly distributed and fall below any given level with the stated
 probability, when the null hypothesis is true.

I think it's the fallacious belief that the p-value measures the 
probability that the null hypothesis is true.  This is currently 
misunderstanding #1 in the Wikipedia entry for P-values.

(Google had me worried:  I searched for probability that the null 
hypothesis is true and found

 P-value - Wikipedia, the free encyclopedia
 The p-value is the probability that the null hypothesis is true, justifying 
 the rule of considering as significant p-values closer to 0 (zero). ...
 en.wikipedia.org/wiki/P-value - 17k - Cached - Similar pages

This quote is preceded by:  All of the following [...] statements are 
false: Context is important! :-)

The vast majority of hits to that search also pointed out that this 
interpretation was incorrect.  A couple of counterexamples were a 
research methods page at a department of psychology, and
another at a medical school.  I'll send a copy of this note to people there.

Duncan Murdoch

 
 There is no mechanism to give you fewer significant or more stable
 p-values, and a p-value close to one is no better an indication of a
 true null hypothesis than one of 0.5 or 0.25.


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] odd feature

2006-05-22 Thread Frank E Harrell Jr
Duncan Murdoch wrote:
 On 5/22/2006 3:26 AM, Martin Maechler wrote:
 
Gabor == Gabor Grothendieck [EMAIL PROTECTED]
on Sun, 21 May 2006 09:47:07 -0400 writes:

Gabor If you know that test is a scalar

Gabor result <- if (test) a else b

Gabor will do it.  

Yes, indeed!
IMO,  ifelse(test, a, b) is much overused where as  
  if(test) a else b  is much UNDER used.

From some e-mail postings, and even some documents (even printed
books?), I get the impression that too many people think that
 ifelse(.,.,.)  is to be used as expression / function and 
  if(.) . else . only for program flow control.
This leads to quite suboptimal code, and I personally use
if(.) . else .  __as expression__ much more frequently than ifelse(.,.,.)
 
 
 For overuse of ifelse(), do you mean cases where test is length 1, so 
 if() would work?  Or are you thinking of something else?
 
 I'd also be interested in what you mean by quite suboptimal code.  Are 
 you thinking of things like
 
   if (test)
  temp <- a
   else
  temp <- b
   result <- f(temp)
 
 versus
 
   result <- f( if (test) a else b )
 
 ?
 
 I would generally use the former, because it's easier to get the 
 formatting right, and I find it easier to read.  It's suboptimal in 
 speed and memory use because of creating the temp variable, but in most 
 cases I think that would be such a small difference that the small 
 increase in readability is worthwhile.

IMHO that approach is too verbose and not more readable.

Frank

 
 Duncan Murdoch
 
 
 
Martin Maechler, ETH Zurich.

Gabor Here is another approach:

Gabor as.vector(test * ts(a) + (!test) * ts(b))



Gabor On 5/21/06, ivo welch [EMAIL PROTECTED] wrote:
 Dear R wizards:
 
 I just got stung by the ifelse() feature.
 
   a <- 10:15
   b <- 20:300
   test <- 1
   ifelse(test, a, b)
 [1] 10
 
 I had not realized that this was the default behavior---I had expected
 10:15.  mea culpa.  however, I wonder whether it would make sense to
 replace ifelse with a different semantic, where if test is a single
 scalar, it means what a stupid user like me would imagine.
 
 Aside, I like the flexibility of R, but I am not thrilled by all the
 recycling rules.  I either mean I want a scalar or a vector of
 equal/appropriate dimension.  I never want a recycle of a smaller
 vector.  (I do often use a recycle of a scalar.)
 
 regards,
 
 /iaw
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html
 



-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] odd feature

2006-05-22 Thread Gabor Grothendieck
If you don't like f(if (test) a else b)
then what about

temp <- if (test) a else b
f(temp)

or

temp <- if (test)
  a
else
  b
f(temp)

I think it's easier to understand if you factor the temp <- out, since
one then immediately knows that the purpose of the statement is
to set temp.


On 5/22/06, Duncan Murdoch [EMAIL PROTECTED] wrote:
 On 5/22/2006 3:26 AM, Martin Maechler wrote:
  Gabor == Gabor Grothendieck [EMAIL PROTECTED]
  on Sun, 21 May 2006 09:47:07 -0400 writes:
 
  Gabor If you know that test is a scalar
 
  Gabor result <- if (test) a else b
 
  Gabor will do it.
 
  Yes, indeed!
  IMO,  ifelse(test, a, b) is much overused where as
if(test) a else b  is much UNDER used.
 
 From some e-mail postings, and even some documents (even printed
  books?), I get the impression that too many people think that
   ifelse(.,.,.)  is to be used as expression / function and
if(.) . else . only for program flow control.
  This leads to quite suboptimal code, and I personally use
  if(.) . else .  __as expression__ much more frequently than ifelse(.,.,.)

 For overuse of ifelse(), do you mean cases where test is length 1, so
 if() would work?  Or are you thinking of something else?

 I'd also be interested in what you mean by quite suboptimal code.  Are
 you thinking of things like

  if (test)
     temp <- a
  else
     temp <- b
  result <- f(temp)

 versus

  result <- f( if (test) a else b )

 ?

 I would generally use the former, because it's easier to get the
 formatting right, and I find it easier to read.  It's suboptimal in
 speed and memory use because of creating the temp variable, but in most
 cases I think that would be such a small difference that the small
 increase in readability is worthwhile.

 Duncan Murdoch

 
  Martin Maechler, ETH Zurich.
 
  Gabor Here is another approach:
 
  Gabor as.vector(test * ts(a) + (!test) * ts(b))
 
 
 
  Gabor On 5/21/06, ivo welch [EMAIL PROTECTED] wrote:
   Dear R wizards:
  
   I just got stung by the ifelse() feature.
  
    a <- 10:15
    b <- 20:300
    test <- 1
    ifelse(test, a, b)
   [1] 10
  
   I had not realized that this was the default behavior---I had 
  expected
   10:15.  mea culpa.  however, I wonder whether it would make sense to
   replace ifelse with a different semantic, where if test is a single
   scalar, it means what a stupid user like me would imagine.
  
   Aside, I like the flexibility of R, but I am not thrilled by all the
   recycling rules.  I either mean I want a scalar or a vector of
   equal/appropriate dimension.  I never want a recycle of a smaller
   vector.  (I do often use a recycle of a scalar.)
  
   regards,
  
   /iaw
  
   __
   R-help@stat.math.ethz.ch mailing list
   https://stat.ethz.ch/mailman/listinfo/r-help
   PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
  
 
  Gabor __
  Gabor R-help@stat.math.ethz.ch mailing list
  Gabor https://stat.ethz.ch/mailman/listinfo/r-help
  Gabor PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html
 
  __
  R-help@stat.math.ethz.ch mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide! 
  http://www.R-project.org/posting-guide.html

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] odd feature

2006-05-22 Thread Duncan Murdoch
On 5/22/2006 9:38 AM, Frank E Harrell Jr wrote:
 Duncan Murdoch wrote:
 On 5/22/2006 3:26 AM, Martin Maechler wrote:

 Gabor == Gabor Grothendieck [EMAIL PROTECTED]
on Sun, 21 May 2006 09:47:07 -0400 writes:
Gabor If you know that test is a scalar

Gabor result <- if (test) a else b

Gabor will do it.  

 Yes, indeed!
 IMO,  ifelse(test, a, b) is much overused where as  
  if(test) a else b  is much UNDER used.

 From some e-mail postings, and even some documents (even printed
 books?), I get the impression that too many people think that
 ifelse(.,.,.)  is to be used as expression / function and 
  if(.) . else . only for program flow control.
 This leads to quite suboptimal code, and I personally use
 if(.) . else .  __as expression__ much more frequently than ifelse(.,.,.)

 For overuse of ifelse(), do you mean cases where test is length 1, so 
 if() would work?  Or are you thinking of something else?

 I'd also be interested in what you mean by quite suboptimal code.  Are 
 you thinking of things like

   if (test)
  temp <- a
   else
  temp <- b
   result <- f(temp)

 versus

   result <- f( if (test) a else b )

 ?

 I would generally use the former, because it's easier to get the 
 formatting right, and I find it easier to read.  It's suboptimal in 
 speed and memory use because of creating the temp variable, but in most 
 cases I think that would be such a small difference that the small 
 increase in readability is worthwhile.
 
  IMHO that approach is too verbose and not more readable.

IMO terse unreadable :-)

Duncan Murdoch

 
 Frank
 
 Duncan Murdoch


 Martin Maechler, ETH Zurich.



Re: [R] odd feature

2006-05-22 Thread Martin Maechler
 Gabor == Gabor Grothendieck [EMAIL PROTECTED]
 on Mon, 22 May 2006 09:31:14 -0400 writes:

Gabor If you don't like f(if (test) a else b)
Gabor then what about

Gabor temp <- if (test) a else b
Gabor f(temp)

Gabor or

Gabor temp <- if (test)
Gabor a
Gabor else
Gabor b
Gabor f(temp)

Gabor I think it's easier to understand if you factor the temp <- out since
Gabor one immediately then knows the purpose of the statement is
Gabor to set temp.

I strongly agree with Gabor on the above.

But, to Duncan's question: 
Yes, indeed, my main point was that people use  
ifelse(test, a, b)  also in cases where test is known to be of
length one.

BTW, the 2nd point about why I don't ``like'' ifelse() so much
is on its help page:

Both 'a' and 'b' are fully evaluated even though only one of the
two values of a[i], b[i] are used in the result.

Martin
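
For the non-trivial vector case, the help-page statement can be demonstrated with side effects (an illustrative sketch, not part of the original post):

```r
x <- 0
y <- 0
test <- c(TRUE, FALSE)

# both branch expressions run, even though each element of the
# result comes from only one of them:
res <- ifelse(test, {x <- 1; 1:2}, {y <- 1; 3:4})
res       # 1 4
c(x, y)   # 1 1 -- both side effects happened
```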



Re: [R] odd feature

2006-05-22 Thread Gabor Grothendieck
Due to lazy evaluation, I don't think a and b are fully evaluated:

> ifelse(1, a <- 1, b <- 2)
[1] 1
> a
[1] 1
> b
Error: object "b" not found


On 5/22/06, Martin Maechler [EMAIL PROTECTED] wrote:
  Gabor == Gabor Grothendieck [EMAIL PROTECTED]
  on Mon, 22 May 2006 09:31:14 -0400 writes:

Gabor If you don't like f(if (test) a else b)
Gabor then what about

Gabor temp <- if (test) a else b
Gabor f(temp)

Gabor or

Gabor temp <- if (test)
Gabor a
Gabor else
Gabor b
Gabor f(temp)

Gabor I think it's easier to understand if you factor the temp <- out since
Gabor one immediately then knows the purpose of the statement is
Gabor to set temp.

 I strongly agree with Gabor on the above.

 But, to Duncan's question:
 Yes, indeed, my main point was that people use
 ifelse(test, a, b)  also in cases where test is known to be of
 length one.

 BTW, the 2nd point about why I don't ``like'' ifelse() so much
 is on its help page:

 Both 'a' and 'b' are fully evaluated even though only one of the
 two values of a[i], b[i] are used in the result.

 Martin





Re: [R] odd feature

2006-05-22 Thread Thomas Lumley
On Mon, 22 May 2006, Gabor Grothendieck wrote:

 Due to lazy evaluation, I don't think a and b are fully evaluated:

 ifelse(1, a <- 1, b <- 2)
 [1] 1
 a
 [1] 1
 b
 Error: object "b" not found

yes. If you look at the code for ifelse() it evaluates the second argument 
if any test values are TRUE and the third argument if any test values are 
FALSE, so in the scalar case it does not evaluate both arguments.

-thomas


Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle
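
The branch logic described above can be sketched roughly like this (a simplified illustration only, NOT the actual ifelse() source, which also handles NAs, attributes and coercion):

```r
my_ifelse <- function(test, yes, no) {
  ans <- test
  # 'yes' is only forced (lazily evaluated) if some test value is TRUE
  if (any(test, na.rm = TRUE))
    ans[test] <- rep(yes, length.out = length(test))[test]
  # likewise 'no' is only forced if some test value is FALSE
  if (any(!test, na.rm = TRUE))
    ans[!test] <- rep(no, length.out = length(test))[!test]
  ans
}

my_ifelse(c(TRUE, FALSE), 1:2, 3:4)        # 1 4
my_ifelse(TRUE, 1, stop("never reached"))  # 1 -- 'no' is not evaluated
```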



[R] win2k memory problem with merge()'ing repeatedly (long email)

2006-05-22 Thread Sean O'Riordain
Good afternoon,

I have 63 small .csv files which I process daily; until two weeks ago
they processed just fine, took only a matter of moments, and showed no
noticeable memory problem.  Two weeks ago they reached 318 lines and my
script broke.  There are some missing values in some of the files.  I
have tried hard many times over the last two weeks to create a small
repeatable example to give you, but I've failed - unless I use my data
it works fine... :-(

Am I missing something obvious? (again)

A typical file has lines which look like:
01/06/2005,1372

Though there are three files which have two values (files 3,32,33) and
these have lines which look like...
01/06/2005,1766,
or
15/05/2006,289,114

a1 <- read.csv("file1.csv", header=F)
etc...
a63 <- read.csv("file63.csv", header=F)
names(a1) <- c("mdate", "file1.column.description")

atot <- merge(a1, a2, all=T)

followed by repeatedly doing...
atot <- merge(atot, a3, all=T)
atot <- merge(atot, a4, all=T)
etc...

I normally start R with --vanilla.

What appears to happen is that atot doubles in size each iteration and
just falls over due to lack of memory at about i=17... even though the
total memory required for all of the individual a1...a63 is only
1001384 bytes (from object.size() on a1..a63).  At this point I've been
trying to pin down this problem for two weeks and I just gave up...

The following works fine as I'd expect with minimal memory usage...

for (i in 3:67) {
  datelist <- as.Date(start.date) + 0:(count-1)
  # remove a couple of elements...
  datelist <- datelist[-(floor(runif(nacount)*count))]
  a2 <- as.data.frame(datelist)
  names(a2) <- "mdate"
  vname <- paste("value", i, sep="")
  a2[vname] <- runif(length(datelist))
  # a2[floor(runif(nacount)*count), vname] <- NA

  # atot <- merge(atot, a2, all=T)
  i <- 2
  a.eval.text <- paste("merge(atot, a", i, ", all=T)", sep="")
  cat("a.eval.text is: -", a.eval.text, "-\n", sep="")
  atot <- eval(parse(text=a.eval.text))

  cat("i:", i, " ", gc(), "\n")
}

this works fine... but on my files (as per attached 'lastsave.txt'
file) it just gobbles memory.
Am I doing something wrong?  I (wrongly?) expected that repeatedly
merge(atot,aN) would only increase the memory requirement linearly
(with jumps perhaps as we go through a 2^n boundary)... which is what
happens when merging simulated data.frames as above... no problem at
all and its really fast...

The attached text file shows a (slightly edited) session where the
memory required by the merge() operation just doubles with each use...
and I can only allow it to run until i=17!!!

I've even run it with gctorture() set on... with similar, but
excruciatingly slow results...

Is there any relevant info that I'm missing?  Unfortunately I am not
able to post the contents of the files to a public list like this...

As per a previous thread, I know that I can use a list to handle these
dataframes - but I had difficulty with the syntax of a list of
dataframes...
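
One way around the repeated merging is the list-of-data-frames route mentioned above; a sketch (the file names and columns here are made up for illustration, and `Reduce()` needs a more recent R than the 2.3.0 used in this thread):

```r
# build three hypothetical csv files in a scratch directory
setwd(tempdir())
for (i in 1:3) {
  d <- data.frame(mdate = c("01/06/2005", "02/06/2005"),
                  value = i * (1:2))
  names(d)[2] <- paste("value", i, sep = "")
  write.csv(d, paste("file", i, ".csv", sep = ""), row.names = FALSE)
}

# read them into a list, then merge the whole list in one pass
alist <- lapply(paste("file", 1:3, ".csv", sep = ""), read.csv)
atot  <- Reduce(function(x, y) merge(x, y, all = TRUE), alist)
dim(atot)  # 2 rows, 4 columns: mdate plus one value column per file
```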

I'd like to know why the memory requirements for this merge just explode...

cheers, (and thanks in advance!)
Sean O'Riordain

==
> version
   _
platform   i386-pc-mingw32
arch   i386
os mingw32
system i386, mingw32
status Patched
major  2
minor  3.0
year   2006
month  05
day09
svn rev38014
language   R
version.string Version 2.3.0 Patched (2006-05-09 r38014)

Running on Win2k with 1Gb ram.

I also tried it (with the same results) on 2.2.1 and 2.3.0.



R : Copyright 2006, The R Foundation for Statistical Computing
Version 2.3.0 Patched (2006-05-09 r38014)
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> gc()
 used (Mb) gc trigger (Mb) max used (Mb)
Ncells 178186  4.8 407500 10.9   35  9.4
Vcells  73112  0.6 786432  6.0   333585  2.6
> # take the information in the .csv files created from the emails
> setwd("C:/Documents and Settings/c_oriordain_s/My Documents/pasip/mms/mms_emails")
> # the input file from Amdocs (as supplied by revenue assurance)
> amdocs_csv_filename <- "amdocs_volumes_revised4.csv"
> # where shall we put the output plot file
> copypath <- "ient1dfs001\\general\\Process Improvement Projects\\Process Improvement Projects Repository\\Active Projects\\MMS\\01 Measure\\"
> # set to F (false) instead of T (true) if you're just tricking around and you don't
> # want to be copying over

Re: [R] odd feature

2006-05-22 Thread Thomas Lumley
On Sun, 21 May 2006, ivo welch wrote:

 Aside, I like the flexibility of R, but I am not thrilled by all the
 recycling rules.  I either mean I want a scalar or a vector of
 equal/appropriate dimension.  I never want a recycle of a smaller
 vector.  (I do often use a recycle of a scalar.)


One case where the vector-vector recycling rules are used is in 
vector-matrix operations:
a <- 1:4
b <- diag(4)
a + b

-thomas
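
Spelling that out (the vector is recycled down the columns, since matrices are stored column-major, so `(a + b)[i, j]` is `a[i] + b[i, j]`):

```r
a <- 1:4
b <- diag(4)
s <- a + b

# a[i] is added to every element of row i:
s[1, 1]   # 1 + 1 = 2
s[2, 1]   # 2 + 0 = 2
s[2, 2]   # 2 + 1 = 3
```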



Re: [R] odd feature

2006-05-22 Thread Martin Maechler
 TL == Thomas Lumley [EMAIL PROTECTED]
 on Mon, 22 May 2006 07:09:09 -0700 (PDT) writes:

TL On Mon, 22 May 2006, Gabor Grothendieck wrote:
 Due to lazy evaluation, I don't think a and b are fully evaluated:
 
 ifelse(1, a <- 1, b <- 2)
 [1] 1
 a
 [1] 1
 b
 Error: object "b" not found

TL yes. If you look at the code for ifelse() it evaluates
TL the second argument if any test values are TRUE and the
TL third argument if any test values are FALSE, so in the
TL scalar case it does not evaluate both arguments.

yes.  
And (Gabor), if you recall what I said earlier,  it's pretty clear
that I would not even *consider* using ifelse() in the scalar
case.  So my original statement was about the case ifelse() is
designed for: The non-trivial vector case, and there my
statement of "fully evaluated" does apply.

Martin



[R] JSS Code Snippets

2006-05-22 Thread Jan de Leeuw
JSS is trying to develop its Code Snippets section. We have some
snippets lined up and one published in the latest volume:

http://www.jstatsoft.org/index.php?vol=16

If you have small chunks of code of obvious relevance to statistical
computing (they need not be in R), consider submitting them -- why keep
them to yourself?  It's just a matter of pasting a minimal amount of TeX
into our templates and diving into our breathtakingly efficient review
process.

We also welcome regular submissions and suggestions for special issues,
book reviews, and software reviews.  Of course, if you suggest something
you may very well be asked to actually do something about that
suggestion.

===
Jan de Leeuw; Distinguished Professor and Chair, UCLA Department of  
Statistics;
Editor: Journal of Multivariate Analysis, Journal of Statistical  
Software
US mail: 8125 Math Sciences Bldg, Box 951554, Los Angeles, CA 90095-1554
phone (310)-825-9550;  fax (310)-206-5658;  email: [EMAIL PROTECTED]
.mac: jdeleeuw ++  aim: deleeuwjan ++ skype: j_deleeuw
homepages: http://gifi.stat.ucla.edu ++ http://www.cuddyvalley.org
   
 
-
   No matter where you go, there you are. --- Buckaroo Banzai
http://gifi.stat.ucla.edu/sounds/nomatter.au



Re: [R] Can lmer() fit a multilevel model embedded in a regression?

2006-05-22 Thread Andrew Gelman
Harold,

I think we use slightly different notation (I like to use variance 
parameters rather than covariance matrices).  Let me try to write it in 
model form:

Data points y_i, i=1,...,800

800 x 84 matrix of predictors, X:  for columns j=1,...,82, X_{i,j} is 
the amount of food j consumed by person i.  X_{i,83} is an indicator (1 
if male, 0 if female), and X_{i,84} is the age of person i.

Data-level model:  Pr (y_i=1) = inverse.logit (X_i*beta), for 
i=1,...,800, with independent outcomes.

beta is a (column) vector of length 84.

Group-level model:  for j=1,...,82:  beta_j ~ Normal (gamma_0 + gamma_1 
* u_j, sigma^2_{beta}).

u is a vector of length 82, where u_j = folate concentration in food j

gamma_0 and gamma_1 are scalar coefficients (for the group-level model), 
and sigma_{beta} is the sd of the group-level errors.

It would be hopeless to estimate all the betas using maximum 
likelihood:  that's 800 data points and 84 predictors, the results will 
just be too noisy.  But it should be ok using the 2-level model above.  
The question is:  can I fit in lmer()?

Thanks again.
Andrew

Doran, Harold wrote:

 So, in the hierarchical notation, does the model look like this (for 
 the linear predictor):

 DV = constant + food_1(B_1) + food_2(B_2) + ... + food_82(B_82) + 
 sex(B_83) + age(B_84)
 food_1 = gamma_00 + gamma_01(folic) + r_01
 food_2 = gamma_10 + gamma_11(folic) + r_02
 ...
 food_82 = gamma_20 + gamma_21(folic) + r_82

 where r_qq ~ N(0, Psi) and Psi is an 82-dimensional covariance matrix.

 I usually need to see this in model form as it helps me translate this 
 into lmer syntax if it can be estimated. From what I see, this would 
 be estimating 82(82+1)/2 = 3403 parameters in the covariance matrix.

 What I'm stuck on is below you say it would be hopeless to estimate 
 the 82 predictors using ML. But, if I understand the model correctly, 
 the multilevel regression still resolves the predictors (fixed 
 effects) using ML once estimates of the variances are obtained. So, I 
 feel I might still be missing something.



 -Original Message-
 From:   Andrew Gelman [mailto:[EMAIL PROTECTED]
 Sent:   Sun 5/21/2006 7:35 PM
 To: Doran, Harold
 Cc: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
 Subject:Re: [R] Can lmer() fit a multilevel model embedded in 
 a regression?

 Harold,

 I get confused by the terms fixed and random.  Our first-level model
 (in the simplified version we're discussing here) has 800 data points
 (the persons in the study) and 84 predictors:  sex, age, and 82
 coefficients for foods.  The second-level model has 82 data points (the
 foods) and two predictors:  a constant term and folic acid concentration.

 It would be hopeless to estimate the 82 food coefficients via maximum
 likelihood, so the idea is to do a multilevel model, with a regression
 of these coefficients on the constant term and folic acid.  The
 group-level model has a residual variance.  If the group-level residual
 variance is 0, it's equivalent to ignoring food, and just using total
 folic acid as an individual predictor.  If the group-level residual
 variance is infinity, it's equivalent to estimating the original
 regression (with 84 predictors) using least squares.

 The difficulty is that the foods aren't groups in the usual sense,
 since persons are not nested within foods; rather, each person eats many
 foods, and this is reflected in the X matrix.

 Andrew

 Doran, Harold wrote:

  OK, I'm piecing this together a bit, sorry I'm not familiar with the
  article you cite. Let me try and fully understand the issue if you
  don't mind. Are you estimating each of the 82 foods as fixed effects?
  If so, in the example below this implies 84 total fixed effects (1 for
  each food type in the X matrix and then sex and age).
 
  I'm assuming that food type is nested within one of the 82 folic acid
  concentrations and then folic acid is treated as a random effect.
 
  Is this accurate?
 
 
  -Original Message-
  From:   Andrew Gelman [mailto:[EMAIL PROTECTED]
  Sent:   Sun 5/21/2006 9:17 AM
  To: Doran, Harold
  Cc: r-help@stat.math.ethz.ch; [EMAIL PROTECTED]
  Subject:Re: [R] Can lmer() fit a multilevel model embedded in
  a regression?
 
  Harold,
 
  I'm confused now.  Just for concretness, suppose we have 800 people, 82
  food items, and one predictor (folic, the folic acid concentration) at
  the food-item level.  Then DV will be a vector of length 800, foods is
  an 800 x 82 matrix, sex is a vector of length 800, age is a vector of
  length 800, and folic is a vector of length 82.  The vector of folic
  acid concentrations in individual diets is then just foods%*%folic,
  which I can call folic_indiv.
 
  How would I fit the model in lmer(), then?  There's some bit of
  understading that I'm still missing.
 
  Thanks.
  Andrew
 
 
  Doran, Harold wrote:
 
   Prof Gelman:
  
   I believe the answer is yes. It sounds as though persons are partially
   crossed 

[R] writing 100 files

2006-05-22 Thread Federico Calboli
Hi All,

I need to write out ~1000 variations of the same data frame as text
files, permuting a row each time.

I would like to use the function write.table() to write the files, and 
use a loop to do it:

for (i in 1:1000){

bb8[2,] = sample(bb8[2,])
write.table(bb8, quote = F, sep = '\t', row.names = F, col.names = F, 
file = 'whatever?.txt')
}
so that the files are called whatever1 ... whatever1000.

Any idea?

Cheers,

Federico

-- 
Federico C. F. Calboli
Department of Epidemiology and Public Health
Imperial College, St Mary's Campus
Norfolk Place, London W2 1PG

Tel  +44 (0)20 7594 1602 Fax (+44) 020 7594 3193

f.calboli [.a.t] imperial.ac.uk
f.calboli [.a.t] gmail.com



Re: [R] odd feature

2006-05-22 Thread Gabor Grothendieck
On 5/22/06, Martin Maechler [EMAIL PROTECTED] wrote:
  TL == Thomas Lumley [EMAIL PROTECTED]
  on Mon, 22 May 2006 07:09:09 -0700 (PDT) writes:

TL On Mon, 22 May 2006, Gabor Grothendieck wrote:
 Due to lazy evaluation, I don't think a and b are fully evaluated:

 ifelse(1, a <- 1, b <- 2)
 [1] 1
 a
 [1] 1
 b
 Error: object "b" not found

TL yes. If you look at the code for ifelse() it evaluates
TL the second argument if any test values are TRUE and the
TL third argument if any test values are FALSE, so in the
TL scalar case it does not evaluate both arguments.

 yes.
 And (Gabor), if you recall what I said earlier,  it's pretty clear
 that I would not even *consider* using ifelse() in the scalar
 case.  So my original statement was about the case ifelse() is
 designed for: The non-trivial vector case, and there my
 statement of fully evaluated  does apply.

 Martin


I don't think they are fully evaluated in the vector case either:

> ifelse(rep(TRUE,4), a <- 1:4, b <- 1:4)
[1] 1 2 3 4
> a
[1] 1 2 3 4
> b
Error: object "b" not found



Re: [R] writing 100 files

2006-05-22 Thread Duncan Murdoch
On 5/22/2006 11:24 AM, Federico Calboli wrote:
 Hi All,
 
 I need to write as text files 1000 ish variation of the same data frame, 
 once I permute a row.
 
 I would like to use the function write.table() to write the files, and 
 use a loop to do it:
 
 for (i in 1:1000){
 
 bb8[2,] = sample(bb8[2,])
 write.table(bb8, quote = F, sep = '\t', row.names = F, col.names = F, 
 file = 'whatever?.txt')
 }
 so all the files are called whatever1: whatever1000
 
 Any idea?

Use the paste() function to construct the name, e.g.

file = paste("whatever", i, ".txt", sep="")

Duncan Murdoch
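
Put together with the original loop, that becomes (a self-contained sketch: `bb8` here is a small made-up data frame, and only 3 files are written for illustration):

```r
setwd(tempdir())
bb8 <- data.frame(matrix(runif(20), nrow = 4))

for (i in 1:3) {
  bb8[2, ] <- sample(bb8[2, ])   # permute row 2, as in the original post
  write.table(bb8, quote = FALSE, sep = "\t",
              row.names = FALSE, col.names = FALSE,
              file = paste("whatever", i, ".txt", sep = ""))
}

list.files(pattern = "^whatever")
```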



Re: [R] win2k memory problem with merge()'ing repeatedly (long email)

2006-05-22 Thread bogdan romocea
Repeated merge()-ing does not always increase the space requirements
linearly. Keep in mind that a join between two tables where the same
value appears M and N times will produce M*N rows for that particular
value. My guess is that the number of rows in atot explodes because
you have some duplicate values in your files (having the same
duplicate date in each data frame would cause atot to contain 4, then
8, 16, 32, 64... rows for that date).
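
A tiny sketch of that M*N blow-up (made-up data frames; the same date appears twice in each input, and a third frame `d3` shows the doubling continuing):

```r
d1 <- data.frame(mdate = c("01/06/2005", "01/06/2005"), x = 1:2)
d2 <- data.frame(mdate = c("01/06/2005", "01/06/2005"), y = 3:4)
d3 <- data.frame(mdate = c("01/06/2005", "01/06/2005"), z = 5:6)

m1 <- merge(d1, d2, all = TRUE)   # 2 * 2 = 4 rows for the duplicated date
m2 <- merge(m1, d3, all = TRUE)   # 4 * 2 = 8 rows, and so on
c(nrow(m1), nrow(m2))             # 4 8
```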



Re: [R] odd feature

2006-05-22 Thread Uwe Ligges
Gabor Grothendieck wrote:

 On 5/22/06, Martin Maechler [EMAIL PROTECTED] wrote:
 
TL == Thomas Lumley [EMAIL PROTECTED]
on Mon, 22 May 2006 07:09:09 -0700 (PDT) writes:

   TL On Mon, 22 May 2006, Gabor Grothendieck wrote:
Due to lazy evaluation, I don't think a and b are fully evaluated:
   
ifelse(1, a <- 1, b <- 2)
[1] 1
a
[1] 1
b
Error: object b not found

   TL yes. If you look at the code for ifelse() it evaluates
   TL the second argument if any test values are TRUE and the
   TL third argument if any test values are FALSE, so in the
   TL scalar case it does not evaluate both arguments.

yes.
And (Gabor), if you recall what I said earlier,  it's pretty clear
that I would not even *consider* using ifelse() in the scalar
case.  So my original statement was about the case ifelse() is
designed for: The non-trivial vector case, and there my
statement of "fully evaluated" does apply.

Martin

 
 
 I don't think they are fully evaluated in the vector case either:
 
 
ifelse(rep(TRUE,4), a <- 1:4, b <- 1:4)
 
 [1] 1 2 3 4
 
a
 
 [1] 1 2 3 4
 
b
 
 Error: object b not found


Martin's "non-trivial" means that not all conditions are TRUE and not all
are FALSE; in that case both statements are FULLY evaluated, and he is right.

See the ifelse() code which is not that hard to read...

Uwe
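A minimal sketch illustrating the point (my example, not from the original thread; it assumes a fresh workspace where `a`, `b`, `p`, `q` are unbound):

```r
# Non-trivial vector case (test contains both TRUE and FALSE):
# both branch expressions are fully evaluated, so both side effects happen.
ifelse(c(TRUE, FALSE), a <- "yes", b <- "no")
exists("a")   # TRUE
exists("b")   # TRUE -- the 'no' branch was evaluated as well

# Trivial scalar case: only the branch that is needed gets evaluated.
ifelse(TRUE, p <- 1, q <- 2)
exists("p")   # TRUE
exists("q")   # FALSE
```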






Re: [R] writing 100 files

2006-05-22 Thread Uwe Ligges
Federico Calboli wrote:

 Hi All,
 
 I need to write as text files 1000-ish variations of the same data frame, 
 once I permute a row.
 
 I would like to use the function write.table() to write the files, and 
 use a loop to do it:
 
 for (i in 1:1000){
 
 bb8[2,] = sample(bb8[2,])
 write.table(bb8, quote = F, sep = '\t', row.names = F, col.names = F, 
 file = 'whatever?.txt')
 }
 so all the files are called whatever1: whatever1000
 
 Any idea?
 
 Cheers,
 
 Federico
 

This is the FAQ How can I save the result of each iteration in a loop 
into a separate file?

Please read the FAQs prior to posting!

Uwe Ligges



Re: [R] writing 100 files

2006-05-22 Thread Marc Schwartz (via MN)
On Mon, 2006-05-22 at 16:24 +0100, Federico Calboli wrote:
 Hi All,
 
 I need to write as text files 1000-ish variations of the same data frame, 
 once I permute a row.
 
 I would like to use the function write.table() to write the files, and 
 use a loop to do it:
 
 for (i in 1:1000){
 
 bb8[2,] = sample(bb8[2,])
 write.table(bb8, quote = F, sep = '\t', row.names = F, col.names = F, 
 file = 'whatever?.txt')
 }
 so all the files are called whatever1: whatever1000
 
 Any idea?
 
 Cheers,
 
 Federico


The same process used in R FAQ 7.34: 

  How can I save the result of each iteration in a loop into a separate
  file?

can be used here.  Instead of using save(), use write.table(), adjusting
the arguments accordingly.

HTH,

Marc Schwartz
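For readers without the FAQ at hand, a minimal sketch of that approach (the data frame here is a stand-in; "whatever" is the questioner's file-name stem, and `1:5` stands in for the real `1:1000`):

```r
bb8 <- data.frame(a = 1:3, b = 4:6, c = 7:9)  # stand-in for the real data
for (i in 1:5) {                              # 1:1000 in the real run
  bb8[2, ] <- sample(bb8[2, ])                # permute the second row
  write.table(bb8, quote = FALSE, sep = "\t",
              row.names = FALSE, col.names = FALSE,
              file = sprintf("whatever%d.txt", i))
}
```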



[R] RQuantlib Array processing applying EuropeanOptionExampleArray

2006-05-22 Thread Joe Byers
I am trying to replicate part of the EuropeanOptionExample using my 
date.  I have a data.frame containing all my inputs

atm.vols <- subset(data.vols, moneyness == min);
#Some days have the abs(moneyness) exactly between two strike prices.
#Traders will always price an option at the higher vol when selling it,
#so we will too.

atm.vols.max <- data.frame(vol.max=tapply(atm.vols$sigma, INDEX=factor(atm.vols$date), max));
#y <- date.mmddyy(as.numeric(row.names(atm.vols)));
#row.names(atm.vols.max) <- date.mmddyy(as.numeric(row.names(atm.vols.max)));
atm.work <- merge(atm.vols, atm.vols.max, by.x="date", by.y="row.names", all=TRUE)
#ok, get only the vols we need
atm.work <- subset(atm.work, sigma == vol.max);

atm.work has 749 rows.

I try and run the EuropeanOption example using atm.work$For_Price as my 
array of underlying prices and the other inputs from row 9 of atm.work.

i <- 9;
x <- EuropeanOption(type = "put", underlying = atm.work$For_Price, strike 
= atm.work$K[i],
   dividendYield = atm.work$BEY[i], riskFreeRate = 
atm.work$BEY[i], maturity = atm.work$t_exp[i],

   volatility = atm.work$sigma[i])

x$parameters has the array of underlying prices but the result is only 
a single vector using the first row of atm.work$For_Price.  Is this 
because I am pulling the inputs from data.frame not arrays?


Any help is greatly appreciated.

Thank you
Joe



Re: [R] odd feature

2006-05-22 Thread Martin Maechler
 Gabor == Gabor Grothendieck [EMAIL PROTECTED]
 on Mon, 22 May 2006 11:34:09 -0400 writes:

Gabor On 5/22/06, Martin Maechler [EMAIL PROTECTED] wrote:
  TL == Thomas Lumley [EMAIL PROTECTED]
  on Mon, 22 May 2006 07:09:09 -0700 (PDT) writes:
 
TL On Mon, 22 May 2006, Gabor Grothendieck wrote:
  Due to lazy evaluation, I don't think a and b are fully evaluated:
 
  ifelse(1, a <- 1, b <- 2)
  [1] 1
  a
  [1] 1
  b
  Error: object b not found
 
TL yes. If you look at the code for ifelse() it evaluates
TL the second argument if any test values are TRUE and the
TL third argument if any test values are FALSE, so in the
TL scalar case it does not evaluate both arguments.
 
 yes.
 And (Gabor), if you recall what I said earlier,  it's pretty clear
 that I would not even *consider* using ifelse() in the scalar
 case.  So my original statement was about the case ifelse() is
 designed for: The non-trivial vector case, and there my
 statement of "fully evaluated" does apply.
 
 Martin
 

Gabor I don't think they are fully evaluated in the vector case either:

 ifelse(rep(TRUE,4), a <- 1:4, b <- 1:4)
Gabor [1] 1 2 3 4
 a
Gabor [1] 1 2 3 4
 b
Gabor Error: object b not found

PLEASE 
--

I said "non-trivial" above ...
Martin



Re: [R] writing 100 files

2006-05-22 Thread Sachin J
Try this:
   
  x <- 1:12
for (i in 1:2){
bb8 = sample(x)
a <- sprintf("whatever%d.txt", i)
write.table(bb8, quote = F, sep = '\t', row.names = F, col.names = F,
file = a)
}

  HTH
   
  Sachin
  
Duncan Murdoch [EMAIL PROTECTED] wrote:
  On 5/22/2006 11:24 AM, Federico Calboli wrote:
 Hi All,
 
 I need to write as text files 1000-ish variations of the same data frame, 
 once I permute a row.
 
 I would like to use the function write.table() to write the files, and 
 use a loop to do it:
 
 for (i in 1:1000){
 
 bb8[2,] = sample(bb8[2,])
 write.table(bb8, quote = F, sep = '\t', row.names = F, col.names = F, 
 file = 'whatever?.txt')
 }
 so all the files are called whatever1: whatever1000
 
 Any idea?

Use the paste() function to construct the name, e.g.

file = paste("whatever", i, ".txt", sep="")

Duncan Murdoch







Re: [R] intervals from cut() as numerics?

2006-05-22 Thread Gavin Simpson
On Sat, 2006-05-20 at 15:44 +0200, Dimitrios Rizopoulos wrote:
 as an alternative, you can have a look inside cut.default and use the 
 part that produces the breaks, i.e.,
 
 breaks <- 10
 groups <- cut(x, breaks = breaks)
 max.bias <- as.vector(tapply(error, groups, mean))
 
 # from cut.default()
 nb <- as.integer(breaks + 1)
 dx <- diff(rx <- range(x, na.rm = TRUE))
 breaks <- round(seq(rx[1] - dx/1000, rx[2] + dx/1000, len = nb), 2)
 mat <- cbind(breaks[1:(nb - 1)], breaks[2:nb])
 
 plot(x, error, type = "n")
 abline(h = 0, col = "grey")
 panel.smooth(x, error)
 arrows(mat[, 1], max.bias, mat[, 2], max.bias, length = 0.05, 
 angle = 90, code = 3)
snip /

Thanks also to Dimitrios and Gabor for your suggestions. This suggestion
from Dimitrios is probably the one I'll use even though all three
suggestions give the same end result. I like this one because it seems
(to me) better to do the calculations than process the text labels
assigned to the list names, and because this will end up in a package to
go on CRAN I'd prefer not to add dependencies if I can avoid it -
especially if an alternative is available.

All the best,

Gav
-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
*  Note new Address, Telephone  Fax numbers from 6th April 2006  *
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson
ECRC  ENSIS  [t] +44 (0)20 7679 0522
UCL Department of Geography   [f] +44 (0)20 7679 0565
Pearson Building  [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street  [w] http://www.ucl.ac.uk/~ucfagls/cv/
London, UK.   [w] http://www.ucl.ac.uk/~ucfagls/
WC1E 6BT.
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [R] win2k memory problem with merge()'ing repeatedly (long email)

2006-05-22 Thread Sean O'Riordain
Thank you very much indeed Bogdan!

 a2[duplicated(a2$mdate),]
  value2 mdate
3180 2006-05-10
3220 2006-05-13
3240 2006-05-14
3260 2006-05-15
3280 2006-05-16

What a relief to know what is causing this problem... now to sort out
the root cause!

cheers and thanks again!
Sean


On 22/05/06, bogdan romocea [EMAIL PROTECTED] wrote:
 Repeated merge()-ing does not always increase the space requirements
 linearly. Keep in mind that a join between two tables where the same
 value appears M and N times will produce M*N rows for that particular
 value. My guess is that the number of rows in atot explodes because
 you have some duplicate values in your files (having the same
 duplicate date in each data frame would cause atot to contain 4, then
 8, 16, 32, 64... rows for that date).
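bogdan's point can be seen with a toy example (hypothetical data; one duplicated date per data frame doubles the matching rows on every merge):

```r
# Two rows per frame share the same date, so each merge() on 'mdate'
# produces M*N matches for that date: 2*2 rows, then 4*2, and so on.
a1 <- data.frame(mdate = c("2006-05-10", "2006-05-10"), value1 = 1:2)
a2 <- data.frame(mdate = c("2006-05-10", "2006-05-10"), value2 = 3:4)
a3 <- data.frame(mdate = c("2006-05-10", "2006-05-10"), value3 = 5:6)

atot <- merge(a1, a2, all = TRUE)
nrow(atot)                        # 4
atot <- merge(atot, a3, all = TRUE)
nrow(atot)                        # 8
```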


  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Sean O'Riordain
  Sent: Monday, May 22, 2006 10:12 AM
  To: r-help
  Subject: [R] win2k memory problem with merge()'ing repeatedly
  (long email)
 
  Good afternoon,
 
  I have 63 small .csv files which I process daily, and until two
  weeks ago they processed just fine and only took a matter of moments
  and had no noticeable memory problem.  Two weeks ago they
  reached 318 lines and my script broke.  There are some
  missing-values in some of the files.  I have tried hard many times
  over the last two weeks to create a small repeatable example to give
  you but I've failed - unless I use my data it works fine... :-(
 
  Am I missing something obvious? (again)
 
  A typical file has lines which look like:
  01/06/2005,1372
 
  Though there are three files which have two values (files 3,32,33) and
  these have lines which look like...
  01/06/2005,1766,
  or
  15/05/2006,289,114
 
  a1 <- read.csv("file1.csv", header=F)
  etc...
  a63 <- read.csv("file63.csv", header=F)
  names(a1) <- c("mdate", "file1.column.description")
 
  atot <- merge(a1, a2, all=T)
 
  followed by repeatedly doing...
  atot <- merge(atot, a3, all=T)
  atot <- merge(atot, a4, all=T)
  etc...
 
  I normally start R with --vanilla.
 
  What appears to happen is that atot doubles in size each iteration and
  just falls over due to lack of memory at about i=17... even though the
  total memory required for all of these individual a1...a63 is only
  1001384 bytes (doing an object.size() on a1..a63)
  at this point I've been trying to pin down this problem for two weeks
  and I just gave up...
 
  The following works fine as I'd expect with minimal memory usage...
 
  for (i in 3:67) {
  datelist <- as.Date(start.date)+0:(count-1)
  #remove a couple of elements...
  datelist <- datelist[-(floor(runif(nacount)*count))]
  a2 <- as.data.frame(datelist)
  names(a2) <- "mdate"
  vname <- paste("value", i, sep="")
  a2[vname] <- runif(length(datelist))
  #a2[floor(runif(nacount)*count), vname] <- NA
 
  # atot <- merge(atot, a2, all=T)
  i <- 2
  a.eval.text <- paste("merge(atot, a", i, ", all=T)", sep="")
  cat("a.eval.text is: -", a.eval.text, "-\n", sep="")
  atot <- eval(parse(text=a.eval.text))
 
  cat("i:", i, " ", gc(), "\n")
  }
 
  this works fine... but on my files (as per attached 'lastsave.txt'
  file) it just gobbles memory.
  Am I doing something wrong?  I (wrongly?) expected that repeatedly
  merge(atot,aN) would only increase the memory requirement linearly
  (with jumps perhaps as we go through a 2^n boundary)... which is what
  happens when merging simulated data.frames as above... no problem at
  all and it's really fast...
 
  The attached text file shows a (slightly edited) session where the
  memory required by the merge() operation just doubles with each use...
  and I can only allow it to run until i=17!!!
 
  I've even run it with gctorture() set on... with similar, but
  excruciatingly slow results...
 
  Is there any relevant info that I'm missing?  Unfortunately I am not
  able to post the contents of the files to a public list like this...
 
  As per a previous thread, I know that I can use a list to handle these
  dataframes - but I had difficulty with the syntax of a list of
  dataframes...
 
  I'd like to know why the memory requirements for this merge
  just explode...
 
  cheers, (and thanks in advance!)
  Sean O'Riordain
 
  ==
   version
 _
  platform   i386-pc-mingw32
  arch   i386
  os mingw32
  system i386, mingw32
  status Patched
  major  2
  minor  3.0
  year   2006
  month  05
  day09
  svn rev38014
  language   R
  version.string Version 2.3.0 Patched (2006-05-09 r38014)
  
  Running on Win2k with 1Gb ram.
 
  I also tried it (with the same results) on 2.2.1 and 2.3.0.
 
  
 
  R : 

[R] How can you buy R?

2006-05-22 Thread H. Skaug
About glmmADMB and GPL:

We were not very cautious when we put in the GPL statement.
What we wanted to say was that the use of glmmADMB is free, and
does not require a license for AD Model Builder.

Am I correct in interpreting this discussion so that all
we have to do is to remove the "License: GPL" statement
from the DESCRIPTION file (and everywhere else it may occur),
and there will be no conflict between glmmADMB and the
rules of the R community?

We have temporarily withdrawn glmmADMB until this question has been
settled.

hans



 Brian Ripley wrote:

The issue in the glmmADMB example is not if they were required to release
it under GPL (my reading from the GPL FAQ is that they probably were not,
given that communication is between processes and the R code is
interpreted).

Rather, it is stated to be under GPL _but_ there is no source code offer
for the executables (and the GPL FAQ says that for anonymous FTP it should
be downloadable via the same site, and the principles apply equally to
HTTP sites).  As the executables are not for my normal OS and I would like
to exercise my freedom to try the GPLed code, I have requested the sources
from the package maintainer.



[R] choosing a mirror

2006-05-22 Thread Erin Hodgess
Dear R People:

Is there a way to select a mirror from the command line
instead of via a menu, please?

I tried chooseCRANmirror but to no avail.

Thanks in advance.

R version 2.3.0 Windows.

Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]



Re: [R] choosing a mirror

2006-05-22 Thread Paulo Justiniano Ribeiro Jr

I guess you are using a function like install.packages() or similar
You can, for instance, use the argument repos,
e.g.
..., repos="http://cran.at.r-project.br")


On Mon, 22 May 2006, Erin Hodgess wrote:


Date: Mon, 22 May 2006 12:57:49 -0500
From: Erin Hodgess [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R]  choosing a mirror

Dear R People:

Is there a way to select a mirror from the command line
instead of via a menu, please?

I tried chooseCRANmirror but to no avail.

Thanks in advance.

R version 2.3.0 Windows.

Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]





Paulo Justiniano Ribeiro Jr
LEG (Laboratório de Estatística e Geoinformação)
Departamento de Estatística
Universidade Federal do Paraná
Caixa Postal 19.081
CEP 81.531-990
Curitiba, PR  -  Brasil
Tel: (+55) 41 3361 3573
Fax: (+55) 41 3361 3141
e-mail: [EMAIL PROTECTED]
http://www.est.ufpr.br/~paulojus

[R] Problem installing rgdal from source -- Makevars issue

2006-05-22 Thread Thomas Adams
List:

I have installed R, contributed packages, and supporting libraries in a 
non-standard location (but under the same sub-directory tree). This was 
necessary for systems reasons beyond my control. When I attempt to use 
Makevars (as shown below) to install rgdal from source, I get the 
following error:

R CMD INSTALL rgdal_0.4-3.tar.gz
.
.
.
checking proj_api.h usability... no
checking proj_api.h presence... no
checking for proj_api.h... no
libproj.a and proj_api.h not found in standard search locations,
edit src/Makevars manually, adding -I<directory with proj_api.h>
to PKG_CPPFLAGS = , and -L<directory with libproj.a> to PKG_LIBS =
ERROR: configuration failed for package 'rgdal'
** Removing '/awips/rep/lx/local_apps/R/lib/R/library/rgdal'
** Restoring previous '/awips/rep/lx/local_apps/R/lib/R/library/rgdal'

I get this error regardless of where I put the Makevars file: 
$HOME/.R/Makevars, in the local build directory, or in the rgdal source 
directory src/Makevars when running ./configure

My Makevars contains this:

PKG_CPPFLAGS = -I/awips/rep/lx/local_apps/proj4/include
PKG_LIBS = -L/awips/rep/lx/local_apps/proj4/lib

I have also tried using the file name: Makevars and 
Makevars-i686-pc-linux-gnu

I'm sure I am doing *something* wrong; any suggestions?

Regards,
Tom



-- 
Thomas E Adams
National Weather Service
Ohio River Forecast Center
1901 South State Route 134
Wilmington, OH 45177

EMAIL:  [EMAIL PROTECTED]

VOICE:  937-383-0528
FAX:937-383-0033



[R] (no subject)

2006-05-22 Thread yohannes alazar
   C/G GG CC  CG G/T GG TT GT C/T CC TT CT A/G AA GG AG A/C AA CC  AC A/T AA
TT AT




[R] editing a big file

2006-05-22 Thread yohannes alazar
I have a file that has 90 columns and 20,000 rows and looks like
   C/G CC GG CG G/T GG TT GT C/T CC TT CT A/G AA GG AG A/C AA CC  AC A/T AA
TT AT

I want to write code that reads through each row: it first looks at the
first column (e.g. "C/G", which gives the two letters), and then recodes the
three columns that follow -- 12 if the genotype is the same as the first
column (e.g. CG), 11 if it is a repeat of the first letter (e.g. CC), and 22
if it is a repeat of the second letter (e.g. GG).

Can you please give some hints on how I can start coding this, like command
names? I am not a programmer and am an R beginner. My output should look like
this

   11 22 12 11 22 12 11 22 12 11 22 12 11 22 12 11 22 12

Thank you in advance.




Re: [R] choosing a mirror

2006-05-22 Thread Thomas Lumley

On Mon, 22 May 2006, Paulo Justiniano Ribeiro Jr wrote:


I guess you are using a function like install.packages() or similar
You can, for instance, use the argument repos,
e.g.
..., repos="http://cran.at.r-project.br")


You can also set options(repos)

-thomas
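A sketch of the options() route (the mirror URL is just an example; substitute your nearest CRAN mirror):

```r
# Preset the CRAN mirror so install.packages() never prompts for one.
r <- getOption("repos")
r["CRAN"] <- "http://cran.r-project.org"   # example mirror URL
options(repos = r)
getOption("repos")[["CRAN"]]               # "http://cran.r-project.org"
```

Putting these lines in your .Rprofile makes the choice permanent across sessions.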




On Mon, 22 May 2006, Erin Hodgess wrote:


Date: Mon, 22 May 2006 12:57:49 -0500
From: Erin Hodgess [EMAIL PROTECTED]
To: r-help@stat.math.ethz.ch
Subject: [R]  choosing a mirror

Dear R People:

Is there a way to select a mirror from the command line
instead of via a menu, please?

I tried chooseCRANmirror but to no avail.

Thanks in advance.

R version 2.3.0 Windows.

Sincerely,
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: [EMAIL PROTECTED]






Paulo Justiniano Ribeiro Jr
LEG (Laboratório de Estatística e Geoinformação)
Departamento de Estatística
Universidade Federal do Paraná
Caixa Postal 19.081
CEP 81.531-990
Curitiba, PR  -  Brasil
Tel: (+55) 41 3361 3573
Fax: (+55) 41 3361 3141
e-mail: [EMAIL PROTECTED]
http://www.est.ufpr.br/~paulojus


Thomas Lumley   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]   University of Washington, Seattle

Re: [R] choosing a mirror

2006-05-22 Thread Romain Francois
On 22.05.2006 19:57, Erin Hodgess wrote:
 Dear R People:

 Is there a way to select a mirror from the command line
 instead of via a menu, please?

 I tried chooseCRANmirror but to no avail.

 Thanks in advance.

 R version 2.3.0 Windows.
   

See repos in ?options

-- 
visit the R Graph Gallery : http://addictedtor.free.fr/graphiques
mixmod 1.7 is released : http://www-math.univ-fcomte.fr/mixmod/index.php
+---+
| Romain FRANCOIS - http://francoisromain.free.fr   |
| Doctorant INRIA Futurs / EDF  |
+---+



Re: [R] random generation of a factor

2006-05-22 Thread Christian Bieli
Dear Dave and Roland

Thanks for your answers! I think sample() does the job I'm looking for.
I also came up with rmultinom(), but could not make it work, because 
I don't want several multinomially distributed vectors, but one vector 
with K levels of predefined proportions. Probably there is a way to do 
that with rmultinom() - Dave, could you make up a short example?

Best wishes
Christian

Rau, Roland schrieb:

Hi,


  

Does anybody know a function that generates a factor of 
length N with K 
levels given the proportions c(p1, p2,... ,pk)? It would not 



## does this code piece help you?

mydata <- c("yesterday", "today", "tomorrow")
myproportions <- c(0.3, 0.5, 0.2)

n <- 20
sample(x=mydata, size=n, replace=TRUE, prob=myproportions)

cat("Best,\nRoland\n")

--
This mail has been sent through the MPI for Demographic Research.  Should you 
receive a mail that is apparently from a MPI user without this text displayed, 
then the address has most likely been faked. If you are uncertain about the 
validity of this message, please check the mail header or ask your system 
administrator for assistance.


  




Re: [R] odd feature

2006-05-22 Thread ivo welch
Hi Thomas:

 One case where the vector-vector recycling rules are used is in
 vector-matrix operations:
 a <- 1:4
 b <- diag(4)
 a+b

is this last expression intended to be intuitive and thus desirable?
if anything else, I would end up writing something like this more as
an error than by intent.  I would suggest that recycling might be
better to be explicitly asked for by the user---though I am not sure
what the syntax should be.
   allow.recycle(a+b)
at the end, I think implicit recycling is a tradeoff between
convenience and unintended errors.  I would judge the convenience of
implicit recycling to be fairly low, and the unintended error rate
[with difficulty finding it] to be fairly high.
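For concreteness, a sketch (mine, not from the thread) of what the implicit recycling in the quoted vector-matrix example actually does, plus the one case where R does warn:

```r
a <- 1:4
b <- diag(4)
m <- a + b        # 'a' is recycled down each column of the 4x4 matrix
m[, 1]            # c(2, 2, 3, 4): a + the first column of b

# R only warns when the longer length is not a multiple of the shorter:
x <- 1:6 + 1:4    # warning: longer object length is not a multiple of
                  # shorter object length; x is c(2, 4, 6, 8, 6, 8)
```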

I guess the R language has few options() that control its
behavior---e.g., à la "use strict;" in Perl.  If it did, I would love
to turn off implicit recycling, provided there is an explicit recycle
possibility.  I would not mind having the ability to be forced to
define variables first, either, though this is not a big deal.  R is
pretty good in telling me when I use variables that have not been
defined.

regards,

/ivo




 -thomas




Re: [R] odd feature

2006-05-22 Thread Duncan Murdoch
On 5/22/2006 5:01 PM, ivo welch wrote:
 Hi Thomas:
 
 One case where the vector-vector recycling rules are used is in
 vector-matrix operations:
  a <- 1:4
  b <- diag(4)
  a+b
 
 is this last expression intended to be intuitive and thus desirable?
 if anything else, I would end up writing something like this more as
 an error than by intent.  I would suggest that recycling might be
 better to be explicitly asked for by the user---though I am not sure
 what the syntax should be.
allow.recycle(a+b)
 at the end, I think implicit recycling is a tradeoff between
 convenience and unintended errors.  I would judge the convenience of
 implicit recycling to be fairly low, and the unintended error rate
 [with difficulty finding it] to be fairly high.
 
 I guess the R language has few options() that control its
 behavior---e.g., à la "use strict;" in Perl.  If it did, I would love
 to turn off implicit recycling, provided there is an explicit recycle
 possibility.  

While I agree with you about these things in principle, this is a very 
old S design decision, and I think it can't really change now.  It would 
have to be a local change (i.e. in my namespace, do things strictly), 
and that's very unlikely to happen.

I would not mind having the ability to be forced to
  define variables first, either, though this is not a big deal.  R is
  pretty good in telling me when I use variables that have not been
  defined.

There are some code tools around that help to detect such things.

Duncan Murdoch



[R] Help with lagSeries (Rmetrics, fCalendar)

2006-05-22 Thread Fernando Saldanha
The function lagSeries calls the function timeSeries, which in turn
calls timeDate. (All three functions are in the package fCalendar)
During the call to timeDate the positions vector is altered. From
that point on one has problems doing arithmetic with the series. For
example,

[EMAIL PROTECTED]
 [1] 1954-02-01 01:00:00 1954-03-01 01:00:00 1954-04-01 01:00:00
 [4] 1954-05-01 01:00:00 1954-06-01 01:00:00 1954-07-01 01:00:00
 [7] 1954-08-01 01:00:00 1954-09-01 01:00:00 1954-10-01 01:00:00
[10] 1954-11-01 01:00:00
 ip.sa.lag <- lagSeries(ip.sa)
 [EMAIL PROTECTED]
 [1] 1954-02-01 02:00:00 1954-03-01 02:00:00 1954-04-01 02:00:00
 [4] 1954-05-01 02:00:00 1954-06-01 02:00:00 1954-07-01 02:00:00
 [7] 1954-08-01 02:00:00 1954-09-01 02:00:00 1954-10-01 02:00:00
[10] 1954-11-01 02:00:00
 ip.sa + ip.sa.lag
Error in Ops.timeSeries(ip.sa, ip.sa.lag) :
positions slot must match

My best guess for why this is happening is that timeDate thinks that
it is getting a date from the GMT time zone and converts it to the
Zurich time zone.

I am working with monthly data and I would rather avoid altogether
dealing with hours and minutes. Can this be done? I tried to build my
series with format = %Y-%m-%d but that did not help:

z <- timeSeries(matrix(c(1,2,3)), charvec = c("1954-02-01",
"1954-03-01", "1954-04-01"), format = "%Y-%m-%d")
 z
TS.1
1954-02-01 01:00:00    1
1954-03-01 01:00:00    2
1954-04-01 01:00:00    3

If I really have to put up with hours, minutes, and seconds, is there
a way to avoid having lagSeries change the hours as it did in the
first example above? I believe the arguments for lagSeries do not
include anything that would change the behavior of timeDate. So, I
guess, if there is a solution, it might be changing an environment
value. I set

Sys.putenv(TZ = "GMT")

as recommended, and of course that did not help.

Thanks for any advice.

FS



[R] off-topic; iraq statistics again

2006-05-22 Thread Albyn Jones
Gabor Grothendieck wrote:


I came across this one:

http://www.nysun.com/article/32787

which says that the violent death rate in Iraq (which presumably
includes violent deaths from the war) is lower than the violent
death rate in major American cities.

Does anyone have any insights from statistics on how to
interpret this? 

I finally had time to follow up on this.
The NY Sun article compares apples and oranges, ie US cities to all of Iraq.

For data on Baghdad, see
 http://www.iraqbodycount.org/press/pr13.php
Iraq Body Count Press Release 13, 9th March 2006. 

 Figures released by IBC today, updated by statistics for the year
 2005 from the main Baghdad morgue, show that the total number of
 civilians reported killed has risen year-on-year since May 1st 2003
 (the date that President Bush announced "major combat operations
 have ended"):

* 6,331 from 1st May 2003 to the first anniversary of the invasion, 
 19th March 2004 (324 days: Year 1)
* 11,312 from 20th March 2004 to 19th March 2005 (365 days: Year 2)
* 12,617 from 20th March 2005 to 1st March 2006 (346 days: Year 3).

According to several websites, the population of Baghdad is about 5 million.

According to R the violent death rate per 100,000 based on the 
year 3 data is

 (12617/5000000)*100000
[1] 252.34

For comparison, see http://www.infoplease.com/ipa/A0004902.html,
Crime Rates for Selected Large Cities, 2002: homicides per 100,000 per year.

New York, N.Y.   7.3
Los Angeles, Calif. 17.1
Chicago, Ill.   22.1
Houston, Tex.   12.5
Philadelphia, Pa.   18.9
Phoenix, Ariz.  12.6
San Diego, Calif.3.7
Dallas, Tex.15.8
San Antonio, Tex.8.4
Las Vegas, Nev.3    11.9
Detroit, Mich.  41.8
San Jose, Calif. 2.8
Honolulu, Hawaii 2.0
Indianapolis, Ind.3 13.9
San Francisco, Calif.8.4
etc...

regards

albyn


Re: [R] Problem installing rgdal from source -- Makevars issue

2006-05-22 Thread Roger Bivand
On Mon, 22 May 2006, Thomas Adams wrote:

 List:
 
 I have installed R, contributed packages, and supporting libraries in a 
 non-standard location (but under the same sub-directory tree). This was 
 necessary for systems reasons beyond my control. When I attempt to use 
 Makevars (as shown below) to install rgdal from source, I get the 
 following error:
 
 R CMD INSTALL rgdal_0.4-3.tar.gz

This is a rather specific question, I'll reply off-list tomorrow (my
time), because the advice given in the output you can see doesn't help.
The ./configure file cannot know that the rgdal/src/Makevars file has been
set manually, and carries on stopping you. 

Questions like this can preferably be sent directly to the package 
maintainer.

Roger

 .
 .
 .
 checking proj_api.h usability... no
 checking proj_api.h presence... no
 checking for proj_api.h... no
 libproj.a and proj_api.h not found in standard search locations,
 edit src/Makevars manually, adding -Idirectory with proj_api.h
 to PKG_CPPFLAGS = , and -Ldirectory with libproj.a to PKG_LIBS =
 ERROR: configuration failed for package 'rgdal'
 ** Removing '/awips/rep/lx/local_apps/R/lib/R/library/rgdal'
 ** Restoring previous '/awips/rep/lx/local_apps/R/lib/R/library/rgdal'
 
 I get this error regardless of where I put the Makevars file: 
 $HOME/.R/Makevars, in the local build directory, or in the rgdal source 
 directory as src/Makevars before running ./configure
 
 My Makevars contains this:
 
 PKG_CPPFLAGS = -I/awips/rep/lx/local_apps/proj4/include
 PKG_LIBS = -L/awips/rep/lx/local_apps/proj4/lib
 
 I have also tried using the file name: Makevars and 
 Makevars-i686-pc-linux-gnu
 
 I'm sure I am doing *something* wrong; any suggestions?
 
 Regards,
 Tom
 
 
 
 

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: [EMAIL PROTECTED]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Subset a list

2006-05-22 Thread Doran, Harold
I have a data frame of ~200 columns and ~20,000 rows where each column
consists of binary responses (0,1) and a 9 for missing data. I am
interested in finding the columns for which there are fewer than 100
individuals with responses of 0. 

I can use an apply function to generate a table for each column, but I'm
not certain whether I can subset a list based on some criterion as
subset() is designed for vectors, matrices or dataframes.

For example, I can use the following:
tt <- apply(data, 2, table)

Which returns an object of class list. Here is some sample output from
tt

$R0235940b

0 1 9 
 2004  1076 15361 

$R710a

0 9 
2 18439 

$R710b

0 1 9 
   3941 11167 

tt$R710a meets my criteria and I would want to be able to easily
find this instead of rolling through the entire output. Is there a way
to subset this list to identify the columns which meet the criteria I
note above?


Thanks,
Harold




[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Subset a list

2006-05-22 Thread Marc Schwartz (via MN)
On Mon, 2006-05-22 at 17:55 -0400, Doran, Harold wrote:
 I have a data frame of ~200 columns and ~20,000 rows where each column
 consists of binary responses (0,1) and a 9 for missing data. I am
 interested in finding the columns for which there are fewer than 100
 individuals with responses of 0. 
 
 I can use an apply function to generate a table for each column, but I'm
 not certain whether I can subset a list based on some criterion as
 subset() is designed for vectors, matrices or dataframes.
 
 For example, I can use the following:
 tt <- apply(data, 2, table)
 
 Which returns an object of class list. Here is some sample output from
 tt
 
 $R0235940b
 
 0 1 9 
  2004  1076 15361 
 
 $R710a
 
 0 9 
 2 18439 
 
 $R710b
 
 0 1 9 
    3941 11167 
 
 tt$R710a meets my criteria and I would want to be able to easily
 find this instead of rolling through the entire output. Is there a way
 to subset this list to identify the columns which meet the criteria I
 note above?
 
 
 Thanks,
 Harold

Harold,

How about this:

> DF
   V1 V2 V3 V4 V5
1   0  1  0  1  0
2   0  0  1  0  1
3   0  0  1  1  0
4   1  1  0  0  1
5   1  1  1  1  0
6   0  1  0  1  1
7   0  1  1  1  0
8   0  1  0  0  0
9   0  0  1  1  0
10  1  0  0  1  1

# Find the columns with < 5 0's
> which(sapply(DF, function(x) sum(x == 0)) < 5)
V2 V4
 2  4


So in your case, just replace the DF with your data frame name and the 5
with 100.

HTH,

Marc Schwartz
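
[An additional sketch, not part of the original exchange: if one wants to keep working with the list of tables tt from the original post rather than the data frame, the list itself can be filtered. Column names and table shapes are assumed from the output shown above; a table with no "0" cell trivially satisfies "fewer than 100 zeros".]

```r
## Editorial sketch (assumed names): keep only the tables whose "0" count
## is below 100; tables lacking a "0" cell index to NA and are kept.
keep <- sapply(tt, function(tab) is.na(tab["0"]) || tab["0"] < 100)
tt[keep]
```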

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] Ordinal Independent Variables

2006-05-22 Thread Rick Bilonick
When I run lrm from the Design package, I get a warning about
contrasts when I include an ordinal variable:

Warning message:
Variable ordfac is an ordered factor.
 You should set
options(contrasts=c("contr.treatment","contr.treatment"))
or Design will not work properly. in: Design(eval(m, sys.parent()))

I don't get this message if I use glm with family=binomial. It produces
linear and quadratic contrasts.

If it's improper to do this for an ordinal variable, why does glm not
balk?

Rick B.

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] sort the values of members and obtain their ranks within groups

2006-05-22 Thread Zhen Zhang
Dear all,

I would like to sort the values of member for each group, and obtain a
variable to indicate its rank within the group.

For example, we have original dataset as follows:

df <- data.frame(group = c(rep(1, 3), rep(5, 3)),
 member = c(30, 10, 22, 21, 44, 15))

  group member
1 1 30
2 1 10
3 1 22
4 5 21
5 5 44
6 5 15

After sorting the variable member, we want the dataset like this:

  group member order
1 1 30 3
2 1 10 1
3 1 22 2
4 5 21 2
5 5 44 3
6 5 15 1

I already searched the R helper archives, but failed to find the proper
answers.

Could you please kindly give me some suggestions, avoiding a loop because
of the large sample size?  Thanks a lot.


Best regards,

Zhen Zhang

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Subset a list

2006-05-22 Thread Chuck Cleland
Doran, Harold wrote:
 I have a data frame of ~200 columns and ~20,000 rows where each column
 consists of binary responses (0,1) and a 9 for missing data. I am
 interested in finding the columns for which there are fewer than 100
 individuals with responses of 0. 
 
 I can use an apply function to generate a table for each column, but I'm
 not certain whether I can subset a list based on some criterion as
 subset() is designed for vectors, matrices or dataframes.
 
 For example, I can use the following:
 tt <- apply(data, 2, table)
 
 Which returns an object of class list. Here is some sample output from
 tt
 
 $R0235940b
 
 0 1 9 
  2004  1076 15361 
 
 $R710a
 
 0 9 
 2 18439 
 
 $R710b
 
 0 1 9 
    3941 11167 
 
 tt$R710a meets my criteria and I would want to be able to easily
 find this instead of rolling through the entire output. Is there a way
 to subset this list to identify the columns which meet the criteria I
 note above?

How about this?

newdf <- mydf[, colSums(mydf == 0) < 100]

 Thanks,
 Harold
 
 
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Ordinal Independent Variables

2006-05-22 Thread Frank E Harrell Jr
Rick Bilonick wrote:
 When I run lrm from the Design package, I get a warning about
 contrasts when I include an ordinal variable:
 
 Warning message:
 Variable ordfac is an ordered factor.
  You should set
 options(contrasts=c("contr.treatment","contr.treatment"))
 or Design will not work properly. in: Design(eval(m, sys.parent()))
 
 I don't get this message if I use glm with family=binomial. It produces
 linear and quadratic contrasts.
 
 If it's improper to do this for an ordinal variable, why does glm not
 balk?
 
 Rick B.

Standard regression methods don't make good use of ordinal predictors 
and just have to treat them as categorical.  Design is a bit picky about 
this.  If the predictor has numeric scores for the categories, you can 
get a test of adequacy of the scores (with k-2 d.f. for k categories) by 
using scored(predictor) in the formula.  Or just create a factor( ) 
variable to hand to Design.
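
[For context, an editorial illustration not from the original message: the linear and quadratic contrasts that glm produces come from contr.poly, R's default contrast for ordered factors.]

```r
## Editorial sketch: an ordered factor gets polynomial contrasts
## (contr.poly) by default, hence the .L and .Q terms seen in glm output.
ordfac <- ordered(c("low", "mid", "high"), levels = c("low", "mid", "high"))
contrasts(ordfac)   # columns .L (linear) and .Q (quadratic)
```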
-- 
Frank E Harrell Jr   Professor and Chair   School of Medicine
  Department of Biostatistics   Vanderbilt University

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sort the values of members and obtain their ranks within groups

2006-05-22 Thread jim holtman
> x
  group member
1 1 30
2 1 10
3 1 22
4 5 21
5 5 44
6 5 15
> x$order <- NA
> invisible(tapply(seq(nrow(x)), x$group, function(y) x$order[y] <<-
rank(x$member[y])))
> x
  group member order
1 1 30 3
2 1 10 1
3 1 22 2
4 5 21 2
5 5 44 3
6 5 15 1



On 5/22/06, Zhen Zhang [EMAIL PROTECTED] wrote:

 Dear all,

 I would like to sort the values of member for each group, and obtain a
 variable to indicate its rank within the group.

 For example, we have original dataset as follows:

 df <- data.frame(group = c(rep(1, 3), rep(5, 3)),
 member = c(30, 10, 22, 21, 44, 15))

 group member
 1 1 30
 2 1 10
 3 1 22
 4 5 21
 5 5 44
 6 5 15

 After sorting the variable member, we want the dataset like this:

 group member order
 1 1 30 3
 2 1 10 1
 3 1 22 2
 4 5 21 2
 5 5 44 3
 6 5 15 1

 I already searched the R helper archives, but failed to find the proper
 answers.

 Could you please kindly give me some suggestions, avoiding a loop
 because
 of the large sample size?  Thanks a lot.


 Best regards,

 Zhen Zhang

[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] How to call a value labels attribute?

2006-05-22 Thread Heinz Tuechler
Dear All,

after searching on CRAN I got the impression that there is no standard way
in R to label values of a numerical variable.
Since this would be useful for me I intend to create such an attribute, at
the moment for my personal use.
Still I would like to choose a name which does not conflict with names of
commonly used attributes.

Would value.labels or vallabs create conflicts?

The attribute should be structured as data.frame with two columns, levels
(numeric) and labels (character). These could then also be used to
transform from numeric to factor. If the attribute is copied to the factor
variable it could also serve to retransform the factor to the original
numerical variable.

Comments? Ideas?

Thanks

Heinz Tüchler
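
[One possible shape for such an attribute, sketched editorially under the structure Heinz describes; the attribute name "value.labels" and all object names are assumptions, not an established convention.]

```r
## Editorial sketch of the proposed attribute: a two-column data.frame
## attached as "value.labels", reusable for numeric -> factor conversion.
x <- c(1, 2, 1, 3)
attr(x, "value.labels") <- data.frame(levels = c(1, 2, 3),
                                      labels = c("low", "mid", "high"))
vl <- attr(x, "value.labels")
f  <- factor(x, levels = vl$levels, labels = vl$labels)  # numeric -> factor
```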

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] standardization of values before call to pam() or clara()

2006-05-22 Thread Dylan Beaudette
Greetings,

Experimenting with the cluster package, and am starting to scratch my head in 
regards to the *best* way to standardize my data. Both functions can 
pre-standardize columns in a dataframe. according to the manual:

Measurements are standardized for each variable (column), by subtracting the 
variable's mean value and dividing by the variable's mean absolute deviation. 

This works well when input variables are all in the same units. When I include 
new variables with a different intrinsic range, the ones with the largest 
relative values tend to be weighted more heavily. This is certainly not 
surprising, but it complicates things. 

Does there exist a robust technique to effectively re-scale each of the 
variables, regardless of their intrinsic range, to some set range, say 
[0,1]?

I have tried dividing a variable by the maximum value of that variable, but I 
am not sure if this is statistically correct. 

Any ideas, thoughts would be greatly appreciated.

Cheers,

-- 
Dylan Beaudette
Soils and Biogeochemistry Graduate Group
University of California at Davis
530.754.7341
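
[One common rescaling, sketched editorially and not from the thread: range (min-max) scaling of each column to [0,1]. Whether it is appropriate depends on outliers, which is one reason the cluster functions standardize by the mean absolute deviation instead. The data frame name df is a placeholder.]

```r
## Editorial sketch: min-max rescaling of every numeric column to [0, 1].
rescale01 <- function(x) {
  r <- range(x, na.rm = TRUE)
  (x - r[1]) / (r[2] - r[1])
}
df.scaled <- as.data.frame(lapply(df, rescale01))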

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Cross correlation/ bivariate/ mantel

2006-05-22 Thread Spencer Graves
  RSiteSearch("mantel statistic") produced 27 hits for me, but the 
first few suggested I might not find enlightenment there. 
RSiteSearch("spatial statistics") returned 1064 hits, so I restricted 
that to RSiteSearch("spatial statistics", "functions").  The first of 
those was a directory for the 'spdep' package.  A search for "Mantel" on 
that page identified "plot.mc.sim:  Mantel-Hubert spatial general cross 
product statistic".  I haven't tried it.  I know nothing more about it 
than what I've already told you.

  I hope you'll forgive the terrible pun if I say that I hope I not 
only gave you a fish, I showed you how I caught it.

  Best Wishes,
  Spencer Graves
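
[An additional pointer, editorial and not from the thread: the vegan package also provides a mantel() function for comparing two distance matrices. Object names here are assumptions based on the question.]

```r
## Editorial sketch (assumed names): Mantel test of two distance matrices
## built from egg and larva counts at the same 280 stations, via vegan.
library(vegan)
d.eggs   <- dist(eggs)     # 'eggs', 'larvae': numeric vectors, one per station
d.larvae <- dist(larvae)
mantel(d.eggs, d.larvae, permutations = 999)
```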

McClatchie, Sam (PIRSA-SARDI) wrote:
 Background:
 OS: Linux Ubuntu Dapper
 release: R 2.3.0
 editor: GNU Emacs 21.4.1
 front-end: ESS 5.2.3
 -
 Colleagues

   I have two spatial datasets (latitude, 
longitude, fish eggs) and (latitude, longitude,
fish larvae) at the same 280 stations (i.e. 280
cases). I want to determine if the 2 datasets
are spatially correlated. In other words, do
high numbers of larvae occur where there are
high numbers of eggs? I would like to calculate
the cross correlation for these bivariate data
and calculate a Mantel statistic as described
on pg. 147 of Fortin and Dale 2005 Spatial
analysis.
 
   My search of R packages came up with 
acf and ccf functions but I don't think these
are what I want. Does anyone know which spatial
package I might find the appropriate test,
please?
 
 Best fishes

 Sam
 
 Sam McClatchie,
 Biological oceanography 
 South Australian Aquatic Sciences Centre
 PO Box 120, Henley Beach 5022
 Adelaide, South Australia
 email [EMAIL PROTECTED]
 Cellular: 0431 304 497 
 Telephone: (61-8) 8207 5448
 FAX: (61-8) 8207 5481
 Research home page http://www.members.iinet.net.au/~s.mcclatchie/
   

 
   [[alternative HTML version deleted]]
 
 
 
 
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Cross correlation/ bivariate/ mantel

2006-05-22 Thread McClatchie, Sam \(PIRSA-SARDI\)
Thanks Spencer

"Give a wo/man a fish and s/he becomes an aid dependent; give a wo/man a rod 
and s/he becomes a fisherman."

Sam 

-Original Message-
From: Spencer Graves [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, 23 May 2006 9:58 AM
To: McClatchie, Sam (PIRSA-SARDI)
Cc: r-help@stat.math.ethz.ch
Subject: Re: [R] Cross correlation/ bivariate/ mantel

 RSiteSearch("mantel statistic") produced 27 hits for 
me, but the first few suggested I might not find enlightenment there. 
RSiteSearch("spatial statistics") returned 1064 hits, so I 
restricted that to RSiteSearch("spatial statistics", 
"functions").  The first of those was a directory for the 
'spdep' package.  A search for "Mantel" on that page 
identified "plot.mc.sim:  Mantel-Hubert spatial general cross 
product statistic".  I haven't tried it.  I know nothing more 
about it than what I've already told you.

 I hope you'll forgive the terrible pun if I say that 
I hope I not only gave you a fish, I showed you how I caught it.

 Best Wishes,
 Spencer Graves

McClatchie, Sam (PIRSA-SARDI) wrote:
 Background:
 OS: Linux Ubuntu Dapper
 release: R 2.3.0
 editor: GNU Emacs 21.4.1
 front-end: ESS 5.2.3
 -
 Colleagues

  I have two spatial datasets (latitude,
longitude, fish eggs) and (latitude, longitude, fish larvae) 
at the same 280 stations (i.e. 280 cases). I want to determine 
if the 2 datasets are spatially correlated. In other words, do 
high numbers of larvae occur where there are high numbers of 
eggs? I would like to calculate the cross correlation for 
these bivariate data and calculate a Mantel statistic as 
described on pg. 147 of Fortin and Dale 2005 Spatial analysis.
 
  My search of R packages came up with
acf and ccf functions but I don't think these are what I want. 
Does anyone know which spatial package I might find the 
appropriate test, please?
 
 Best fishes

 Sam
 
 Sam McClatchie,
 Biological oceanography
 South Australian Aquatic Sciences Centre PO Box 120, Henley Beach 
 5022 Adelaide, South Australia email 
 [EMAIL PROTECTED]
 Cellular: 0431 304 497
 Telephone: (61-8) 8207 5448
 FAX: (61-8) 8207 5481
 Research home page http://www.members.iinet.net.au/~s.mcclatchie/
   

 
  [[alternative HTML version deleted]]
 
 
 
 
--
 --
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! 
 http://www.R-project.org/posting-guide.html


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RQuantlib

2006-05-22 Thread Joe Byers

RQuantlib Experrs

I am trying to perform some analysis on a dataset of options.
I got RQuantlib to work using a for loop over my data.frame.  I tried 
the sapply, lapply, and other apply functions to no avial.


My first error occurs with the following
x <- sapply(X=atm.work, FUN=EuropeanOption, underlying = 
atm.work$For_Price, strike = atm.work$K,
   dividendYield = atm.work$BEY, riskFreeRate = atm.work$BEY, 
maturity = atm.work$t_exp,

   volatility = atm.work$sigma)
Note, the type= is not specified.  The function picks up my date column 
in the data.frame as the type= column.
> x <- sapply(X=atm.work, FUN=EuropeanOption, underlying = 
atm.work$For_Price, strike = atm.work$K,
+ dividendYield = atm.work$BEY, riskFreeRate = atm.work$BEY, maturity = 
atm.work$t_exp,

+ volatility = atm.work$sigma)
Error in EuropeanOption.default(X[[2]], ...) :
   Unexpected option type 2Jan97, aborting

When I include a column named type with "put" in all rows, I get the 
following error:
> x <- sapply(X=atm.work, FUN=EuropeanOption, underlying = 
atm.work$For_Price, strike = atm.work$K,
+ dividendYield = atm.work$BEY, riskFreeRate = atm.work$BEY, maturity = 
atm.work$t_exp,

+ volatility = atm.work$sigma, type=atm.work$type)
Error in FUN(X[[1]], ...) : unused argument(s) ( ...)

Using type="put" or type=rep("put", nrows) also gets this unused-argument 
error.


I checked the function call to see what happens when I do not include the 
correct number of parameters.  I received
error messages detailing the call order of EuropeanOption's parameters. 


Thank you
Joe


__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html

[R] lattice package - question on plotting lines

2006-05-22 Thread Tim Smith
Hi all,
   
  I was trying to plot a graph using the following data:
   
  method percent accuracy group
A1 4 0.8529 cns
A1 10 0.8412 cns
A1 15 0.8235 cns
A2 4 0.9353 cns
A2 10 0.9412 cns
A2 15 0.9471 cns
A1 4 0.8323 col
A1 10 0.8452 col
A1 15 0.8484 col
A2 4 0.8839 col
A2 10 0.8677 col
A2 15 0.8678 col
   
  # 
  The code I'm using to generate the graphs is:
   
  ### code : 
   
  xyplot(accuracy ~ percent|group, data = k, groups = method, allow.multiple = 
TRUE, scales = "same", type = "o", 
  xlab = "% of genes", ylab = "Accuracy", main = "Accuracy", aspect = 2/3,
panel = function(...) { 
  panel.superpose(...) 
} 
)
   
   
  
   
  I have tried to use the 'get' and 'set' functions of 'par()', but can't 
figure it out. I have two questions:
   
  i) How can I set it so that the line plotted for A1 in all the plots is 
'solid', and the one for A2 is 'dotted' for both the groups (cns and col)
   
  ii) How can I set the abline feature to show a horizontal line at different 
points on the y-axis for each of the two graphs?
   
  Any help would be highly appreciated.
   
  many thanks.
   
   
   
   




[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] nls: formula error?

2006-05-22 Thread H. Paul Benton
So thanks for the help,

I have a matrix (AB) whose first column holds my bin numbers, from -4 to +4
in 0.1 bin units. Then I have in the second column the frequency from some
data. I have plotted them and they look roughly Gaussian. So I want to fit
them/ find/optimize mu, sigma, and A. 

So I call the nls function : 

nls_AB <- nls(x ~ (A/sig*sqrt(2*pi))* exp(-1*((x-mu)^2/(2*
sig^2))),data=temp, start= list(A=0.1, mu=0.01, sig=0.5), trace=TRUE)
Error in eval(expr, envir, enclos) : numeric 'envir' arg not of length one

Temp looks like this:
   bin   x
 [1,] -4.0   0
 [2,] -3.9   0
 [3,] -3.8   0
.etc
[41,]  0.0 241
[42,]  0.1 229
[43,]  0.2 258
[44,]  0.3 305
[45,]  0.4 370
[46,]  0.5 388

So I don't understand my error message. I looked at doing 
class(fo <- (x ~ (A/sig*sqrt(2*pi))* exp(-1*((x-mu)^2/(2* sig^2)))))
terms(fo)

and that seems to work. So if anyone has any ideas I would welcome them. 

Cheers,
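
[A likely cause, offered as an editorial guess and not confirmed in the thread: the formula uses x on both sides, and temp is a matrix rather than a data.frame; nls wants the predictor (bin) on the right-hand side and a data.frame or list for data. A hedged sketch, with start values assumed; note the parenthesisation of the Gaussian normalisation also differs from the original post.]

```r
## Editorial sketch of a likely fix: model frequency x as a function of
## bin, and pass a data.frame, not a matrix, to nls().
temp.df <- as.data.frame(temp)            # columns: bin, x
nls_AB  <- nls(x ~ (A/(sig*sqrt(2*pi))) * exp(-((bin - mu)^2/(2*sig^2))),
               data = temp.df,
               start = list(A = 400, mu = 0.3, sig = 0.5), trace = TRUE)
```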

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] editing a big file

2006-05-22 Thread jim holtman
Here is one way of doing it:

> x <- c("C/G", "CC", "GG", "CG", "G/T", "GG", "TT", "GT", "C/T", "CC",
+ "TT", "CT", "A/G", "AA", "GG", "AG", "A/C", "AA", "CC", "AC",
+ "A/T", "AA", "TT", "AT")
> # convert to a matrix with 4 columns
> x <- matrix(x, ncol=4, byrow=TRUE)
> x
     [,1]  [,2] [,3] [,4]
[1,] "C/G" "CC" "GG" "CG"
[2,] "G/T" "GG" "TT" "GT"
[3,] "C/T" "CC" "TT" "CT"
[4,] "A/G" "AA" "GG" "AG"
[5,] "A/C" "AA" "CC" "AC"
[6,] "A/T" "AA" "TT" "AT"
> # now substitute for the last 3 columns
> result <- apply(x, 1, function(a){
+ # substitute for the first character
+ d1 <- gsub(substr(a[1], 1, 1), '1', a[2:4])
+ # substitute for the second
+ gsub(substr(a[1], 3, 3), '2', d1)
+ })
> result
     [,1] [,2] [,3] [,4] [,5] [,6]
[1,] "11" "11" "11" "11" "11" "11"
[2,] "22" "22" "22" "22" "22" "22"
[3,] "12" "12" "12" "12" "12" "12"
> # output as vector
> as.integer(as.vector(result))
 [1] 11 22 12 11 22 12 11 22 12 11 22 12 11 22 12 11 22 12




On 5/22/06, yohannes alazar [EMAIL PROTECTED] wrote:

 I have a file that has 90 columns and 20,000 rows and looks like
   C/G CC GG CG G/T GG TT GT C/T CC TT CT A/G AA GG AG A/C AA CC  AC A/T AA
 TT AT

 I want to write a code that will read through each row first the first
 looks
 at the first column and then replace the three columns with 12 if it is
 the
 same as the first column e.g. third column 11 if it is a repeat of the
 first
 alphabet like the second column and 22 if it a repeat of the second
 alphabet.

 Can you please give some hint how I can start coding for this, like
 command
 names, I am not a programmer and an R beginner. my out put should look
 like
 this

   11 22 12 11 22 12 11 22 12 11 22 12 11 22 12 11 22 12

 Thank you in advance.

[[alternative HTML version deleted]]

 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide!
 http://www.R-project.org/posting-guide.html




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390 (Cell)
+1 513 247 0281 (Home)

What is the problem you are trying to solve?

[[alternative HTML version deleted]]

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] sort the values of members and obtain their ranks within groups

2006-05-22 Thread Liaw, Andy
See if this does what you want:
 
> cbind(df, rank=ave(df$member, df$group, FUN=rank))
  group member rank
1     1     30    3
2     1     10    1
3     1     22    2
4     5     21    2
5     5     44    3
6     5     15    1

Andy

  _  

From: [EMAIL PROTECTED] on behalf of Zhen Zhang
Sent: Mon 5/22/2006 6:40 PM
To: R-help@stat.math.ethz.ch
Subject: [R] sort the values of members and obtain their ranks within groups
[Broadcast]



Dear all, 

I would like to sort the values of member for each group, and obtain a 
variable to indicate its rank within the group. 

For example, we have original dataset as follows: 

df <- data.frame(group = c(rep(1, 3), rep(5, 3)), 
 member = c(30, 10, 22, 21, 44, 15)) 

  group member 
1 1 30 
2 1 10 
3 1 22 
4 5 21 
5 5 44 
6 5 15 

After sorting the variable member, we want the dataset like this: 

  group member order 
1 1 30 3 
2 1 10 1 
3 1 22 2 
4 5 21 2 
5 5 44 3 
6 5 15 1 

I already searched the R helper archives, but failed to find the proper 
answers. 

Could you please kindly give me some suggestions, avoiding a loop because 
of the large sample size?  Thanks a lot. 


Best regards, 

Zhen Zhang 

[[alternative HTML version deleted]] 

__ 
R-help@stat.math.ethz.ch mailing list 
https://stat.ethz.ch/mailman/listinfo/r-help
https://stat.ethz.ch/mailman/listinfo/r-help  
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
http://www.R-project.org/posting-guide.html

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] lattice package - question on plotting lines

2006-05-22 Thread Tony
On Mon, 2006-05-22 at 18:36 -0700, Tim Smith wrote:
 Hi all,

   I was trying to plot a graph using the following data:

   method percent accuracy group
 A1 4 0.8529 cns
 A1 10 0.8412 cns
 A1 15 0.8235 cns
 A2 4 0.9353 cns
 A2 10 0.9412 cns
 A2 15 0.9471 cns
 A1 4 0.8323 col
 A1 10 0.8452 col
 A1 15 0.8484 col
 A2 4 0.8839 col
 A2 10 0.8677 col
 A2 15 0.8678 col

   # 
   The code I'm using to generate the graphs is:

   ### code : 

   xyplot(accuracy ~ percent|group, data = k, groups = method, allow.multiple = 
 TRUE, scales = "same", type = "o", 
   xlab = "% of genes", ylab = "Accuracy", main = "Accuracy", aspect = 2/3,
 panel = function(...) { 
   panel.superpose(...) 
 } 
 )


   

   I have tried to use the 'get' and 'set' functions of 'par()', but can't 
 figured it out. I have two questions:

   i) How can I set it so that the line plotted for A1 in all the plots is 
 'solid', and the one for A2 is 'dotted' for both the groups (cns and col)

   ii) How can I set the abline feature to show a horizontal line at different 
 points on the y-axis for each of the two graphs?

   Any help would be highly appreciated.

   many thanks.




 
 
 
   
   [[alternative HTML version deleted]]
 
 __
 R-help@stat.math.ethz.ch mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
 

Does this get you closer?

xyplot(accuracy ~ percent|group, data = k, groups = method,
allow.multiple = TRUE, scales = "same", type = "o",
  par.settings=list(superpose.line = list(lty=1:2)),   # added
  xlab = "% of genes", ylab = "Accuracy", main = "Accuracy", aspect =
2/3,
panel = function(...) { 
  panel.superpose(...) 
} 
)
-- 
Tony [EMAIL PROTECTED]
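
[An editorial sketch for question (ii), which went unanswered above: draw a horizontal reference line at a different height in each panel by indexing a vector of heights with panel.number(). The hvals values are assumptions.]

```r
## Editorial sketch: per-panel horizontal reference lines via panel.abline
## and panel.number(); 'hvals' (one height per panel) is assumed.
hvals <- c(0.90, 0.85)   # heights for the cns and col panels, respectively
xyplot(accuracy ~ percent | group, data = k, groups = method, type = "o",
       par.settings = list(superpose.line = list(lty = 1:2)),
       panel = function(...) {
         panel.superpose(...)
         panel.abline(h = hvals[panel.number()], lty = 3)
       })
```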

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RQuantlib Array processing applying EuropeanOptionExampleArray

2006-05-22 Thread Dirk Eddelbuettel

On 22 May 2006 at 10:57, Joe Byers wrote:
| I try and run the EuropeanOption example using atm.work$For_Price as my 
| array of underlying prices and the other inputs from row 9 of atm.work.
| i <- 9;
| x <- EuropeanOption(type = "put", underlying = atm.work$For_Price, strike 
| = atm.work$K[i],
| dividendYield = atm.work$BEY[i], riskFreeRate = 
| atm.work$BEY[i], maturity = atm.work$t_exp[i],
| volatility = atm.work$sigma[i])
| 
| x$parameters has the array of underlying prices but the results is only 
| a single vector using the first row of atm.work$For_Price.  Is this 
| because I am pulling the inputs from data.frame not arrays?

If I understand correctly what you are trying to do, then there may simply be
a misunderstanding on your part.

EuropeanOption(), like all but one [ more on that in the next paragraph ] of
the other RQuantLib functions, expects _scalars_ for all its inputs. But you
seem to expect that it magically vectorises things automatically for you. It
doesn't, and that's why you get errors or unexpected results.

There is however one convenience function -- mostly as a proof of concept --
which unrolls vectors (or sequences) in any of its arguments.  See 

 help(EuropeanOptionArrays)

and

 example(EuropeanOptionArrays)

where the results are shown in terms of two sequences of values for the
underlying and the volatility.

As always, there will be many more clever ways for doing this. Suggestions
and patches welcome.

| Any help is greatly appreciated.

Hope this helps. 

Regards, Dirk

-- 
Hell, there are no rules here - we're trying to accomplish something. 
  -- Thomas A. Edison

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


[R] RMySQL on Mac OS 10.4

2006-05-22 Thread Ryan Hafen
I have been having trouble getting RMySQL to work on Mac OS 10.4.  It  
was working fine for me about a year ago but I didn't touch it for a  
while and several upgrades of R, MySQL, and OS X later, it is not  
working.

I first had the problem that others have been having getting the  
error discussed in https://stat.ethz.ch/pipermail/r-sig-mac/2005- 
November/002435.html saying that RS-DBI driver warning: (MySQL  
mismatch between compiled version 4.0.24 and runtime version 4.1.14).

I was able to get this error message to disappear by compiling MySQL  
with gcc 3.3, and now I can connect to my database and run queries,  
but when I try to use dbWriteTable, I get this error:

> dbWriteTable(con, "test", rnorm(100))
Error in .class1(object) : no direct or inherited method for function  
'dbWriteTable' for this call

For more info, I am running OS X 10.4.6, R 2.2.1, I have compiled  
RMySQL from source version 0.5-7, and I'm running MySQL 4.1.19.

I have no idea what this error means and I have not found anything  
about it anywhere else in the help files.  If anyone knows what I can  
do to fix this, I would really really appreciate it.
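
[One possible explanation, offered as an editorial guess and not confirmed in the thread: dbWriteTable dispatches on a data.frame for its value argument, and rnorm(100) is a plain numeric vector, so no method matches. A sketch:]

```r
## Editorial sketch: wrap the vector in a data.frame so dbWriteTable
## can dispatch on it (the column name 'x' is arbitrary).
dbWriteTable(con, "test", data.frame(x = rnorm(100)))
```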

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] RQuantlib Array processing applying EuropeanOptionExampleArray

2006-05-22 Thread Joe Byers

Dirk,

Thank you.  Shortly after I sent this email off, I discovered that I 
left the Array off the function name.  It worked fine.


My ignorance and oversight.

Thanx for the help.

Joe


Dirk Eddelbuettel wrote:

On 22 May 2006 at 10:57, Joe Byers wrote:
| I try and run the EuropeanOption example using atm.work$For_Price as my 
| array of underlying prices and the other inputs from row 9 of atm.work.

| i <- 9
| x <- EuropeanOption(type = "put", underlying = atm.work$For_Price,
|      strike = atm.work$K[i], dividendYield = atm.work$BEY[i],
|      riskFreeRate = atm.work$BEY[i], maturity = atm.work$t_exp[i],
|      volatility = atm.work$sigma[i])
| 
| x$parameters has the array of underlying prices, but the result is only
| a single value using the first row of atm.work$For_Price.  Is this
| because I am pulling the inputs from a data.frame and not arrays?


If I understand correctly what you are trying to do, then there may simply be
a misunderstanding on your part.

EuropeanOption(), like all but one [ more on that in the next paragraph ] of
the other RQuantLib functions, expects _scalars_ for all its inputs. But you
seem to expect that it magically vectorises things automatically for you. It
doesn't, and that's why you get errors or unexpected results.

There is however one convenience function -- mostly as a proof of concept --
which unrolls vectors (or sequences) in any of its arguments.  See 


 help(EuropeanOptionArrays)

and

 example(EuropeanOptionArrays)

where the results are shown in terms of two sequences of values for the
underlying and the volatility.

As always, there will be many more clever ways of doing this. Suggestions
and patches welcome.

| Any help is greatly appreciated.

Hope this helps. 


Regards, Dirk

  

Re: [R] RMySQL on Mac OS 10.4

2006-05-22 Thread Seth Falcon
Ryan Hafen [EMAIL PROTECTED] writes:
  dbWriteTable(con, "test", rnorm(100))
 Error in .class1(object) : no direct or inherited method for function  
 'dbWriteTable' for this call

 For more info, I am running OS X 10.4.6, R 2.2.1, I have compiled  
 RMySQL from source version 0.5-7, and I'm running MySQL 4.1.19.

 I have no idea what this error means and I have not found anything  
 about it anywhere else in the help files.  If anyone knows what I can  
 do to fix this, I would really really appreciate it.

I think this just means that dbWriteTable doesn't know how to write
vectors; it only knows how to write data frames (according to the man
page).

So you could test:

dbWriteTable(con, "test", data.frame(rng = rnorm(100)))


+ seth
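The dispatch point above can be checked without a database at all. A small base-R illustration (the actual `dbWriteTable()` call is left commented out, since it needs a live MySQL connection):

```r
# Base-R illustration: dbWriteTable() dispatches on a data.frame, so
# wrap the bare vector in a one-column data frame first.
x   <- rnorm(100)
tab <- data.frame(rng = x)

is.data.frame(x)    # FALSE -- a bare vector, hence the dispatch error
is.data.frame(tab)  # TRUE

# With a live connection, the write should then dispatch correctly:
# dbWriteTable(con, "test", tab)
```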



Re: [R] nls: formula error?

2006-05-22 Thread Gabor Grothendieck
The data= argument cannot be a matrix.  See ?nls

On 5/22/06, H. Paul Benton [EMAIL PROTECTED] wrote:
 So thanks for the help,

 I have a matrix (AB) whose first column holds my bin numbers, from -4 to +4
 in 0.1 units. The second column holds the frequency from some
 data. I have plotted them and they look roughly Gaussian, so I want to fit
 them, i.e. find/optimize mu, sigma, and A.

 So I call the nls function :

 nls_AB <- nls(x ~ (A/sig*sqrt(2*pi)) * exp(-1*((x-mu)^2/(2*sig^2))),
     data = temp, start = list(A=0.1, mu=0.01, sig=0.5), trace = TRUE)
 Error in eval(expr, envir, enclos) : numeric 'envir' arg not of length one

 Temp looks like this:
   bin   x
  [1,] -4.0   0
  [2,] -3.9   0
  [3,] -3.8   0
 .etc
 [41,]  0.0 241
 [42,]  0.1 229
 [43,]  0.2 258
 [44,]  0.3 305
 [45,]  0.4 370
 [46,]  0.5 388

 So I don't understand the error message. I looked at doing
 class(fo <- (x ~ (A/sig*sqrt(2*pi)) * exp(-1*((x-mu)^2/(2*sig^2)))))
 terms(fo)

 and that seems to work. So if anyone has any ideas I would welcome them.

 Cheers,



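Gabor's fix can be checked with a self-contained sketch on simulated data: pass `nls()` a data frame, and put the bin column on the right-hand side of the formula (the original formula had `x` on both sides). Note the normalizing constant is absorbed into A here for simplicity, and the true parameter values below are invented for the simulation:

```r
# Self-contained sketch: nls() wants a data frame, and the predictor
# should be the bin column, not x itself.  Simulated Gaussian counts.
set.seed(1)
bin <- seq(-4, 4, by = 0.1)
A <- 300; mu <- 0.3; sig <- 0.6                       # invented truth
x <- A * exp(-(bin - mu)^2 / (2 * sig^2)) + rnorm(length(bin), sd = 2)
temp <- data.frame(bin = bin, x = x)                  # data frame, not matrix

fit <- nls(x ~ A * exp(-(bin - mu)^2 / (2 * sig^2)),
           data = temp, start = list(A = 250, mu = 0, sig = 0.5))
coef(fit)
```

With starting values in the right neighborhood, the fit recovers A, mu, and sig closely.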


[R] Get list of ODBC data sources?

2006-05-22 Thread Jim Porzak
Hello R Helpers,

Before setting up a connection with RODBC, I would like to present my
users with a pick list of ODBC data sources available in their
environment. I may be missing something, but don't see anything in
RODBC itself to return list of sources for use in select.list(). Any
hints?

I'm running R 2.3.0 on Win XP SP2.

-- 
TIA,
Jim Porzak
Loyalty Matrix Inc.
San Francisco, CA

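One possible answer, assuming your RODBC version provides `odbcDataSources()` (which returns a named character vector mapping DSN names to driver descriptions): feed its names to `select.list()`. A guarded sketch that degrades gracefully when RODBC is not installed:

```r
# Sketch, assuming RODBC's odbcDataSources() is available.  It returns a
# named character vector (DSN name -> driver description).
list_dsns <- function() {
  if (!requireNamespace("RODBC", quietly = TRUE)) return(character(0))
  src <- RODBC::odbcDataSources()
  if (is.null(names(src))) character(0) else names(src)
}

dsns <- list_dsns()
# Interactively, one could then offer a pick list:
# choice <- select.list(dsns, title = "ODBC data sources")
# con    <- RODBC::odbcConnect(choice)
```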


[R] Plotting problem-Dendrograms Help!!

2006-05-22 Thread j.joshua thomas
Dear List,

My dataset is as below: I am using library(cluster) hierarchical clustering
on these data.
When I plot these, I can't see the partitions clearly;
please find the plot at this link: http://roughjade.blogspot.com

Could anyone give me some suggestions?
  AAP326 29
 AAW315 37
 AAW321 24
 AAW322 7
 AAW331 22
 ACE381 22
 ACP112 21
 ACP212 24
 ACP251 26
 ACP321 31
 ACW102 39
 ACW112 21
 ACW121 17
 ACW131 7
 ACW212 24
 ACW241 37
 ACW251 26
 ACW261 33
 ACW311 7
 ACW321 31
 ACW342 39
 ACW371 29
 AFP365 24
 AFU366 20
 AFW360 2
 AFW362 40
 AGU610 8
 AGW609 16
 AGW615 1
 AGW617 31
 AGW619 6
 AGW705 32
 AGW707 2
 AGW708 13
 AKP202 16
 AKW101 16
 AKW102 40
 AKW103 16
 AKW104 40
 AKW202 16
 AKW301 33
 AKW302 9
 AMP342 20
 AMP343 2
 AMP344 22
 AMP345 36
 AMP347 40
 AMU643 3
 AMW340 25
 AMW343 2
 AMW344 22
 AOP351 40
 AOP352 32
 AOP353 2
 AOP354 2
 AOP356 13
 AOU655 21
 AOW351 40
 AOW353 2
 APP301 22
 APP373 18
 APP374 13
 APW302 26
 ATP393 32
 ATU396 8
 ATW103 30
 ATW108 39
 ATW222 28
 ATW223 28
 ATW241 35
 ATW251 14
 ATW252 14
 ATW392 5
 BAT213 8
 BAT301 18
 BAT302 20
 BBT213 38
 BBT214 26
 BBT301 40
 BBT303 22
 BET302 18
 BET303 28
 BGT213 24
 BKT302 5
 BMT202 26
 BMT204 21
 BMT302 17
 BOE201 14
 BOI101 28
 BOI102 1
 BOI103 23
 BOI104 21
 BOI109 37
 BOM113 22
 BOM114 31
 BOT205 13
 BPT302 6
 BST203 5
 BST204 24
 BST301 39
 BST304 34
 BTT202 5
 BTT302 32
 BTT303 28
 BVT211 32
 BZT213 34
 BZT302 22
 BZT303 7
 CAM101 38
 CAP101 38
 CAT102 6
 CCS523 14
 CCS542 21
 CCS545 28
 CIT511 7
 CIT512 36
 CIT521 33
 CIT522 21
 CIT531 28
 CMM101 38
 CMM102 36
 CMM201 5
 CMM202 7
 CMM212 38
 CMM301 2
 CMP301 40
 CMT201 38
 CMT202 5
 CMT311 2
 CMT312 7
 CPM101 38
 CPM102 33
 CPM301 2
 CPM302 7
 CPM303 26
 CPP301 40
 CPP302 38
 CPP303 16
 CPS301 18
 CPS304 14
 CPT101 38
 CPT103 13
 CPT104 6
 CPT311 40
 CPT312 38
 CSA401 14
 CSC535 14
 CSI513 33
 CSI533 1
 CST102 33
 CST311 36
 CST312 31
 DTM302 36
 DTM324 3
 DTM325 30
 DTM346 16
 DTM364 6
 FCP552 23
 FCP554 37
 FCP557 25
 FEK101 26
 FEK202 22
 FEK302 24
 FEK403 28
 FFK454 32
 FFK455 23
 FFK456 30
 FIT141 15
 FIT142 3
 FIT143 8
 FIT242 3
 FIT343 26
 FIT344 39
 FKF112 13
 FKF213 8
 FKK171 5
 FTF222 5
 FTF223 32
 FTF323 6
 FTF324 3
 HBT104 28
 HBT105 34
 HBT207 30
 HBT213 5
 HEK212 26
 HEK222 28
 HET123 13
 HET224 2
 HET322 5
 HET323 17
 HET324 23
 HET521 18
 HET522 33
 HGF121 38
 HGF324 22
 HGG353 40
 HGM131 6
 HGM236 32
 HGM237 29
 HGM341 26
 HGT111 36
 HGT219 5
 HGT318 1
 HGW371 3
 HIA101 37
 HIS213 36
 HIS315 16
 HIU225 24
 HKB112 30
 HKB315 26
 HKB316 7
 HKB501 29
 HKB502 2
 HKN101 8
 HKN206 20
 HKN207 13
 HKN208 22
 HKN307 28
 HKT121 15
 HMT124 32
 HMT221 23
 HMT222 34
 HMT228 27
 HMT322 13
 HMT328 39
 HMT504 32
 HMT505 2
 HMT507 16
 HPW102 1
 HSE251 37
 HSI242 24
 HSM111 2
 HSM213 39
 HSM215 28
 HSM316 20
 HSM317 18
 HST121 30
 HST325 26
 HTU211 19
 HTV201 14
 HXE109 21
 HXE110 32
 HXE205 28
 HXE208 24
 HXE302 6
 HXE303 20
 HXE308 36
 IEG100 21
 IEK102 32
 IEK103 27
 IEK106 6
 IEK205 29
 IEK206 35
 IEK208 37
 IEK308 17
 IEK316 23
 IMG103 21
 IMG208 15
 IMG302 38
 IMG304 15
 IMK104 34
 IMK208 17
 IMK209 32
 IMK210 22
 IMK304 13
 IPK101 7
 IPK203 23
 IPK205 13
 IPK216 34
 IPK218 39
 IPK306 5
 IPK314 25
 IPK317 21
 IUK105 14
 IUK204 29
 IUK302 40
 IWK101 38
 IWK102 6
 IWK104 21
 IWK202 5
 IWK203 33
 IWK301 33
 IWK305 14
 KAA502 1
 KAA505 23
 KAE246 21
 KAE345 23
 KAT241 33
 KAT244 32
 KAT341 4
 KEN200 26
 KEN300 36
 KFT131 17
 KFT232 6
 KIE232 20
 KIE356 29
 KIE358 6
 KIT252 26
 KIT254 36
 KIT256 34
 KIT359 20
 KOE321 39
 KOT121 2
 KTE211 36
 KTE311 1
 LAA100 20
 LAA200 28
 LAA300 36
 LAC100 30
 LAC200 32
 LAE100 23
 LAE200 23
 LAE300 1
 LAG100 24
 LAG200 22
 LAG300 28
 LAJ100 8
 LAJ200 8
 LAJ300 36
 LAJ400 35
 LAP100 26
 LAP200 13
 LAS100 30
 LAS200 13
 LAS300 36
 LHP451 37
 LHP452 17
 LHP453 1
 LHP454 30
 LKM100 18
 LKM200 13
 LKM300 9
 LKM400 4
 LLC200 6
 LLC400 39
 LLJ200 38
 LLJ400 15
 LLS307 32
 LPT208 18
 MAA101 30
 MAA111 30
 MAA161 13
 MAT102 1
 MAT111 35
 MAT122 3
 MAT161 30
 MAT181 15
 MAT202 21
 MAT511 29
 MSG162 8
 MSG228 25
 MSG252 29
 MSG253 1
 MSG262 22
 MSG265 32
 MSG283 31
 MSG284 33
 MSG355 24
 MSG356 34
 MSG367 39
 MSG368 35
 MSG388 23
 MSG389 31
 MSS211 33
 MSS301 26
 MSS302 30
 MST565 26
 PGA103 25
 PGA104 40
 PGM261 33
 PGM322 9
 PGM324 38
 PGM333 9
 PGM342 20
 PGM343 34
 PGM363 34
 PGT302 5
 PGT311 30
 PGT313 33
 PGT322 9
 PGT324 38
 PGT325 9
 PGT333 9
 PGT342 20
 PGT343 34
 PGT363 34
 PLG333 17
 PLG516 8
 PLG517 3
 PLG518 37
 PLG526 8
 PLG532 29
 PLG537 13
 PLG542 16
 PLG546 18
 PLG548 22
 PLG551 39
 PLG553 16
 PLG556 22
 PLG558 16
 PLG701 24
 PLG712 16
 PLG721 30
 PLG722 27
 PLG731 37
 PLG732 30
 PPG202 9
 PPG203 40
 PPG310 28
 PPG311 24
 PPG312 39
 PPG313 36
 PPG314 5
 PPG315 7
 PPG316 7
 PPG317 32
 PPG318 16
 PPG319 30
 PPG320 35
 PPG331 26
 PPG333 22
 RAG265 23
 RAG322 13
 RAK343 34
 RAK344 25
 RAK552 21
 RDB314 37
 RDG266 21
 RDG333 34
 REG162 6
 REG261 32
 REG262 6
 REG264 8
 REG363 36
 REG364 1
 RET521 21
 RET522 40
 RET531 13
 RET532 

[R] hclust-Dendrograms-Help!

2006-05-22 Thread j.joshua thomas
Dear List,

My dataset is as below: I am using library(cluster) hierarchical clustering
on these data.
When I plot these, I can't see the partitions clearly:

 > dd <- hclust(dist(DataSetS01022Full), "ave")
 Warning message:
 NAs introduced by coercion
 > (dn <- as.dendrogram(dd))
 > plot(dn)

Please find the plot at this link:
http://roughjade.blogspot.com. I couldn't figure out whether the *design*
of the dataset is the problem.



Could anyone give me some suggestions?
  AAP326 29
 AAW315 37
 AAW321 24
 AAW322 7
 AAW331 22
 ACE381 22
 ACP112 21
 ACP212 24
 ACP251 26
 ACP321 31
 ACW102 39
 ACW112 21
 ACW121 17
 ACW131 7
 ACW212 24
 ACW241 37
 ACW251 26
 ACW261 33
 ACW311 7
 ACW321 31
 ACW342 39
 ACW371 29
 AFP365 24
 AFU366 20
 AFW360 2
 AFW362 40
 AGU610 8
 AGW609 16
 AGW615 1
 AGW617 31
 AGW619 6
 AGW705 32
 AGW707 2
 AGW708 13
 AKP202 16
 AKW101 16
 AKW102 40
 AKW103 16
 AKW104 40
 AKW202 16
 AKW301 33
 AKW302 9
 AMP342 20
 AMP343 2
 AMP344 22
 AMP345 36
 AMP347 40
 AMU643 3
 AMW340 25
 AMW343 2
 AMW344 22
 AOP351 40
 AOP352 32
 AOP353 2
 AOP354 2
 AOP356 13
 AOU655 21
 AOW351 40
 AOW353 2

Thanks in Advance,
-- 
Lecturer J. Joshua Thomas
KDU College Penang Campus
Research Student,
University Sains Malaysia


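The "NAs introduced by coercion" warning above suggests `dist()` is being handed the non-numeric course codes. One hedged sketch of a fix (on a small stand-in for the ~440-row set): move the codes into row names so `dist()` sees only numbers, and they then become readable leaf labels in the dendrogram.

```r
# Keep the course codes as row names so dist() sees only numbers.
df <- data.frame(code  = c("AAP326", "AAW315", "AAW321", "AAW322"),
                 value = c(29, 37, 24, 7))
m <- as.matrix(df["value"])
rownames(m) <- df$code                      # labels carried into the tree

dd <- hclust(dist(m), method = "average")   # "ave" abbreviates "average"
dn <- as.dendrogram(dd)
# plot(dn)   # leaf labels are now the course codes; no coercion warning
```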


[R] Possible with lm() or glm()?

2006-05-22 Thread John Vokey
HelpeRs,

I have what amounts to a stupid design---entirely my fault, but it  
seemed like a good idea at the time.  Every participant in the  
experiment receives a series of a two-alternative forced-choice  
(2afc) pairs generated at random from a pool of items that make up  
the pairs.  The independent variable of interest is the difference in  
the 2afc pairs on a metric measure to predict 2afc accuracy.  For  
each participant, the 2afc pairs are generated at random (without  
replacement) from set of items.  This approach means that for any  
participant there are 40 2afc trials in which the independent  
variable varies at random from differences of 0 to some large  
absolute difference.  The problem is that, because of the random  
construction of the pairs, there is no consistent variation in the IV  
over participants (e.g., some participants contribute a high  
frequency of judgements on small difference pairs, others on large  
difference pairs, etc.).  What I want is the relationship between the  
IV (pair differences on the metric measure) and 2afc accuracy, with  
the variation due to the random assignment of pair differences to  
participants removed.  To do so, I want to compute the residuals  
relating IV differences to 2afc accuracy after participant variation  
(relating to the covariation with absolute difference) is removed.   
So, I don't want to remove participant variation per se, but only  
that arising from the covariation of participant with IV pair  
difference.  Ultimately, I want to plot the psychometric function,  
complete with some variant of confidence intervals.  I hope that is  
clear.  So, is there some approach  within lm() or glm() that will  
provide for this?  Any ideas?  To make it concrete, imagine I have a  
data frame consisting of participant as a factor, absolute pair  
difference within participant as the vector IV and response (correct/ 
incorrect) as the vector DV.  [The actual design is more complex, but  
this simple structure captures my concern.]

--
Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

-Dr. John R. Vokey

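One hedged way to phrase the design described above in `glm()`, sketched on simulated data shaped like the experiment (participant as a factor, absolute pair difference as the IV, binary accuracy as the DV); entering participant as a covariate absorbs its confounded share of the IV variation, leaving the slope on `absdiff` as the psychometric relationship of interest. All names and parameter values below are invented for the simulation, not taken from the actual experiment:

```r
# Hedged sketch (not a full answer): logistic regression of accuracy on
# absolute pair difference, with participant as a factor covariate.
set.seed(42)
n_subj <- 20; n_trials <- 40
dat <- data.frame(
  participant = factor(rep(seq_len(n_subj), each = n_trials)),
  absdiff     = runif(n_subj * n_trials, 0, 4)
)
p <- plogis(-0.5 + 0.8 * dat$absdiff)      # true slope 0.8 (invented)
dat$correct <- rbinom(nrow(dat), 1, p)

fit <- glm(correct ~ participant + absdiff, data = dat, family = binomial)
coef(fit)["absdiff"]                        # recovered slope
confint.default(fit)["absdiff", ]           # Wald confidence interval
```

The fitted curve `plogis()` of the linear predictor, evaluated over a grid of `absdiff`, would give the psychometric function, with Wald intervals as a first cut at confidence bands.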