Re: [R] artificial data matrix with 100000 rows
>>>>> PS == Paul Smith [EMAIL PROTECTED]
>>>>>     on Sun, 9 Sep 2007 12:17:32 +0100 writes:

    PS> On 9/9/07, kevinchang [EMAIL PROTECTED] wrote:
    >> I tried to make the matrix with this size by either matrix() or
    >> array().  However, there seems to be a default limit on the number of
    >> rows shown.  I got a sort of error message from R.  To be specific,
    >>
    >>   m <- matrix(ncol = 3, nrow = 100000)
    >> error message:
    >>   [ reached getOption("max.print") -- omitted 66667 rows ]
    >> or
    >>   a <- array(dim = c(10000, 3, 10))
    >> error message:
    >>   [ reached getOption("max.print") -- omitted 6667 row(s) and 6 matrix slice(s) ]

    PS> That is not an error message, I guess.

Definitely not; thank you, Paul!
Also, the messages were not produced by what Kevin showed (namely
assignments) but rather when he *prints* the contents of his huge
matrix / array.

    PS> When the matrices are huge, R is unable to print them
    PS> totally on the screen, but all data are present.

Not at all unable!  R protects you from accidentally overflowing your
console with a huge amount of nonsensical output.  As the message
above mentions, you should look at

    ?getOption
    ?options

and particularly the 'max.print' option.
Is  '[ reached getOption("max.print") ]'  too difficult to read?
You *can* increase the 'max.print' option as much as you like, and
that's why I said "not at all unable" above.

Regards, Martin

    PS> For instance,
    PS>   m[(nrow(m)-10):nrow(m), ]
    PS>        [,1] [,2] [,3]
    PS>   [1,]   NA   NA   NA
    PS>   [... and 10 more rows of NA ...]

or rather just  tail(m)  or  tail(m, 11)  or  head(m)  or  str(m),
etc. etc.

    PS> See
    PS>   ?getOption

yes indeed.
Martin

    PS> Paul

______________________________________________
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
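A short sketch of the advice above: inspect a large matrix with str(),
head(), and tail() instead of printing it, and raise 'max.print' only
if you really want the full output (the value 99999 is the factory
default, restored afterwards here).

```r
# build a 100000 x 3 matrix comparable to the one in the question
m <- matrix(seq_len(3e5), nrow = 1e5, ncol = 3)

old <- options(max.print = 99999)  # the factory default

str(m)      # one-line structure summary, never floods the console
head(m, 3)  # first 3 rows
tail(m, 3)  # last 3 rows

options(old)  # restore the previous setting
```

Printing `m` itself would show only the first 33333 rows and then the
"[ reached getOption(\"max.print\") ... ]" message discussed above.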
Re: [R] Matlab's lsqnonlin
>>>>> KateM == Katharine Mullen [EMAIL PROTECTED]
>>>>>     on Fri, 7 Sep 2007 20:07:41 +0200 (CEST) writes:

    KateM> The thread you linked to regarding Levenberg-Marquardt's supposed
    KateM> lack of availability is from 2001; it has been possible to get to
    KateM> the MINPACK implementation of Levenberg-Marquardt within R via the
    KateM> package minpack.lm
    KateM> (http://cran.r-project.org/src/contrib/Descriptions/minpack.lm.html)
    KateM> since 2005.

Thanks a lot, Kate.  I'm wondering about experiences:
Do you know of cases where minpack.lm's nls.lm() solved a (real)
problem that nls() would have a problem with?

Beware however -- one of the main things I learned about this field
from Doug Bates, co-author of Bates and Watts and principal author of
S's and R's nls():
It's a *feature* that nls() sometimes does not converge when other
methods falsely claim convergence!

Martin Maechler, ETH Zurich

    KateM> Katharine Mullen
    KateM> mail: Department of Physics and Astronomy, Faculty of Sciences
    KateM>       Vrije Universiteit Amsterdam, de Boelelaan 1081
    KateM>       1081 HV Amsterdam, The Netherlands
    KateM> room: T.1.06
    KateM> tel: +31 205987870
    KateM> fax: +31 205987992
    KateM> e-mail: [EMAIL PROTECTED]
    KateM> homepage: http://www.nat.vu.nl/~kate/

    KateM> On Fri, 7 Sep 2007, Jose Luis Aznarte M. wrote:
    >> Hi!  I'm translating some code from Matlab to R and I found a
    >> problem.  I need to translate Matlab's function 'lsqnonlin'
    >> (http://www-ccs.ucsd.edu/matlab/toolbox/optim/lsqnonlin.html) into R,
    >> and at the beginning I thought it would be the same as R's 'optim'.
    >> But then I looked at the definition of 'lsqnonlin' and I don't quite
    >> see how to make 'optim' do the same thing.  Does anyone have an idea?
    >> This is apart from the fact that I would like to use the
    >> Levenberg-Marquardt algorithm, which is not implemented in R (some
    >> discussion about this:
    >> http://tolstoy.newcastle.edu.au/R/help/00b/2492.html).
    >> Thank you!  All the best,
    >> --
    >> Jose Luis Aznarte M.
    >> http://decsai.ugr.es/~jlaznarte
    >> Department of Computer Science and Artificial Intelligence
    >> Universidad de Granada        Tel. +34 - 958 - 24 04 67
    >> GRANADA (Spain)               Fax: +34 - 958 - 24 00 79
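One way to emulate Matlab's lsqnonlin() with only base R, as the
original poster considered: lsqnonlin takes a residual *vector*, while
optim() minimizes a scalar, so one minimizes the sum of squared
residuals.  A minimal sketch with a made-up exponential model (the
function names and data are illustrative assumptions, not from the
thread):

```r
# residual function in lsqnonlin style: returns the residual vector
resid_fun <- function(p, x, y) y - p[1] * exp(-p[2] * x)

x <- c(0, 1, 2, 3, 4)
y <- 2 * exp(-0.5 * x)          # noise-free data; true parameters c(2, 0.5)

# minimize the sum of squared residuals, starting from c(1, 1)
fit <- optim(par = c(1, 1),
             fn  = function(p) sum(resid_fun(p, x, y)^2),
             method = "BFGS")

fit$par   # close to c(2, 0.5)
```

This reproduces lsqnonlin's least-squares objective but with a
general-purpose optimizer, not Levenberg-Marquardt; for the latter,
minpack.lm::nls.lm() mentioned above takes the residual function
directly.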
Re: [R] integers
>>>>> CH == Christoph Heibl [EMAIL PROTECTED]
>>>>>     on Tue, 4 Sep 2007 11:53:43 +0200 writes:

    CH> Hi list,
    CH> The function is.integer tests if an object is of type integer;
    CH> see e.g.:

    CH>   is.integer(12)  # FALSE
    CH>   is.real(12)     # TRUE
    CH>   mode(12)        # "numeric"

    CH> But how can I test if a number is actually an integer?

Something like
    round(x) == x
is often good enough; maybe
    x %% 1 == 0
seems a bit more efficient.
Note that both return NA whenever x[] is NA, so they may not directly
be appropriate for your use case.

    CH> R seek is difficult to search in this case because it mainly yields
    CH> entries about the integer() function family.

R seek???  Do you mean the R function RSiteSearch() which goes to
'http://search.r-project.org/'?  Well, calling
RSiteSearch("integer number") gives almost 3000 hits, *but* number 10
is exactly relevant to your question...

    CH> Thanks for any hint!
    CH> Christoph Heibl
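The two tests suggested above can be wrapped in a small helper; a
sketch (the name is.wholenumber and the tolerance are assumptions, not
from the thread) that distinguishes "stored as type integer" from "is
a whole number":

```r
# test whether a numeric value is a whole number, up to rounding error
is.wholenumber <- function(x, tol = .Machine$double.eps^0.5) {
  abs(x - round(x)) < tol
}

is.integer(12)         # FALSE: the literal 12 has type "double"
is.wholenumber(12)     # TRUE
is.wholenumber(12.34)  # FALSE
12 %% 1 == 0           # TRUE, the modulo variant from the reply
is.wholenumber(NA)     # NA, as the reply warns
```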
Re: [R] Incomplete Gamma function
>>>>> AN == Anup Nandialath [EMAIL PROTECTED]
>>>>>     on Fri, 31 Aug 2007 13:15:08 -0700 (PDT) writes:

    AN> Hi Kris,
    AN> You just need to understand the mathematics of the incomplete gamma
    AN> function and the various relationships it has.  The answers from
    AN> both Mathematica and R are correct, except that they are giving you
    AN> different estimated quantities.  It depends on the way the gamma
    AN> function is written.  For instance, in R, to get the equivalent
    AN> result from Mathematica you should do the following:

    AN>   answer <- gamma(9) - Igamma(9, 11.1)

    AN> This will give you the incomplete gamma for (9, 11.1) as given by
    AN> Mathematica.  You can read more about the model and I am sure you
    AN> will figure it out.

Yes, and then, *PLEASE* do as Brian Ripley suggested, and understand
that the (normalized) incomplete gamma function is the same as the
gamma distribution function and has been available in S and R forever
__and__ in R has been very thoroughly tested, and quite a few extreme
cases have been made to work more accurately, etc. etc.:  pgamma() !

BTW: The same applies to the incomplete beta function, which -- in one
of its equivalent forms -- is called the beta distribution function in
probability and statistics, and has been available in S and R forever;
for R, it has been very carefully tested and for extreme border cases
gradually improved over the years, most recently for the upcoming
R 2.6.0, where the precision of pbeta(*, log=TRUE) has been
dramatically improved in some extreme tail/range cases -- which
benefits pt(), pf(), pbinom() etc. in equivalent situations.

Martin Maechler, ETH Zurich and R-core

    AN> Anup
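A numerical check of the relation stated above (Igamma is the
unnormalized lower incomplete gamma function assumed in the quoted
reply; Mathematica's Gamma[a, x] is the upper one): the upper
incomplete gamma function is just gamma(a) times the upper tail of
pgamma().

```r
a <- 9; x <- 11.1

# upper incomplete gamma via R's gamma distribution function
upper <- gamma(a) * pgamma(x, a, lower.tail = FALSE)

# check against direct numerical integration of t^(a-1) * exp(-t) on [x, Inf)
chk <- integrate(function(t) t^(a - 1) * exp(-t), lower = x, upper = Inf)$value

all.equal(upper, chk)   # TRUE (up to integration tolerance)
```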
[R] R mailinglist ETH website now with trusted certificate
Some of you, notably I think users of MS Internet Explorer, may be
happy to learn that since yesterday, the web server of the R mailing
lists,

    https://stat.ethz.ch/

runs an "official" / trusted certificate, as opposed to the unofficial
one (Math Department, ETH) that we have had for years.

In particular, this should make access to the (first, but by far not
only) mailing list archives more convenient to you, e.g., for this
month, for R-help:

    https://stat.ethz.ch/pipermail/r-help/2007-August/thread.html

In parallel, the R Foundation is getting (buying) certificates for
several R-project.org servers, notably also the subversion (R source)
server, and these will hopefully be put in place within the next few
weeks.

Martin Maechler, ETH Zurich
R core
Re: [R] library(fCalendar) timeDate("12.03.2005", format="%d.%m.%Y")
>>>>> OL == Ola Lindqvist [EMAIL PROTECTED]
>>>>>     on Tue, 21 Aug 2007 14:32:19 +0200 writes:

    OL> Thanks!
    OL> Seems to work fine now!

Well, for your example.  But sorry to say, the patch breaks other
cases.  I'm investigating further (and will hopefully contribute to a
new CRAN release of fCalendar once Diethelm Wuertz is back from
wherever; I've already made more changes).

Martin Maechler, ETH Zurich
[but a different department than D. Wuertz]

    OL> Martin Becker wrote:
    >> Dear Ola,
    >> I think you spotted a small bug in *package* fCalendar.  Explicit
    >> specification should prevent autodetection of the date format, which
    >> is not the case for fCalendar v251.70; instead autodetection is done
    >> at least once (twice, if actually appropriate).  With the following
    >> patch, things should work ok:
    >>
    >>   diff --recursive fCalendar.orig/R/3A-TimeDateClass.R fCalendar/R/3A-TimeDateClass.R
    >>   433c433
    >>   <   charvec = format(strptime(charvec, .whichFormat(charvec)), isoFormat)
    >>   ---
    >>   >   charvec = format(strptime(charvec, format), isoFormat)
    >>
    >> You did not provide the output of sessionInfo() (which you are asked
    >> for in the posting guide).  If you are using Windows and don't know
    >> how to apply the patch, you can download a patched binary version
    >> here: http://www.saar-gate.net/download/fCalendar_251.70.zip
    >> Regards,
    >> Martin
    >>
    >> PS: Maybe r-sig-finance is more appropriate for questions concerning
    >> Rmetrics.
    >>
    >> Ola Lindqvist wrote:
    >>> Dear R users,
    >>> I have a problem with the library fCalendar.  I am not using the US
    >>> standard format notations.  It seems like it is not possible to use
    >>> a different format than the US standards.  Does anyone know a way to
    >>> get around this problem?
    >>> Here is the code I enter:
    >>>
    >>>   myDate = "12.03.2005"
    >>>   timeDate(myDate, format = "%d.%m.%Y")
    >>>
    >>> And I get the following error message:
    >>>
    >>>   Error in if (sum(lt$sec + lt$min + lt$hour) == 0) isoFormat = "%Y-%m-%d" :
    >>>     missing value where TRUE/FALSE needed
    >>>
    >>> Thanks,
    >>> Ola
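For comparison (not fCalendar-specific): the same non-US date string
is handled by base R's as.Date()/strptime() when an explicit format is
given, which is exactly the behavior the patch above restores for
timeDate().

```r
myDate <- "12.03.2005"

# explicit format string: no autodetection involved
d <- as.Date(myDate, format = "%d.%m.%Y")

format(d, "%Y-%m-%d")   # "2005-03-12"
```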
Re: [R] Any parser generator / code assistance for R?
>>>>> A- == Ali - [EMAIL PROTECTED]
>>>>>     on Sat, 18 Aug 2007 00:40:52 +0100 writes:

    A-> Hi,
    A-> Is there any parser generator like www.antlr.org?  Moreover, how
    A-> does simple code assistance work currently in R?  By 'simple code
    A-> assistance' I meant things like:

    A->   Object$<TAB>  -->  Object$Method

If you really meant a list with components or an S4 object with slots,
such code completion works at least since R 2.5.1, because of the
recent 'rcompletion' extensions of Deepayan Sarkar, and of course in
ESS (Emacs Speaks Statistics), and I think in several other
GUIs/environments as well.

But if you are thinking OOP as in Java or C++ (and I think you *are*
thinking along that way), then rather learn that S (and hence R) does
OOP in a function-centric rather than class-centric way -- something
which seems to be quite hard to grasp for many who have been brought
up in Java-like schools.

If you are still interested in R, look out for documents with "S4" (or
"formal methods and classes") and "R" in the title ;-)

Regards, Martin
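A minimal sketch of the function-centric OOP mentioned above, using S4
(formal) classes: methods belong to generic *functions*, not to
classes.  The class and generic names here ("Account", "deposit") are
illustrative assumptions, not from the thread.

```r
library(methods)

# define a formal class with one slot
setClass("Account", representation(balance = "numeric"))

# the generic function owns the dispatch, not the class
setGeneric("deposit", function(acct, amount) standardGeneric("deposit"))

setMethod("deposit", "Account", function(acct, amount) {
  acct@balance <- acct@balance + amount
  acct                      # R objects are copied, so return the new one
})

a <- new("Account", balance = 10)
a <- deposit(a, 5)
a@balance   # 15
```

Note the contrast with `a.deposit(5)` in class-centric languages: here
`deposit(a, 5)` dispatches on the class of its argument.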
Re: [R] function to find coodinates in an array
>>>>> GaGr == Gabor Grothendieck [EMAIL PROTECTED]
>>>>>     on Thu, 16 Aug 2007 23:46:28 -0400 writes:

    GaGr> Get the indices using expand.grid and then reorder them:

    GaGr>   set.seed(1); X <- array(rnorm(24), 2:4)  # input
    GaGr>   X                                        # look at X
    GaGr>   do.call(expand.grid, sapply(dim(X), seq))[order(X), ]

Excellent, Gabor!
Definitely the nicest of the solutions so far!

    GaGr> On 8/16/07, Ana Conesa [EMAIL PROTECTED] wrote:
    >> Dear list,
    >> I am looking for a function/way to get the array coordinates of given
    >> elements in an array.  What I mean is the following:
    >> - Let X be a 3D array
    >> - I find the ordering of the elements of X by  ord <- order(X)
    >>   (this returns me a vector)
    >> - I now want to find the x, y, z coordinates of each element of ord
    >> Can anyone help me?  Thanks!
    >> Ana
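The expand.grid idiom above works because expand.grid enumerates
indices in column-major order, exactly matching R's linear array
indexing; so row i of the reordered grid holds the coordinates of the
i-th smallest element.  A sketch with a built-in check:

```r
set.seed(1)
X <- array(rnorm(24), dim = 2:4)

# coordinates of the elements of X, from smallest to largest
coords <- do.call(expand.grid, lapply(dim(X), seq_len))[order(X), ]

# verify: the first coordinate row must point at min(X)
first <- as.numeric(coords[1, ])
stopifnot(X[first[1], first[2], first[3]] == min(X))
```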
Re: [R] matching elements from two vectors
>>>>> GF == Gonçalo Ferraz [EMAIL PROTECTED]
>>>>>     on Fri, 17 Aug 2007 11:03:09 -0400 writes:

    GF> Thanks much.  x %in% y  works.

Yes, indeed.  But I wonder a bit why nobody remarked on the facts that

 1) you use "match" in the subject of your e-mail;
 2) match() is the basic ingredient you need:
    '%in%' is trivially defined via match().
    Type  `%in%`  in your R session.

Regards, Martin

    GF> On 8/17/07, jim holtman [EMAIL PROTECTED] wrote:
    >> Also if you want all the matches:
    >>
    >>   x[x %in% y]
    >>   [1] 2 3 3 3
    >>
    >> On 8/17/07, Gonçalo Ferraz [EMAIL PROTECTED] wrote:
    >>> Hi,
    >>> Imagine a vector x with elements (1,2,1,1,3,5,3,3,1) and a vector y
    >>> with elements (2,3).  I need to find out which elements of x match
    >>> any of the elements of y.  Is there a simple command that will
    >>> return a vector with elements (F,T,F,F,T,F,T,T,F)?  Ideally, I would
    >>> like a solution that works with data frame columns as well.  I have
    >>> tried x==y and it doesn't work; x==any(y) doesn't work either.  I
    >>> realize I could write a for loop and go through each element of y
    >>> asking if it matches any element of x, but isn't there a shorter
    >>> way?
    >>> Thanks, Gonçalo
    >>
    >> --
    >> Jim Holtman
    >> Cincinnati, OH
    >> +1 513 646 9390
    >> What is the problem you are trying to solve?
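A sketch of the point made above: `%in%` is just a convenience wrapper
around match(), so the two spellings give identical results.

```r
x <- c(1, 2, 1, 1, 3, 5, 3, 3, 1)
y <- c(2, 3)

a <- x %in% y
b <- match(x, y, nomatch = 0L) > 0L   # essentially the definition of %in%

identical(a, b)   # TRUE
a                 # FALSE TRUE FALSE FALSE TRUE FALSE TRUE TRUE FALSE
x[a]              # 2 3 3 3, the matches themselves
```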
Re: [R] ungültige Versionsspezifikation
>>>>> JK == John Kane [EMAIL PROTECTED]
>>>>>     on Wed, 15 Aug 2007 08:55:59 -0400 (EDT) writes:

    JK> I think we need more information about your system.
    JK> Please run
    JK>   sessionInfo()
    JK> and include the information in another posting.

Yes, indeed.  However, R version 2.3.1 seems a bit too old to fit a
current version of cairo (rather 'cairoDevice'??).
And BTW: It is a *package*, not a library!!!

Martin Maechler, ETH Zurich

    JK> --- Mag. Ferri Leberl [EMAIL PROTECTED] wrote:
    >> Dear everybody,
    >> Excuse me if this question is trivial; however, I have looked for an
    >> answer for quite a while and therefore dare placing it here.
    >> I want to export .svg files and got the advice here to employ the
    >> cairo library.  I downloaded the *current* version and expanded it to
    >> /usr/local/lib/R/site-library.
    >>   library(cairo)
    >> returned
    >>   ungültige Versionsspezifikation
    >> (i.e., INVALID VERSION SPECIFICATION).  I got some answer here
    >> (EPM39) but, stupidly enough, I could not apply it, as I don't know
    >> where to look for the variable named there.  The R version I employ
    >> is 2.3.1.  Can anybody solve the (probably simple) problem?
    >> Thank you in advance.  Yours, Ferri
Re: [R] Mixture of Normals with Large Data
>>>>> BertG == Bert Gunter [EMAIL PROTECTED]
>>>>>     on Tue, 7 Aug 2007 16:18:18 -0700 writes:

    TV> Have you considered the situation of wanting to characterize
    TV> probability densities of prevalence estimates based on a complex
    TV> random sample of some large population?

    BertG> No -- and I stand by my statement.  The empirical distribution of
    BertG> the data themselves is the best characterization of the density.
    BertG> You and others are free to disagree.

I do agree with you, Bert.  From a practical point of view, however,
you'd still want to use an approximation to the data ECDF, since the
full ecdf is just too large an object to handle conveniently.  One
simple, quite small, and probably sufficient such approximation may be
using the equivalent of

    quantile(x, probs = (0:1000)/1000)

which is pretty much related to just working with a binned version of
the original data -- something others have proposed as well.

Martin

    BertG> On 8/7/07, Bert Gunter [EMAIL PROTECTED] wrote:
    >> Why would anyone want to fit a mixture of normals with 110 million
    >> observations??  Any questions about the distribution that you would
    >> care to ask can be answered directly from the data.  Of course, any
    >> test of normality (or anything else) would be rejected.  More to the
    >> point, the data are certainly not a random sample of anything.  There
    >> will be all kinds of systematic nonrandom structure in them.  This is
    >> clearly a situation where the researcher needs to think more
    >> carefully about the substantive questions of interest and how the
    >> data may shed light on them, instead of arbitrarily and perhaps
    >> reflexively throwing some silly statistical methodology at them.
    >>
    >> Bert Gunter
    >> Genentech Nonclinical Statistics
    >>
    >> -----Original Message-----
    >> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Tim Victor
    >> Sent: Tuesday, August 07, 2007 3:02 PM
    >> To: r-help@stat.math.ethz.ch
    >> Subject: Re: [R] Mixture of Normals with Large Data
    >>
    >> I wasn't aware of this literature, thanks for the references.
    >> On 8/5/07, RAVI VARADHAN [EMAIL PROTECTED] wrote:
    >>> Another possibility is to use data squashing methods.  Relevant
    >>> papers are: (1) DuMouchel et al. (1999), (2) Madigan et al. (2002),
    >>> and (3) Owen (1999).
    >>> Ravi.
    >>>
    >>> Ravi Varadhan, Ph.D.
    >>> Assistant Professor, Division of Geriatric Medicine and Gerontology
    >>> School of Medicine, Johns Hopkins University
    >>> Ph. (410) 502-2619
    >>> email: [EMAIL PROTECTED]
    >>>
    >>> ----- Original Message -----
    >>> From: Charles C. Berry [EMAIL PROTECTED]
    >>> Date: Saturday, August 4, 2007 8:01 pm
    >>> Subject: Re: [R] Mixture of Normals with Large Data
    >>>
    >>>> On Sat, 4 Aug 2007, Tim Victor wrote:
    >>>>> All: I am trying to fit a mixture of 2 normals with 110 million
    >>>>> observations.  I am running R 2.5.1 on a box with 1 GB RAM running
    >>>>> 32-bit Windows, and I continue to run out of memory.  Does anyone
    >>>>> have any suggestions?
    >>>>
    >>>> If the first few million observations can be regarded as a SRS of
    >>>> the rest, then just use them.  Or read in blocks of a convenient
    >>>> size and sample some observations from each block.  You can repeat
    >>>> this process a few times to see if the results are sufficiently
    >>>> accurate.
    >>>>
    >>>> Otherwise, read in blocks of a convenient size (perhaps 1 million
    >>>> observations at a time), quantize the data to a manageable number
    >>>> of intervals -- maybe a few thousand -- and tabulate it.  Add the
    >>>> counts over all the blocks.  Then use mle() to fit a multinomial
    >>>> likelihood whose probabilities are the masses associated with each
    >>>> bin under a mixture of normals law.
    >>>>
    >>>> Chuck
    >>>>
    >>>>> Thanks so much, Tim
    >>>>
    >>>> Charles C. Berry                    (858) 534-2098
    >>>> Dept of Family/Preventive Medicine
    >>>> UC San Diego
    >>>> La Jolla, San Diego 92093-0901
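The compressed-ECDF idea suggested above -- summarize a huge sample by
~1000 quantiles instead of keeping every observation -- can be
sketched as follows (the sample here is a small stand-in for the
110 million observations discussed):

```r
set.seed(42)
x <- rnorm(1e5)                         # stand-in for a much larger sample

# ~1000-point summary of the empirical distribution
qs <- quantile(x, probs = (0:1000) / 1000)

length(qs)        # 1001 numbers instead of 100000
all(diff(qs) >= 0)  # TRUE: a valid (nondecreasing) quantile function

# any individual quantile is recovered exactly from the grid:
qs[501]           # same value as quantile(x, 0.5)
```

Downstream fitting (e.g. the binned multinomial likelihood Chuck
describes) can then work from this summary rather than the raw data.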
Re: [R] deriv for psigamma
>>>>> UweL == Uwe Ligges [EMAIL PROTECTED]
>>>>>     on Tue, 31 Jul 2007 12:13:45 +0200 writes:

    UweL> francogrex wrote:
    >> Hi, 2 questions: [...]
    >> Question 2:
    >>
    >>   deriv(~ gamma(x), "x")
    >>   expression({
    >>       .expr1 <- gamma(x)
    >>       .value <- .expr1
    >>       .grad <- array(0, c(length(.value), 1), list(NULL, c("x")))
    >>       .grad[, "x"] <- .expr1 * psigamma(x)
    >>       attr(.value, "gradient") <- .grad
    >>       .value
    >>   })
    >>
    >> BUT
    >>
    >>   deriv3(~ gamma(x), "x")
    >>   Error in deriv3.formula(~gamma(x), "x") :
    >>     Function 'psigamma' is not in the derivatives table
    >>
    >> What I want is the expression for the second derivative (which I
    >> believe is trigamma(x), or psigamma(x, 1)); how can I obtain that?

    UweL> By using some algebraic software (rather than a numeric one), or
    UweL> by contributing complete derivatives tables for the next R
    UweL> release.

Yes, but for the present case, one could argue that the R internal
code -- which knows

    d/dx lgamma(x) = psi(x) = digamma(x) = psigamma(x, 0)

-- should easily be enhanced to also know

    d/dx psigamma(x, n) = psigamma(x, n+1)

and consequently (but maybe with an extra clause)

    d/dx psigamma(x) = psigamma(x, 1)

The code is in R*/src/main/deriv.c, and patches which implement the
above (and there are a few more 'FIXME's there ... ;-) against
https://svn.r-project.org/R/trunk/src/main/deriv.c are welcome --
after useR!2007.

Martin Maechler, ETH Zurich
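The derivative relations quoted above can be checked numerically with
central differences; a sketch:

```r
#   d/dx lgamma(x)  = digamma(x) = psigamma(x, 0)
#   d/dx digamma(x) = trigamma(x) = psigamma(x, 1)
x <- 2.5
h <- 1e-6

num_d_lgamma  <- (lgamma(x + h)  - lgamma(x - h))  / (2 * h)
num_d_digamma <- (digamma(x + h) - digamma(x - h)) / (2 * h)

all.equal(num_d_lgamma,  digamma(x),  tolerance = 1e-6)   # TRUE
all.equal(num_d_digamma, trigamma(x), tolerance = 1e-6)   # TRUE
all.equal(trigamma(x),   psigamma(x, 1))                  # TRUE
```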
Re: [R] Slightly OT - use of R
>>>>> BDR == Prof Brian Ripley [EMAIL PROTECTED]
>>>>>     on Mon, 30 Jul 2007 11:13:47 +0100 (BST) writes:

    BDR> On Mon, 30 Jul 2007, [EMAIL PROTECTED] wrote:
    >> On 30-Jul-07 08:28:15, John Logsdon wrote:
    >>> I am trying to get a measure of how R compares in usage as a
    >>> statistical platform compared to other software.  I would guess it
    >>> is the most widely used among statisticians, at least by virtue of
    >>> it being open source.

    BDR> I don't think that is the main reason.  Most of the R users I know
    BDR> migrated from commercial statistical software for reasons other
    BDR> than cost.  (Cross-platform availability has been one major
    BDR> reason.)

Much of this is true here (Switzerland) as well.
{And some have *not* migrated because R is Free Software, but that's
really another story.}
Note however that the (non-PhD-graduate) students we teach here would
not be urged to use R if it were not for the combination of its
quality and its Free Software state.  And I have had several
acquaintances who only started using R because they could get it so
easily and quickly, and they have changed to using R as their main
computational/statistical software tool.

    >> But is there any study to which I can refer?  By asking this list I
    >> am not exactly adopting a rigorous approach!  I don't know about
    >> that -- my own expectation would be that serious users of R are
    >> likely to be subscribers to the list.  So maybe a good answer to your
    >> question would be the number of subscribers (which I'm sure Martin
    >> Maechler can find out).  Of course, some people will have subscribed
    >> under more than one email address, so that would somewhat
    >> over-estimate the number of people who subscribe.  But it can be
    >> traded off (to a somewhat unknown extent) against R users who do not
    >> subscribe.

    BDR> I think it would be a seriously biased estimate.
    BDR> Few of our hundreds of student users will be subscribed to R-help
    BDR> (since their first port of call for help is local).
    BDR> Also, we get quite a lot of postings via the gmane and nabble
    BDR> gateways.

Yes, yes, yes.
The exact same situation here, and I'd believe in many places.  And
the problem with the bias ('factor' rather than 'offset', I'd say) is
that it has been changing over time -- I'd guess increasing pretty
dramatically.  My very wild subjective guess would be that

    #{statisticians seriously using R} / #{R-help subscribers}  =  N_t / n_t

is nowadays well over 20, maybe even over 100, of course depending on
the definition of the numerator N_t.
I could construct a very accurate time series for n_t, but since I
agree with Brian, I haven't done so for several years.
Note that  n_{t = 2007-07-30, 07:00} = 5559.

    >> More to the point, though, is what you mean by "usage".  If you
    >> simply mean people who use, that's a matter of counting (one way or
    >> another).  But there's use and use.

    BDR> Indeed.

amen - Martin
[R] options(OutDec) etc {was Strange warning in summary.lm}
Without going into your details, I think options() should NEVER
influence what as.numeric() does (which I think you are indirectly
suggesting it should).

In my eyes, using a decimal comma instead of a decimal point in
scientific computing is an abomination in itself.  Providing an option
for *output* is one thing, but having it influence basic R engine
functions like as.numeric(), format(), ... is an absolute NO!_NO! for
me.  So I am strongly opposed to an 'InDec' option, as was mentioned
earlier in this thread.

We could consider adding a 'dec' (or 'decimal.sep') *argument* to some
R functions, but such an argument must default to "." rather than yet
another option.  R should remain as *functional* as possible, i.e.,
options() should be used sparingly.  They should only influence
printed output at most, not computations per se.  Of course I know
that this is not strictly possible, since the computations can use
textConnection() etc.

Martin Maechler, ETH Zurich

>>>>> OT == ONKELINX, Thierry [EMAIL PROTECTED]
>>>>>     on Wed, 25 Jul 2007 11:04:43 +0200 writes:

    OT> Dear Peter, Uwe and Brian,
    OT> I've found some more problems with options(OutDec = ",").

    OT> 1) as.numeric yields NA where it shouldn't:

    OT>   z <- c("12", "12,34", "12.34")
    OT>   options(OutDec = ",")
    OT>   as.numeric(z)
    OT>   [1] 12,00    NA 12,34
    OT>   Warning message:
    OT>   NAs introduced by coercion in: as.double.default(z)
    OT>   # should result in c(12, 12.34, NA)

    OT>   options(OutDec = ".")
    OT>   as.numeric(z)
    OT>   [1] 12.00    NA 12.34
    OT>   Warning message:
    OT>   NAs introduced by coercion in: as.double.default(z)

    OT> 2) anova yields the same warning as summary:

    OT>   x <- runif(100)
    OT>   y <- rnorm(100)
    OT>   options(OutDec = ",")
    OT>   summary(lm(y ~ x))

    OT>   Call:
    OT>   lm(formula = y ~ x)

    OT>   Residuals:
    OT>        Min       1Q   Median       3Q      Max
    OT>   -2,81744 -0,61680  0,02107  0,66309  2,20599

    OT>   Coefficients:
    OT>                Estimate Std. Error t value Pr(>|t|)
    OT>   (Intercept) -0,073531   0,195880  -0,375    0,708
    OT>   x            0,007519   0,318159   0,024    0,981

    OT>   Residual standard error: 0,9795 on 98 degrees of freedom
    OT>   Multiple R-Squared: 5.699e-06,  Adjusted R-squared: -0.0102
    OT>   F-statistic: 0.0005585 on 1 and 98 DF,  p-value: 0,9812

    OT>   Warning message:
    OT>   NAs introduced by coercion in: as.double.default(Cf[okP])

    OT>   anova(lm(y ~ x))
    OT>   Analysis of Variance Table
    OT>   Response: y
    OT>             Df Sum Sq Mean Sq F value Pr(>F)
    OT>   x          1  0,001   0,001   6e-04 0,9812
    OT>   Residuals 98 94,031   0,960

    OT>   Warning message:
    OT>   NAs introduced by coercion in: as.double.default(Cf[okP])

    OT> Cheers,
    OT> Thierry

    OT> ir. Thierry Onkelinx
    OT> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
    OT> Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance
    OT> Gaverstraat 4, 9500 Geraardsbergen, Belgium
    OT> tel. +32 54/436 185
    OT> [EMAIL PROTECTED]
    OT> www.inbo.be

    OT> "Do not put your faith in what statistics say until you have
    OT> carefully considered what they do not say."  ~ William W. Watt
    OT> "A statistical analysis, properly conducted, is a delicate
    OT> dissection of uncertainties, a surgery of suppositions."
    OT> ~ M.J. Moroney

    >> -----Original message-----
    >> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On behalf of ONKELINX, Thierry
    >> Sent: Thursday, 19 July 2007 13:56
    >> To: Peter Dalgaard
    >> CC: r-help@stat.math.ethz.ch; Uwe Ligges
    >> Subject: Re: [R] Strange warning in summary.lm
    >>
    >> Dear Peter,
    >> Here's an example.  Notice the warning in the last two lines of the
    >> summary with options(OutDec = ",").  It's not present with
    >> options(OutDec = ".").
    >> Cheers, Thierry
    >>
    >>   x <- runif(100)
    >>   y <- rnorm(100)
    >>   options(OutDec = ",")
    >>   summary(lm(y ~ x))
    >>
    >>   Call:
    >>   lm(formula = y ~ x)
    >>
    >>   Residuals:
    >>         Min        1Q    Median        3Q       Max
    >>   -2,389749 -0,607002  0,006969  0,689535  1,713197
    >>
    >>   Coefficients:
    >>               Estimate Std. Error t value Pr(>|t|)
    >>   (Intercept)  0,03397    0,17774   0,191    0,849
    >>   x           -0,09219    0,29518  -0,312    0,755
    >>
    >>   Residual standard error: 0,868 on 98 degrees of freedom
    >>   Multiple R-Squared: 0.0009943,  Adjusted R-squared: -0.0092
    >>   F-statistic: 0.09754 on 1 and 98 DF,  p-value: 0,7555
    >>
    >>   Warning message:
    >>   NAs introduced by coercion in: as.double.default(Cf[okP])
    >>
    >>   options(OutDec = ".")
    >>   summary(lm(y ~ x))
    >>
    >>   Call:
    >>   lm(formula = y ~ x)
    >>
    >>   Residuals:
    >>         Min        1Q    Median        3Q       Max
    >>   -2.389749 -0.607002  0.006969  0.689535  1.713197
    >>
    >>   Coefficients:
    >>               Estimate Std. Error t value Pr(>|t|)
    >>   (Intercept)  0.03397
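In the spirit of the 'dec' *argument* advocated above (and note that
read.table() and scan() already have such a `dec` argument for input):
decimal-comma strings can be converted explicitly, without touching
any global option.  The helper name below is an assumption for
illustration.

```r
z <- c("12", "12,34", "3,5e2")

# explicit conversion: swap the decimal separator, then parse
as_numeric_dec <- function(x, dec = ",") {
  as.numeric(sub(dec, ".", x, fixed = TRUE))
}

as_numeric_dec(z)   # 12.00  12.34 350.00
```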
Re: [R] Computing the rank of a matrix.
>>>>> RV == RAVI VARADHAN [EMAIL PROTECTED]
>>>>>     on Sat, 07 Apr 2007 18:39:36 -0400 writes:

this is a bit of a late reply... better late than never

    RV> Hi Martin,

Hi Ravi, thanks for your research.

    RV> I played a bit with rankMat.  I will first state the conclusions of
    RV> my numerical experiments and then present the results to support
    RV> them:

    RV> 1. I don't believe that rankMat, or equivalently Matlab's rank
    RV> computation, is necessarily better than qr(.)$rank or
    RV> qr(., LAPACK=TRUE)$rank.  In fact, for the notorious Hilbert matrix,
    RV> rankMat can give poor estimates of rank.

    RV> 2. There exists no universally optimal (i.e. optimal for all A) tol
    RV> in qr(A, tol)$rank that would be optimally close to rankMat.  The
    RV> discrepancy in the ranks computed by qr(A)$rank and rankMat(A)
    RV> obviously depends on the matrix A.  This is evident from the tol
    RV> used in rankMat:

    RV>   tol <- max(d) * .Machine$double.eps * abs(singValA[1])

    RV> So, this value of tol in qr will also minimize the discrepancy.

    RV> 3. The tol parameter is irrelevant in qr(A, tol, LAPACK=TRUE)$rank,
    RV> i.e. LAPACK doesn't seem to utilize the tol parameter when computing
    RV> the rank.  However, qr(A, tol, LAPACK=FALSE)$rank does depend on
    RV> tol.

Yes, indeed!  The help page of qr() actually says so {at least now; I
don't know about 3 months ago}.

    RV> Now, here are the results:
    RV> 1.
    RV>   hilbert.rank <- matrix(NA, 20, 3)
    RV>   hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
    RV>   for (i in 1:20) {
    RV>       himat <- hilbert(i)
    RV>       hilbert.rank[i, 1] <- rankMat(himat)
    RV>       hilbert.rank[i, 2] <- qr(himat, tol = 1.0e-14)$rank
    RV>       hilbert.rank[i, 3] <- qr(himat, tol = .Machine$double.eps, LAPACK = TRUE)$rank
    RV>   }
    RV>   hilbert.rank
    RV>         [,1] [,2] [,3]
    RV>    [1,]    1    1    1
    RV>    [2,]    2    2    2
    RV>    [3,]    3    3    3
    RV>    [4,]    4    4    4
    RV>    [5,]    5    5    5
    RV>    [6,]    6    6    6
    RV>    [7,]    7    7    7
    RV>    [8,]    8    8    8
    RV>    [9,]    9    9    9
    RV>   [10,]   10   10   10
    RV>   [11,]   10   11   11
    RV>   [12,]   11   12   12
    RV>   [13,]   11   12   13
    RV>   [14,]   11   13   14
    RV>   [15,]   12   13   15
    RV>   [16,]   12   15   16
    RV>   [17,]   12   16   17
    RV>   [18,]   12   16   18
    RV>   [19,]   13   17   19
    RV>   [20,]   13   17   20

    RV> We see that rankMat underestimates the rank for n > 10, but
    RV> qr(himat, tol = .Machine$double.eps, LAPACK=TRUE)$rank gets it
    RV> right.

Yes, indeed; and that seems a bit against the ``common knowledge''
that svd() is more reliable than qr().  Hmm... I'm still baffled a
bit.

Note that with the Hilbert matrices, one might argue that hilbert(20)
might really not have a correctly estimated rank of 20, but at least
for hilbert(13) or so, the rank should be == n.

BTW, there's a nice plot, slightly related to this problem, using
rcond() from the Matrix package:

    library(Matrix)
    hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
    rcHilb <- sapply(1:20, function(n) rcond(Matrix(hilbert(n))))
    plot(rcHilb, xlab = "n", log = "y", col = 2, type = "b",
         main = "reciprocal condition numbers of Hilbert(n)")

where one sees that the LAPACK code that underlies Matrix::rcond()
also gets into problems when estimating the condition number of
hilbert(n) for n around 16.

I think if we wanted to make real progress here, we'd have to consult
with numerical analysis specialists.  But for me the topic is too
remote to be followed up further at the moment.
One conclusion for me is that to estimate the rank of a matrix in current versions of R, one should use
  rankMat <- function(x) qr(x, LAPACK = TRUE)$rank
(as was suggested as one possibility in the original thread)
Regards, and thank you again, Ravi.
Martin Maechler, ETH Zurich
RV 2.
RV Here I first created a random rectangular matrix, and then added a number of rows to it, where these new rows are the same as some of the original rows except for a tiny amount of noise, which I call eps. So, the degree of rank deficiency is controlled by eps. I let eps take on 3 values: 1.e-07, 1.e-10, and 1.e-14, and show that the optimal tol (in terms of being close to rankMat) depends on the level of precision (eps) in the matrix.
set.seed(123)
nrow <- 15
ncol <- 20
nsim <- 5000
ndefic <- 4   # number of nearly dependent rows
eps <- 1.e-07
rnk <- matrix(NA, nsim, 5)
for (j in 1:nsim) {
RV +   A <- matrix(rnorm(nrow*ncol), nrow, ncol)
RV +   newrows <- matrix(NA, ndefic, ncol)
RV +   for (i in 1:ndefic) {
RV +     newrows[i,] <- A[nrow-i,] + rnorm(ncol, sd = min(abs(A[nrow-i+1,])) * eps)
RV +   }
RV +
RV +   B <- rbind(A, newrows)
RV +   rnk[j,1] <- rankMat(B)
RV +   rnk[j,2] <- qr(B, tol = 1.e-07)$rank
RV +   rnk[j,3] <- qr(B, tol = 1.e-11)$rank
RV +   rnk[j,4] <- qr(B, tol = 1.0e-14)$rank
RV +   rnk[j,5] <- qr(B
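For readers without the original thread at hand, the Matlab-style rankMat() used in the experiments above can be sketched from the tol formula quoted earlier. This is a reconstruction under that assumption, not necessarily the exact function from the thread:

```r
## Sketch of a Matlab-style rank: count singular values above a tolerance
## proportional to the largest one (formula as quoted in the discussion).
rankMat <- function(A) {
  d   <- dim(A)
  sv  <- svd(A, nu = 0, nv = 0)$d                 # singular values, largest first
  tol <- max(d) * .Machine$double.eps * abs(sv[1])
  sum(sv > tol)                                   # "numerically nonzero" count
}

hilbert <- function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }
rankMat(hilbert(5))    # 5 : full rank for small n
rankMat(hilbert(15))   # < 15 : underestimates for larger n, cf. the table above
```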
Re: [R] Package design, placement of legacy functions
WA == William Asquith [EMAIL PROTECTED] on Sun, 22 Jul 2007 13:29:24 -0500 writes: WA I have a function XOLD() from a nearly verbatim port of legacy WA FORTRAN in a package. I have reimplemented this function as XNEW() WA using much cleaner native R and built-in functions of R. I have WA switched the package to XNEW(), but for historical reasons would WA like to retain XOLD() somewhere in the package directory WA structure. A note in a README or similar will point to this WA historical function, and the output from the two should be numerically WA equal. WA Placement in package/R is not an option as XOLD() no longer WA constitutes a true user-level function; would package/inst/legacyR or WA something like that be suitable to the R community? Yes, put it somewhere inside pkg/inst/ (and end the filename in *.R). A user of your **installed** package will be able to use system.file("legacy.R", package = "pkg") e.g., as source(system.file("legacy.R", package = "pkg")) WA Thanks for the guidance. . . you're welcome. Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nearest correlation to polychoric
RV == Ravi Varadhan [EMAIL PROTECTED] on Fri, 13 Jul 2007 10:52:36 -0400 writes: RV Martin, I sent you the Matlab code for this which I had RV obtained from Nick Higham. RV Cheng, Sheung Hun and Higham, Nick (1998) A Modified RV Cholesky Algorithm Based on a Symmetric Indefinite RV Factorization; \emph{SIAM J. Matrix Anal.\ Appl.}, \bold{19}, 1097--1110. RV Do you remember? I do now .. last November. I'm embarrassed to admit that I hadn't found the time back then and had completely forgotten about it. But now that Jens already has a version of this in R -- thanks Jens -- this should really find a home in the R universe. As a matter of fact, I was thinking of porting the algorithm to the 'Matrix' package eventually; the specific polychor-related case may still fit better in John Fox's package. Martin
Re: [R] Extending Matrix class
This is from a private question which I'm given permission to answer in public: IF == Ingo Feinerer [EMAIL PROTECTED] on Fri, 13 Jul 2007 16:14:07 +0200 writes: IF Hello, We tried to derive a class from Matrix but had IF some problems. Maybe you can help us:
  library(Matrix)
  m <- Matrix(c(1:3, rep(0,12), 16:20), nrow = 4, ncol = 5)
  setClass("TermDocMatrix", representation(Weighting = "character"), contains = "Matrix")
IF Now we want to do something like:
IF   new("TermDocMatrix", .Data = m, Weighting = "foobar")
IF which obviously does not work due to the missing .Data IF slot. yes, obviously, indeed. There is never any .Data slot in our matrices. IF Note that we do not know in advance what the IF matrix m actually is (we only know it is *some* IF Matrix, e.g., we do not know if it is a dgCMatrix or a IF lgCMatrix or ...). Is there a (simple) solution? Well, yes, but probably not the one you had wanted:
  setClass("TD_Matrix", representation(data = "Matrix", Weighting = "character"))
  A <- spMatrix(10, 20, i = c(1,3:8), j = c(2,9,6:10), x = 7 * (1:7))
  tdr <- new("TD_Matrix", data = A, Weighting = "foobar")
  tdr
Now I understand that and why you had wanted to do this the original way you did - which cannot work AFAICS. OTOH, I wonder if other useRs, particularly those who know about S4 classes (and the Matrix classes), have better proposals, maybe along the following dream .. I think what Ingo would want is to say: let me extend the full Matrix class hierarchy (or just the dsparseMatrix sub-hierarchy) by a new slot 'Weighting' and hence by default inherit all methods which I don't explicitly set myself.
I think this can only partially work, even for a future version of R, since the inherited methods don't know what to do with Weighting, but it would already be interesting if all necessary new methods could be defined semi-automatically: 1) call the corresponding Matrix method 2) pass on all my extra slots Regards, Martin
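That "semi-automatic" recipe can be illustrated by hand for one concrete subclass and one method. The class name "TDM" and the choice of dgCMatrix as the base class are illustrative assumptions of mine, not code from the thread:

```r
library(Matrix)

## Extend ONE concrete Matrix class by the extra slot ...
setClass("TDM", contains = "dgCMatrix",
         representation(Weighting = "character"))

## ... and define each needed method following the two-step recipe:
## 1) call the corresponding Matrix method, 2) pass on the extra slot.
setMethod("t", "TDM", function(x) {
  r <- as(callNextMethod(), "dgCMatrix")    # Matrix's transpose of the data part
  new("TDM", r, Weighting = x@Weighting)    # re-attach the extra slot
})

A  <- Matrix(c(1:3, rep(0, 9)), 3, 4, sparse = TRUE)  # a dgCMatrix
td <- new("TDM", A, Weighting = "foobar")
t(td)@Weighting   # "foobar" -- the slot survives the inherited operation
```

The obvious drawback, as noted above, is that every method touching the object needs such a wrapper; methods not wrapped will silently return plain Matrix objects without the extra slot.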
Re: [R] create a matrix from an excel file?
TS == Tavpritesh Sethi [EMAIL PROTECTED] on Sat, 14 Jul 2007 12:59:03 +0530 writes: TS how do you create a matrix from an excel file read into TS R by the command read.delim? data.matrix(.) {is typically a bit more useful than as.matrix(.) }
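A tiny illustration of the difference, on a made-up data frame of the kind read.delim() might return (a numeric column plus a factor column):

```r
d <- data.frame(x = 1:3, g = factor(c("a", "b", "a")))
as.matrix(d)     # mixed column types => a *character* matrix
data.matrix(d)   # everything coerced to numbers => a numeric matrix
                 # (the factor is replaced by its integer codes 1, 2, 1)
```

Note that data.matrix() gives factor *codes*, not the printed labels, so columns that look numeric but were read as factors need as.numeric(as.character(.)) instead.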
Re: [R] nearest correlation to polychoric
DR == Dimitris Rizopoulos [EMAIL PROTECTED] on Fri, 13 Jul 2007 14:43:08 +0200 writes: DR you could also have a look at function posdefify() from DR package `sfsmisc'. DR I hope it helps. Yes, thanks, Dimitris; note that my posdefify() function uses a pretty arbitrary fudge value for posdefification, namely eps.ev = 1e-07. As a matter of fact, earlier this week (in the first international R/Rmetrics workshop), I've talked to people from finance who also need that (or something better?). Jordi Molins Coronado (Madrid) drew my attention to an idea he found in the book (English re-edition of French as of 1996) Jean-Philippe Bouchaud (2000) Theory of Financial Risk and Derivative Pricing: From Statistical Physics to Risk Management which supposedly uses the theory of random matrices and the entailing distribution of random eigenvalues in order to find a more sensible cutoff than my 'eps.ev' default of 1e-7. Unfortunately that book is checked out from our library and I can't have a look. Googling and Wikipedia seem to indicate to me that most of the random matrix theory does not directly apply here, since I'm really interested in the spectrum of X'X where X is a de-meaned n x p random matrix. OTOH, help(posdefify) already mentions more sophisticated approaches to the problem, the one which (as I vaguely remember) I thought should be made available being Cheng, Sheung Hun and Higham, Nick (1998) A Modified Cholesky Algorithm Based on a Symmetric Indefinite Factorization; \emph{SIAM J. Matrix Anal.\ Appl.}, \bold{19}, 1097--1110. Jens, could you make your code (mentioned below) available to the community, or even donate it to be included as a new method of posdefify()? Regards, Martin Maechler, ETH Zurich DR Best, Dimitris DR Dimitris Rizopoulos Ph.D.
Student Biostatistical DR Centre School of Public Health Catholic University of DR Leuven DR Address: Kapucijnenvoer 35, Leuven, Belgium Tel: DR +32/(0)16/336899 Fax: DR +32/(0)16/337015 Web: DR http://med.kuleuven.be/biostat/ DR http://www.student.kuleuven.be/~m0390867/dimitris.htm DR - Original Message - From: Jens Oehlschlägel DR [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: DR Friday, July 13, 2007 2:25 PM Subject: [R] nearest DR correlation to polychoric DR Dear all, DR Has someone implemented in R (or any other language) DR Knol DL, ten Berge JMF. Least-squares approximation of DR an improper correlation matrix by a proper one. DR Psychometrika, 1989, 54, 53-61. DR or any other similar algorithm? DR Best regards DR Jens Oehlschlägel DR Background: DR I want to factanal() matrices of polychoric correlations DR which have negative eigenvalues. I coded DR Higham (2002) Computing the nearest correlation matrix - DR a problem from finance, IMA Journal of Numerical DR Analysis (2002), 22, 329-343. DR which basically works but still leaves very small DR negative eigenvalues which cause factanal() to fail DR with factanal(cov=ncor$cor, factors=2) DR Error in optim(start, FAfn, FAgr, method = "L-BFGS-B", DR lower = lower, : L-BFGS-B needs finite values of 'fn' DR In addition: Warning message: NaNs produced in: log(x) DR [session: R version 2.4.1 (2006-12-18), svn rev 40228, platform i386-pc-mingw32]
Re: [R] Problem/bug with smooth.spline and all.knots=T
H == Hubertus [EMAIL PROTECTED] on Wed, 4 Jul 2007 14:58:44 +0200 writes: H Dear list, H if I do H smooth.spline(tmpSec, tmpT, all.knots=T) H with the attached data, Thanks for providing the data via URL (see below) H with the attached data, I get this error-message: H Error in smooth.spline(tmpSec, tmpT, all.knots = T) : H smoothing parameter value too small H If I do H smooth.spline(tmpSec[-single arbitrary number], tmpT[-single arbitrary number], all.knots=T) H it works! not quite (see below)
I have transformed your reports into something reproducible (as *all* such R-help messages should be!):

dFile <- "/sspl_Hub_data.rda"  ## use a file name you can write to
if(file.exists(dFile)) {
  load(dFile)
} else {
  load(url("http://phi.stonemonkey.org/smoothspline.rda"))
  d.tmp <- data.frame(Sec = tmpSec, T = tmpT)
  save(d.tmp, file = dFile)
}
str(d.tmp)
## 'data.frame': 2055 obs. of 2 variables:
##  $ Sec: num 25743 25744 25745 25746 25747 ...
##  $ T  : num 288 288 288 288 288 ...

## Dear list,
## if I do
ssT <- with(d.tmp, smooth.spline(Sec, T, all.knots=TRUE))
## with the attached data, I get this error-message:
## Error in smooth.spline(tmpSec, tmpT, all.knots = T) :
##   smoothing parameter value too small
## MM: it works fine for me on 64-bit, RH5 (lynne),
## but I can confirm the problem on my 32-bit Pentium M notebook:
ssT <- with(d.tmp, smooth.spline(Sec, T, all.knots=TRUE,
                                 control.spar = list(trace = TRUE)))
## gives (on nb-mm):
## sbart (ratio = 1.1399039e-11) iterations; initial tol1 = 3.334042e-05 :
##        spar            GCV      b - a        e Kind NEW       lspar         crit
## ---
## -0.35410197 0.000220934951 3.e+00            0  GS  --  1.61033e-11   0.00027127
## -0.35410197 0.000220934951 1.8541e+00   1.8541  FP  GS  8.47468e-20  1.57859e-05
## -0.79179607 1.57858803e-05 1.1459e+00  -1.1459  FP  GS  9.41382e-22    3.555e-12
## -1.06230590 3.55500244e-12 7.0820e-01  -0.7082  FP  PI  3.86481e-21  2.08386e-10
## -1.06230590 3.55500244e-12 5.2259e-01 -0.27051  FP  PI   1.9073e-21  1.13107e-09
## -1.06230590 3.55500244e-12 4.8014e-01 0.084898  FP  GS  5.83319e-23  1.04372e-18
## -1.22949017 1.04371872e-18 4.3769e-01 -0.43769  FP  PI  2.30006e-22  4.72251e-15
## -1.22949017 1.04371872e-18 3.5298e-01 -0.16718  FP  PI   1.1561e-22  2.37808e-16
## -1.22949017 1.04371872e-18 3.1163e-01 0.082471  FP  PI  7.90222e-23  2.03459e-15
## -1.22949017 1.04371872e-18 2.8876e-01 0.041121  FP  GS   1.0457e-23  2.03459e-15
##   1.0457e-23 1.04372e-18
## Error in smooth.spline(Sec, T, all.knots = TRUE, control.spar = list(trace = 2)) :
##   smoothing parameter value too small

## Where it works, I get
ssT
## Call:
## smooth.spline(x = Sec, y = T, all.knots = TRUE)
## Smoothing Parameter  spar= -1.044387  lambda= 1.266990e-21 (22 iterations)
## Equivalent Degrees of Freedom (Df): 2054.954
## Penalized Criterion: 4.420969e-20

## OTOH, need a relaxed convergence criterion:
ssT. <- with(d.tmp, smooth.spline(Sec, T, all.knots=TRUE,
                                  control.spar = list(trace = TRUE, tol=1e-4, eps= 0.01)))
ssT.
##-> Df: 2055.163 --- interpolation
##-- really almost fails to do spline *interpolation* as a limiting case.

In principle that's well known: the smoothing spline solution tends to an interpolation spline as lambda -> 0; however, the formula (algorithm) that is used for smoothing spline fitting (given lambda) is not applicable to the limiting case, lambda=0. I agree that this is a small remaining flaw of an already somewhat sophisticated algorithm for determining the smoothing parameter. I'm sure we could fix the algorithm to work in this case, however to do the fix in such a way that all other cases remain returning the exact same solution as now seems a harder task. And changing the default behavior of smooth.spline(), even if only slightly, is quite problematic. I know it, because I did it many many R versions back...
For your data, I see many possibilities:
- don't use all.knots = TRUE
- really use spline interpolation:  library(splines); ?interpSpline
- use 'df = <something reasonable>' if you want to apply it to many datasets
- don't use 'all.knots = TRUE' or use 'cv=TRUE' or ... :

## What about proper crossvalidation instead of GCV (default),
## which I'd recommend anyway on today's fast computers :
ssTg <- with(d.tmp, smooth.spline(Sec, T, all.knots=TRUE, cv= TRUE,
                                  control.spar = list(trace = TRUE)))
## *does* work (interestingly)
ssTg  # hmm, also Df = 2055
plot(T ~ Sec, data = d.tmp, pch = ".")
lines(predict(ssTg), col=2)
## don't use all.knots:
ssTg. <- with(d.tmp, smooth.spline(Sec, T, cv= TRUE, control.spar =
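The interpolation alternative mentioned above can be sketched on toy data (synthetic x/y here, since the original data set lives behind the URL above):

```r
library(splines)
x <- seq(0, 10, length.out = 50)
y <- sin(x)
isp <- interpSpline(x, y)    # exact interpolation: no smoothing parameter at all
p <- predict(isp, x)
max(abs(p$y - y))            # essentially 0: the curve passes through every point
```

This sidesteps the lambda -> 0 limit entirely, which is exactly why it is the natural choice when the data really should be interpolated.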
Re: [R] align() function missing in R ?
Hi Markus, You can't assume that a typical R user knows much about S+. R has been beyond S+ for a long time {{ :-) :-) please, Insightful staff, don't start to jump at me !}} Even I, as a very long time S and S-plus user (of the past: 1987--~1997), have never, I think, used align(). Can you give *reproducible examples* of what align() does for you? Then, kind R users will typically show you simple ways to achieve the same. Also: R is Free Software (i.e. open source and more), so we'd be happy to accept offers of an align() function that behaved compatibly with (``or better than'') the S-plus one. Note however that you'd typically not be allowed to copy the S-plus implementation. Martin ML == Markus Loecher [EMAIL PROTECTED] on Thu, 28 Jun 2007 11:10:51 -0400 writes: ML Dear list members, I switched from Splus to R a few ML years ago and so far found no functionality missing. ML However, I am struggling to find the equivalent align() ML function for time series. I did find some reduced ML functionality such as alignDailySeries in ML package:fCalendar but the full capability of aligning ML two time series seems to be missing. Could this be true ML ? I am sure there must be a need for this useful ML function. Any help would be greatly appreciated. ML Thanks ! ML Markus
Re: [R] Boxplot issues
SE == S Ellison [EMAIL PROTECTED] on Fri, 22 Jun 2007 13:02:20 +0100 writes: SE Boxplot and bxp seem to have changed behaviour a bit of late (R 2.4.1). Or maybe I am mis-remembering. SE An annoying feature is that while at=3:6 will work, there is no way of overriding the default xlim of 0.5 to n+0.5. That prevents plotting boxes on, for example, interval scales - a useful thing to do at times. I really can see no good reason for bxp to hard-code the xlim=c(0.5, n+0.5) in the function body; it should be a parameter default conditional on horizontal=, not hard coded. SE Also, boxplot does not drop empty groups. I'm sure it used to. I know it is good to be able to see where a factor level is unpopulated, but it's a nuisance with fractional factorials and some nested or survey problems when many are known to be missing and are of no interest. Irrespective of whether my memory is correct, the option would be useful. How hard can it be to add a 'drop.empty=F' default to boxplot to allow it to switch? SE Obviously, these are things I can fix locally. But who 'owns' boxplot so I can provide suggested code to them for later releases? Legally speaking, I think that's a hard question whose answer may even depend on the country where it is answered. I would like to say it is owned by the R Foundation. Suggested improvements of the R base code should be made and discussed on the R-devel mailing list. That's exactly the main purpose of that list. Such propositions typically make it into the code base if you are convincing and you provide code improvements that convince at least one member of R core that it's worth his time to implement, document, *and* test the changes. Also, as on R-help, it helps to work with small reproducible (ideally cut-n-pastable) R code examples.
Regards, Martin Maechler SE Steve Ellison
Re: [R] Matrix *package*, CHOLMOD error: problem too large
[Jose, if you call the Matrix *package* library once more, ... GR! ..] DM == Duncan Murdoch [EMAIL PROTECTED] on Fri, 22 Jun 2007 14:04:03 -0400 writes: DM On 6/22/2007 1:26 PM, Jose Quesada wrote: I have a pretty large sparse matrix of integers: dim(tasa) [1] 91650 37651 I need to add one to it in order to take logs, but I'm getting the following error: tasa = log(tasa + 1) CHOLMOD error: problem too large Error in asMethod(object) : Cholmod error `problem too large' I have 2 Gb of RAM, and the current workspace is barely 300mb. Is there any workaround to this? Anyone has any experience with this error? DM If tasa is sparse, then tasa+1 will not be sparse, so DM that's likely your problem. [of course] DM You might have better luck with DM log1p(tasa) {very good point, thank you, Duncan!} DM if the authors of the Matrix package have written a DM method for log1p(); if not, you'll probably have to do DM it yourself. They have not yet. Note however that this - and expm1() - would automagically work for sparse matrices if these two functions were part of the Math S4 group generic. I'd say that there's only historical reason for them *not* to be part of Math, and I am likely going to propose to change this. Martin Maechler DM Duncan Murdoch
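In the meantime, one can exploit log1p(0) == 0 and transform only the nonzero entries by hand; a sketch for a dgCMatrix (the small matrix here is a stand-in for 'tasa', and the direct @x-slot manipulation is safe only because the zero pattern is unchanged):

```r
library(Matrix)
tasa <- Matrix(c( 0, 10, 0, 0,
                 20,  0, 0, 30), nrow = 2, sparse = TRUE)  # a small dgCMatrix
ltasa <- tasa
ltasa@x <- log1p(ltasa@x)   # log1p(0) == 0, so zeros stay zeros => still sparse
ltasa[2, 1]                 # log(21) : log1p applied to the nonzero 20
```

This never materializes a dense matrix, which is exactly what made log(tasa + 1) blow up.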
Re: [R] if within a function
HO == Hong Ooi [EMAIL PROTECTED] on Thu, 21 Jun 2007 15:49:42 +1000 writes: HO R doesn't use the 'functionname = result' idiom to return a value from a HO function. It looks like you're after:
HO aaa <- function(a)
HO {
HO   if(a == 1) return(1)
HO   if(a != 1) return(2)
HO }
HO or
HO aaa <- function(a)
HO {
HO   if(a == 1) 1
HO   else 2
HO }
HO see ?return
or to continue the Variations on a theme :
  aaa <- function(a) if(a == 1) 1 else 2
(You don't need { .. } : some people argue you should always use them for defensive programming where I would not use them in simple one liners, but would use them otherwise ) Martin Maechler, ETH Zurich
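If 'a' may be a vector, a vectorized variation is possible too (the name aaa_v is my own; note ifelse() recycles and evaluates elementwise, so it is not a drop-in replacement for if/else on scalars):

```r
aaa_v <- function(a) ifelse(a == 1, 1, 2)
aaa_v(c(1, 3, 1))   # 1 2 1
```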
Re: [R] Could not find lmer function in {Matrix} package
SB == Steve Brady [EMAIL PROTECTED] on Tue, 19 Jun 2007 11:59:15 -0400 writes: SB I am having trouble calling the lmer function in the {Matrix} SB package. I first installed and loaded {Matrix} as follows: install.packages("Matrix"); library(Matrix) SB The package loaded successfully, however when I attempted to call SB lmer, I received the following message: SB Error: could not find function "lmer" SB I also tried: SB ?lmer SB which produced no search results. And who told you lmer() was in the Matrix package? It's in the lme4 package, and --- conceptually --- has always been there. Only for some maintenance convenience (C code shared between lme4 and Matrix) reasons, lmer() has actually been in the Matrix package for some time in the past, however you were always supposed to say require(lme4) or library(lme4) to get to lmer. Regards, Martin SB I have attempted these actions in both MacOSx R Versions 2.4.1 and SB 2.5.0. I have also tried this in Version 2.5.1. beta on a Windows SB machine. SB Thanks in advance for any suggestions. SB Steve
Re: [R] Matrix library error: should never happen; please report
Hi Jose, JQ == Jose Quesada [EMAIL PROTECTED] on Tue, 19 Jun 2007 21:12:53 +0200 writes: JQ Hi, I got the following error. Sorry but this time I JQ couldn't reproduce it with a simple chunk of code: .TM.repl.i.2col(): drop 'matrix' case ... Error in .nextMethod(x = x, i = i, j = j) : 'i' has no integer column number should never happen; please report In addition: Warning messages: 1: Ambiguous method selection for %*%, target ddiMatrix#dgCMatrix (the first of the signatures shown will be used) diagonalMatrix#CsparseMatrix ddenseMatrix#CsparseMatrix in: .findInheritedMethods(classes, fdef, mtable) JQ I got 4 other copies of the same warning. Will play JQ around a bit more... This is really strange. Yes, but - the Matrix library is the file Matrix.so or Matrix.dll which is part of the installed (aka binary) Matrix *package*. Maybe you really need to read the result of fortune("package.*Maechler") # after installing package 'fortunes' - "please report" was not meant to say to report to R-help, but to the package maintainers, - since you cannot reproduce it yet, we cannot do much about it. It may be a bug in the Matrix package (and Jose has told me that he's using the latest released version 0.99875-2), but in theory it could even be your own mistake, namely by wrongly manipulating the slots of a Matrix object. Please try to produce an R script - even if not small -- with a reproducible example; [and then do report to [EMAIL PROTECTED] JQ Thanks -- Jose Quesada, PhD. Best regards, Martin Maechler, ETH Zurich
Re: [R] Odp: Odp: outlying
[Note: CC'ing to R-SIG-robust, the Special Interest Group on using Robust Statistics in R ] PP == Petr PIKAL [EMAIL PROTECTED] on Tue, 19 Jun 2007 12:55:37 +0200 writes: PP [EMAIL PROTECTED] napsal dne 19.06.2007 PP 12:23:58: Hi It often depends on your attitude to limits for outlying observations. Boxplot has some identifying routine for selecting outlying points. Any procedure usually requires somebody to choose which observation is outlying and why. You can use e.g. all values which are beyond some threshold based on sd but that holds only if the distribution is normal. yes, and that's never true for the alternative, i.e. for the case where there *are* outliers.
  set.seed(1)
  x <- rnorm(x)
PP Sorry, it shall be
PP   x <- rnorm(1000)
PP   ul <- mean(x) + 3*sd(x)
PP   ll <- mean(x) - 3*sd(x)
PP   beyond <- (x > ul) | (x < ll)
PP   x[beyond]
PP   [1] 3.810277
Regards Petr No, really, do NOT do the above! It only works with very few and relatively mild outliers. There are much more robust alternatives. I show them for the simple example
  x <- c(1:10, 100)
1) As mentioned by Petr, use instead what boxplot() does, just type boxplot.stats and ``see what to do''. This gives Median +/- 1.5 * IQR : i.e.,
  ## Boxplot's default rule
  str(bp.st <- boxplot.stats(x))
  bp.st$stats[ c(1,5) ]  ## 1 10
2) Use the recommendations of Hampel (1985) @ARTICLE{HamF85, author = Hampel, F., title = The breakdown points of the mean combined with some rejection rules, journal = Technometrics, year = 1985, volume = 27, pages = 95--107, } i.e.
Median +/- 5 * MAD where MAD is the *NON*-scaled MAD, ~= mad(*, constant=1), i.e., in R
  M <- median(x)
  (FH.interval <- M + c(-5, 5) * mad(x, center=M, const=1))  ## -9 21
3) or something slightly more efficient (under approximate normality of the non-outliers), e.g., based on MASS::rlm() :
  n <- length(x)
  s.rm <- summary(robm <- MASS::rlm(x ~ 1))
  s.rm
  (cc <- coef(s.rm))
  ## approximate robust degrees of freedom; this is a hack
  ## which could well be correct
  ## asymptotically {where the weights would be 0/1} :
  (df.resid <- sum(robm$w) - robm$rank)
  (Tcrit <- qt(0.995, df = df.resid))
  ## Std.error of mean ~= sqrt(1/n Var(X_i)) = 1/sqrt(n) sqrt(Var(X_i))
  cc[,1] + c(-1,1) * sqrt(n) * Tcrit * cc[,"Std. Error"]  ## -6.391201 18.555177
--- Martin Maechler, ETH Zurich
Re: [R] Augment 'Matrix' matrices
SH == Scott Hyde [EMAIL PROTECTED] on Mon, 18 Jun 2007 16:59:00 -1000 (HST) writes: SH Martin, How does Matrix implement augmented matrices? I SH tried this and got the expected result: {Replying to R-help, since this question has come up several times }
  V = matrix(1,2,3)
  V = cbind(V,V)
  V
SH      [,1] [,2] [,3] [,4] [,5] [,6]
SH [1,]    1    1    1    1    1    1
SH [2,]    1    1    1    1    1    1
SH But when I did it with Matrix instead I got:
  library(Matrix)
  V = Matrix(1,2,3)
  V = cbind(V,V)
  V
SH   V V
SH [1,] ? ?
cbind() and rbind() cannot work with S4 objects because their first formal argument is '...' [ and with S3 objects they only work insofar as S3 generics can work: i.e. they only work when the first argument is of the respective class, but fail, e.g. for cbind(1, <object>) when <object> is of a non-standard S3 class. ] In earlier versions of Matrix, there was a sophisticated hack that made cbind() and rbind() work. But because it was a hack, and some people called it horrible rather than sophisticated, we had to give it up. {well, the really compelling argument was an example of do.call(rbind, <list of length 1000>) which was *very* inefficient} Instead, cbind2() and rbind2() have been written a few R versions ago to be used as (S4) generic functions. --> help(cbind2) In 'Matrix', we also define cBind() and rBind() to be used as direct (n-argument) substitutes for cbind() or rbind(), respectively. Martin
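So the working equivalent of the failing example, using the two-argument generic for which Matrix provides methods, looks like this:

```r
library(Matrix)
V <- Matrix(1, 2, 3)
W <- cbind2(V, V)   # the 2-argument S4 generic; Matrix defines the methods
dim(W)              # 2 6
```

(In recent R versions, plain cbind() also dispatches to cbind2() for S4 objects, so this workaround is mostly of historical interest; cbind2() itself still works.)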
Re: [R] Responding to a posting in the digest
Thanks a lot, Ted, for your comprehensive answer! [See one short note way below: ] TH == Ted Harding [EMAIL PROTECTED] on Thu, 14 Jun 2007 09:54:04 +0100 (BST) writes: TH On 14-Jun-07 07:26:26, Moshe Olshansky wrote: Is there a convenient way to respond to a particular posting which is a part of the digest? I mean something that will automatically quote the original message, subject, etc. Thank you! Moshe Olshansky [EMAIL PROTECTED] TH This will depend on two things. TH 1. Whether the mail software you use has the capability; TH 2. Whether the digest format would permit it anyway. TH Regarding (2), if you are receiving R-help in TH traditional digest format (all the messages, each with TH its principal headers, as one single long message-body), TH then the only way to respond to a particular message is TH to start to compose a new message and copy what you need TH from the digest. TH While I've never received R-help in digest format TH myself, according to Martin Maechler: TH http://finzi.psych.upenn.edu/R/Rhelp02a/archive/59429.html TH Please open the URL at the end of every message TH https://stat.ethz.ch/mailman/listinfo/r-help go to the TH bottom and log in -- clicking the [Unsubscribe or Edit TH Options] field. You need your mailing list password TH sooner or later. The one you get sent every 1st of the TH month; or you can have it sent to you again. TH Then you are in a page entitled R-help Membership TH Configuration for foo@bar Scroll down to the section TH Your R-help Subscription where the 3rd entry is TH entitled Get MIME or Plain Text Digests? and now you TH want MIME. TH In MIME digest format, each message with its own main TH headers is a separate MIME attachment, and suitable mail TH software can bring any message up on its own. You can TH then reply in the normal way.
TH However (and here is where I'm ignorant as a result of TH never having received R-help as digest), your reply may TH not continue the thread -- since this depends on TH message-identifier headers being present which allow TH threading software to trace which messages are replies TH to which message. The JISCMAIL MIME digest for the TH AllStat mailing list only includes a Message-ID for the TH digest as a whole, i.e. the ID for the entire digest TH message. Message-IDs for the individual messages in the TH digest (as would be seen by people who received them TH singly) are absent: you only get the likes of TH Date: DoW, DD Mon HH:MM:SS TZ From: Sender TH (person who sent the message to the list) Subject: TH Subject of individual message MIME-Version: 1.0 TH Content-Type: text/plain; charset=iso-8859-1 TH Content-Transfer-Encoding: quoted-printable TH and no Message ID for the original message from TH Sender. So any reply to this component message is not TH identifiable as belonging to its thread. TH I don't know whether R-help's 'mailman' provides such TH headers (Martin??). Yes, it does (I've checked with a pseudo-user who receives r-help in digests in MIME format). So you can indeed do the following. In my limited experience, the main problem is the bad quality of people's e-mail software which does not properly work with the (typically invisible) 'References:' and 'In-Reply-To:' headers which mailman indeed does preserve in its MIME-digests. TH If it does, then your reply could TH include an In-Reply-To: which identifies the TH thread. Otherwise it can't. TH As to (1), you will probably get several suggestions for TH suitable mail software. My own (see below) opens an TH AllStat digest in a window with attachment tags TH displayed, one for Table of Contents, one for each message.
Clicking on one of these opens a new window TH with the message attached to that tag displayed, and now TH the usual reply/forward etc mail functions can be TH applied to that message. But it will reply only to the TH address given in the From: header (i.e. the original TH sender, as above), not to the AllStat list (so you have TH to enter that address by hand, if you want to reply to TH the list). TH In principle, mailer software could also identify the TH address of the list from which the digest has been sent, TH as well as the sender of the original message, so you TH could get the option to reply to either or both. But my TH XFMail does not, and only offers the original TH sender. Whether other mailer software can do this is for TH others to comment on! TH Hoping this helps, Ted.
Re: [R] Package update announcements
MHHS == Martin Henry H Stevens [EMAIL PROTECTED] on Mon, 11 Jun 2007 17:23:46 -0400 writes: MHHS Hi Folks, I was wondering what everyone thought about MHHS adding a sentence to each package update announcement MHHS that described what the package did. R extensions are MHHS so numerous that it is difficult to keep up with them. MHHS Would it be appropriate to ask package developers to MHHS add a brief sentence about what the package does, when MHHS they announce updates? Thanks a lot, Hank! I've been supporting your suggestion ever since I created 'R-packages' ( = R package announcements mailing list). As moderator, I even occasionally rejected postings to R-packages exactly by requiring such short information preceding any notice of changes. I think that 99% of the package authors would also agree, but then we are all humans and hence sometimes get into peculiar world views such as "there's my package, and there's R and then the rest of the universe" ;-) Martin Maechler, ETH Zurich MHHS I would benefit from such descriptions. MHHS Cheers, Hank MHHS Dr. Hank Stevens, Assistant Professor 338 Pearson Hall MHHS Botany Department Miami University Oxford, OH 45056 MHHS Office: (513) 529-4206 Lab: (513) 529-4262 FAX: (513) MHHS 529-4243 http://www.cas.muohio.edu/~stevenmh/ MHHS http://www.muohio.edu/ecology/ MHHS http://www.muohio.edu/botany/ MHHS E Pluribus Unum MHHS __ MHHS R-help@stat.math.ethz.ch mailing list MHHS https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do MHHS read the posting guide MHHS http://www.R-project.org/posting-guide.html and MHHS provide commented, minimal, self-contained, MHHS reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pnorm how to decide lower-tail true or false
CM == Carmen Meier [EMAIL PROTECTED] on Fri, 08 Jun 2007 19:31:49 +0200 writes: CM Hi to all, maybe the last question was not clear enough. CM I did not find any hints how to decide whether it CM should use lower.tail or not. As it is an extra CM R-feature ( written in CM http://finzi.psych.upenn.edu/R/Rhelp02a/archive/66250.html CM ) I do not find anything about it in any of my statistical CM books. Yes, most statistical books do not consider numerical accuracy, which is the real issue here. Note that R is much more than a statistical package and hence, to be appreciated properly, needs much broader (applied) mathematical, statistical and computer science knowledge ;-) When p ~= 1, '1 - p' suffers from so-called cancellation (Numerical Analysis 101). If you already know that you will use q := 1 - p, rather compute 'q' directly than first compute p, then 1 - p, losing all accuracy. All of R's pfoo(..) functions have an argument 'lower.tail' which is TRUE by default, since after all, pfoo(x) = Prob_{foo}[X <= x] measures the probability of the lower or left tail of the foo-distribution. foo = norm is just a special case. If you really want q = 1 - pfoo(x) = Prob_{foo}[X > x] then you can get this directly via q <- pfoo(x, lower.tail = FALSE, ...) Simple example with R:

> pnorm(10)
[1] 1
> 1 - pnorm(10)
[1] 0
> pnorm(10, lower.tail=FALSE)
[1] 7.619853e-24

Regards, Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Why is the R mailing list so hard to figure out?
VE == Vladimir Eremeev [EMAIL PROTECTED] on Tue, 5 Jun 2007 01:39:47 -0700 (PDT) writes: VE irishhacker wrote: Why does the R mailing list need such an unusual and customized user interface? Well, that's VERY MUCH a matter of point of view, and the view is most probably very highly dependent on your year of birth. For many old timers like me, e-mail is at least as natural as web browsing. I see that such an attitude becomes more and more outdated. VE There was a discussion of this some time ago on the VE list. I believe, RSiteSearch(r-help mailing list VE forum) or some other similar keywords will find it. VE irishhacker wrote: What's the best way to view and read discussions in this group for recent days? Can I view the postings for the current day via Google Groups? Well, one way is still *subscribing* to R-help and occasionally helping each other, instead of just perusing it as a resource ... GMANE was mentioned already. VE I use the Nabble interface VE (http://www.nabble.com/R-f13819.html), it seems more VE convenient to me, but it collects less lists than Gmane. Yes, but it appends its silly footer (aka advertizement) to all mails sent out, luring people to use it. I (as mailing list manager) have considered more than once to just evaporate such footers as I evaporate most other spam-footers (as e.g. from hotmail, yahoo, gmx, ...). Martin Maechler, ETH Zurich VE -- View this message in context: VE http://www.nabble.com/Why-is-the-R-mailing-list-so-hard-to-figure-out--tf3868661.html#a10965539 VE Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] opinions please: text editors and reporting/Sweave?
Jared == Jared O'Connell [EMAIL PROTECTED] on Thu, 31 May 2007 11:28:11 +0800 writes: Jared Winshell (http://www.winshell.de/) is another (free) option if you want a Jared Windows editor with good MikTEX integration. Looks like it. Note however that the above free is only as in free beer not as in free speech. In other words, Winshell is *not* 'Free Software' / 'Software Libre' nor is it Open Source Software. Tinn-R, as R itself, *is* Free Software (and Emacs and ESS and (La)TeX are too). Martin Maechler, ETH Zurich Jared On 5/31/07, Duncan Murdoch [EMAIL PROTECTED] wrote: Tim Howard wrote: dear all - I currently use Tinn-R as my text editor to work with code that I submit to R, with some output dumped to text files, some images dumped to pdf. (system: Windows 2K and XP, R 2.4.1 and R 2.5). We are using R for overnight runs to create large output data files for GIS, but then I need simple output reports for analysis results for each separate data set. Thus, I create many reports of the same style, but just based on different input data. I am recognizing that I need a better reporting system, so that I can create clean reports for each separate R run. This obviously means using Sweave and some implementation of LaTex, both of which are new to me. I've installed MikTex and successfully completed a demo or two for creating pdfs from raw LaTeX. It appears that if I want to ease my entry into the world of LaTeX, I might need to switch editors to something like Emacs (I read somewhere that Emacs helps with the TeX markup?). After quite a while wallowing at the Emacs site, I am finding that ESS is well integrated with R and might be the way to go. Aaaagh... I'm in way over my head! If you are used to Windows, you might find the shareware editors WinEdt or Textpad more familiar. WinEdt has advantages of lots of LaTeX integration. 
Duncan Murdoch My questions: What, in your opinion, is the simplest way to integrate text and graphics reports into a single report such as a pdf file. If the answer to this is LaTeX and Sweave, is it difficult to use a text editor such as Tinn-R or would you strongly recommend I leave behind Tinn and move over to an editor that has more LaTeX help? In reading over Friedrich Leisch's Sweave User Manual (v 1.6.0) I am beginning to think I can do everything I need with my simple editor. Before spending many hours going down that path, I thought it prudent to ask the R community. It is likely I am misunderstanding some of the process here and any clarifications are welcome. Thank you in advance for any thoughts. Tim Howard __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Jared [[alternative HTML version deleted]] Jared __ Jared R-help@stat.math.ethz.ch mailing list Jared https://stat.ethz.ch/mailman/listinfo/r-help Jared PLEASE do read the posting guide http://www.R-project.org/posting-guide.html Jared and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] How to check for existence url from within a function?
Duncan == Duncan Murdoch [EMAIL PROTECTED] on Sat, 26 May 2007 08:02:11 -0400 writes: Duncan On 26/05/2007 7:13 AM, Heinz Tuechler wrote: Dear All, To check if an url exists, I can use try(). This works, as I expected, if I do it directly, as in the first part of the following example, but I could [..] Duncan .Last.value isn't set until your function returns. You should write this as

Duncan con.url <- try(url(url.string, open = 'rb'))
Duncan try.error <- inherits(con.url, "try-error")

Duncan Notice that I used inherits, rather than testing for equality. It's Duncan documented that the result of try() will be of class 'try-error' if an Duncan error occurs, but there may be circumstances (in the future?) where Duncan different types of errors are signalled by using a more complicated class. There's an additional reason for inherits(., .) and hence against class(foo) == "bar" : Whenever try() does not catch an error, since there was none, i.e., the result of try(foobar(..)) is just foobar(..), foobar(..) may well return an object with an (S3) class vector of length > 1. In those cases, the equality test returns a logical vector, typically c(FALSE, FALSE), and using that in if(.) gives at least a warning. Martin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
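As a minimal sketch of the pitfall Martin describes (the function and class names here are invented for illustration): when try() catches no error, the value it returns may carry an S3 class vector of length greater than one, so an equality test yields a multi-element logical that misbehaves inside if().

```r
## hypothetical function returning an object with a 2-element class vector
foobar <- function() structure(list(), class = c("myResult", "list"))

res <- try(foobar(), silent = TRUE)

class(res) == "try-error"   # c(FALSE, FALSE): length-2 logical, unsafe in if()
inherits(res, "try-error")  # FALSE: always a single logical value
```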
Re: [R] ratio distribution - missing attachment
Thank you, Ravi. You probably will have noticed that the attachment didn't make it to the mailing list. The reason is that we let the mailing list software strip binary attachments which can easily be misused to spread viruses; see --> http://www.r-project.org/mail.html (search "attachment") or the posting-guide. OTOH, the software allows attachments with MIME type text/plain. If you use an e-mail software for sophisticated users, the software allows you to specify the MIME type of your attachments; otherwise (as with most user friendly, modern e-mail software), attach a *.txt file (Doug Bates uses foo_R.txt) and it should make it to the lists; as a third alternative, just cut & paste the corresponding text into your e-mail. I think your R function should make it to R-help (and its archives), so I'd be thankful for a repost. Martin Ravi == Ravi Varadhan [EMAIL PROTECTED] on Fri, 25 May 2007 14:24:20 -0400 writes: Ravi Mike, Attached is an R function to do this, along with Ravi an example that will reproduce the MathCad plot shown Ravi in your attached paper. I haven't checked it Ravi thoroughly, but it seems to reproduce the MathCad Ravi example well. Ravi Ravi. Ravi Ravi --- Ravi Ravi Varadhan, Ph.D. Ravi Assistant Professor, The Center on Aging and Health Ravi Division of Geriatric Medicine and Gerontology Ravi Johns Hopkins University Ravi Ph: (410) 502-2619 Ravi Fax: (410) 614-9625 Ravi Email: [EMAIL PROTECTED] Ravi Webpage: Ravi http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html Ravi Ravi Ravi -Original Message- From: Ravi [EMAIL PROTECTED] Ravi [mailto:[EMAIL PROTECTED] On Behalf Of Ravi Mike Lawrence Sent: Friday, May 25, 2007 1:55 PM To: Ravi Lucke, Joseph F Cc: Rhelp Subject: Re: [R] Calculation Ravi of ratio distribution properties Ravi According to the paper I cited, there is controversy Ravi over the sufficiency of Hinkley's solution, hence Ravi their proposed more complete solution. 
Ravi On 25-May-07, at 2:45 PM, Lucke, Joseph F wrote: The exact ratio is given in On the Ratio of Two Correlated Normal Random Variables, D. V. Hinkley, Biometrika, Vol. 56, No. 3. (Dec., 1969), pp. 635-639. Ravi -- Mike Lawrence Graduate Student, Department of Ravi Psychology, Dalhousie University Ravi Website: http://myweb.dal.ca/mc973993 Public calendar: Ravi http://icalx.com/public/informavore/Public Ravi The road to wisdom? Well, it's plain and simple to Ravi express: Err and err and err again, but less and less Ravi and less. - Piet Hein __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] File path expansion
Duncan == Duncan Murdoch [EMAIL PROTECTED] on Fri, 25 May 2007 15:38:54 -0400 writes: Duncan On 5/25/2007 1:09 PM, Prof Brian Ripley wrote: On Fri, 25 May 2007, Martin Maechler wrote:

> path.expand("~")
[1] "/home/maechler"

Yes, but beware that may not do what you want on Windows in R >= 2.5.0, since someone changed the definition of 'home' but not path.expand. Duncan A more basic problem is that the definition of ~ Duncan in Windows is very ambiguous. Is it my Cygwin home Duncan directory, where cd ~ would take me while in Duncan Cygwin? most probably not (see below). The normal R Windows users needn't know about Cygwin. Duncan Is it my Windows CSIDL_PERSONAL folder, Duncan usually %HOMEDRIVE%/%HOMEPATH%/My Documents? Is it Duncan the parent of that folder, %HOMEDRIVE%/%HOMEPATH%? Duncan ~ is a shell concept that makes sense in Unix-like Duncan shells, but not in Windows. Hmm.. Let's just say "~" is a shortcut for "the user's home directory". And yes, that's been a Unix concept for ages, but I think Windows has adopted that concept, probably with the above %HOMEDRIVE%/%HOMEPATH% The fact that some Windows software may not work with the user's home directory (but rather a subdirectory of that) may be a separate issue, but then, I'm the windows-non-expert Martin Maechler __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] normality tests [Broadcast]
LuckeJF == Lucke, Joseph F [EMAIL PROTECTED] on Fri, 25 May 2007 12:29:49 -0500 writes: LuckeJF Most standard tests, such as t-tests and ANOVA, LuckeJF are fairly resistant to non-normality for LuckeJF significance testing. It's the sample means that LuckeJF have to be normal, not the data. The CLT kicks in LuckeJF fairly quickly. Even though such statements appear in too many (text)books, that's just plain wrong practically: Even though the *level* of the t-test is resistant to non-normality, the power is not at all!! And that makes the t-test NON-robust! It's an easy exercise to see that the T-statistic tends to 1 when one observation goes to infinity, i.e., the t-test will never reject when you have one extreme outlier; simple proof with R:

> t.test(11:20)

        One Sample t-test

data:  c(11:20)
t = 16.1892, df = 9, p-value = 5.805e-08
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 13.33415 17.66585
sample estimates:
mean of x
     15.5

## --> unknown mean highly significantly different from 0
## But

> t.test(c(11:20, 1000))

        One Sample t-test

data:  c(11:20, 1000)
t = 1.1731, df = 10, p-value = 0.2679
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -94.42776 304.42776
sample estimates:
mean of x
      105

LuckeJF Testing for normality prior to choosing a test LuckeJF statistic is generally not a good idea. Definitely. Or even: It's a very bad idea ... Martin Maechler, ETH Zurich LuckeJF -Original Message- From: LuckeJF [EMAIL PROTECTED] LuckeJF [mailto:[EMAIL PROTECTED] On Behalf LuckeJF Of Liaw, Andy Sent: Friday, May 25, 2007 12:04 PM LuckeJF To: [EMAIL PROTECTED]; Frank E Harrell Jr Cc: LuckeJF r-help Subject: Re: [R] normality tests [Broadcast] LuckeJF From: [EMAIL PROTECTED] On 25/05/07, Frank E Harrell Jr [EMAIL PROTECTED] wrote: [EMAIL PROTECTED] wrote: Hi all, apologies for seeking advice on a general stats question. 
I ve run normality tests using 8 different methods: - Lilliefors - Shapiro-Wilk - Robust Jarque Bera - Jarque Bera - Anderson-Darling - Pearson chi-square - Cramer-von Mises - Shapiro-Francia All show that the null hypothesis that the data come from a normal distro cannot be rejected. Great. However, I don't think it looks nice to report the values of 8 different tests on a report. One note is that my sample size is really tiny (less than 20 independent cases).Without wanting to start a flame war, are there any advices of which one/ones would be more appropriate and should be reported (along with a Q-Q plot). Thank you. Regards, Wow - I have so many concerns with that approach that it's hard to know where to begin. But first of all, why care about normality? Why not use distribution-free methods? You should examine the power of the tests for n=20. You'll probably find it's not good enough to reach a reliable conclusion. And wouldn't it be even worse if I used non-parametric tests? LuckeJF I believe what Frank meant was that it's probably LuckeJF better to use a distribution-free procedure to do LuckeJF the real test of interest (if there is one) instead LuckeJF of testing for normality, and then use a test that LuckeJF assumes normality. LuckeJF I guess the question is, what exactly do you want LuckeJF to do with the outcome of the normality tests? If LuckeJF those are going to be used as basis for deciding LuckeJF which test(s) to do next, then I concur with LuckeJF Frank's reservation. LuckeJF Generally speaking, I do not find goodness-of-fit LuckeJF for distributions very useful, mostly for the LuckeJF reason that failure to reject the null is no LuckeJF evidence in favor of the null. It's difficult for LuckeJF me to imagine why there's insufficient evidence to LuckeJF show that the data did not come from a normal LuckeJF distribution would be interesting. 
LuckeJF Andy Frank -- Frank E Harrell Jr Professor and Chair School of Medicine Department of Biostatistics Vanderbilt University -- yianni __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
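To make the point about distribution-free procedures concrete, here is a small sketch (data invented, reusing Martin's outlier example from earlier in the thread): one extreme outlier masks the t-test, while the rank-based Wilcoxon signed-rank test still rejects.

```r
x <- c(11:20, 1000)       # ten moderate values plus one huge outlier

t.test(x)$p.value         # ~ 0.27 : t-test cannot reject mu = 0
wilcox.test(x)$p.value    # ~ 0.001: Wilcoxon signed-rank test clearly rejects
```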
Re: [R] Make check failure for R-2.4.1
Adam == Adam Witney [EMAIL PROTECTED] on Fri, 25 May 2007 09:38:29 +0100 writes: Adam Thanks for your replies Details inline below: Adam On 24/5/07 17:12, Martin Maechler [EMAIL PROTECTED] wrote: UweL == Uwe Ligges [EMAIL PROTECTED] on Thu, 24 May 2007 17:34:16 +0200 writes: UweL Some of these test failures are expected from time to time, since they are using UweL random numbers. Just re-run. eehm, some of these, yes, but not the ones Adam mentioned, d-p-q-r-tests.R. Adam, if you want more info you should report to us the *end* (last dozen of lines) of your d-p-q-r-tests.Rout[.fail] file. Adam Ok, here they are...

[1] TRUE TRUE TRUE TRUE
> ##-- non central Chi^2 :
> xB <- c(2000, 1e6, 1e50, Inf)
> for(df in c(0.1, 1, 10))
+   for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) == 1)
Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
Execution halted

Ok, thanks; so, if we want to learn more, we need the output of something like

xB <- c(2000, 1e6, 1e50, Inf)
for(df in c(0.1, 1, 10))
  for(ncp in c(0, 1, 10, 100))
    print(pchisq(xB, df=df, ncp=ncp), digits == 15)

UweL BTW: We do have R-2.5.0 these days. Indeed! And gcc 2.95.4 is also very old. Maybe you've recovered an old compiler / math-library bug from that antique compiler suite ? Adam Yes, maybe I should start thinking about upgrading this box! yes, at least start ... ;-) Adam Thanks again Adam adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Make check failure for R-2.4.1
Adam == Adam Witney [EMAIL PROTECTED] on Fri, 25 May 2007 14:48:18 +0100 writes:

> ##-- non central Chi^2 :
> xB <- c(2000, 1e6, 1e50, Inf)
> for(df in c(0.1, 1, 10))
+   for(ncp in c(0, 1, 10, 100)) stopifnot(pchisq(xB, df=df, ncp=ncp) == 1)
Error: pchisq(xB, df = df, ncp = ncp) == 1 is not all TRUE
Execution halted

Ok, thanks; so, if we want to learn more, we need the output of something like

xB <- c(2000, 1e6, 1e50, Inf)
for(df in c(0.1, 1, 10))
  for(ncp in c(0, 1, 10, 100))
    print(pchisq(xB, df=df, ncp=ncp), digits == 15)

Adam Here are the results:

> xB <- c(2000, 1e6, 1e50, Inf)
> for(df in c(0.1, 1, 10))
+   for(ncp in c(0, 1, 10, 100))
+     print(pchisq(xB, df = df, ncp = ncp), digits == 15)
Error in print.default(pchisq(xB, df = df, ncp = ncp), digits == 15) :
        object "digits" not found

well, that's a typo - I think - you should have been able to fix (I said "something like ..."). Just replace the '==' by '=' Martin Adam Thanks again... Adam adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
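For completeness, the corrected snippet (with '=' rather than the typo '==') that Martin was asking Adam to run looks like this; on a correctly built R it prints values at or extremely close to 1 at 15 digits:

```r
xB <- c(2000, 1e6, 1e50, Inf)       # very large quantiles
for(df in c(0.1, 1, 10))
    for(ncp in c(0, 1, 10, 100))
        print(pchisq(xB, df = df, ncp = ncp), digits = 15)
```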
Re: [R] File path expansion
> path.expand("~")
[1] "/home/maechler"

RobMcG == McGehee, Robert [EMAIL PROTECTED] on Fri, 25 May 2007 11:44:27 -0400 writes: RobMcG R-Help, RobMcG I discovered a mis-feature in ghostscript, which is used by the bitmap RobMcG function. It seems that specifying file names in the form ~/abc.png RobMcG rather than /home/directory/abc.png causes my GS to crash when I open RobMcG the bitmap device on my Linux box. RobMcG The easiest solution would seem to be to intercept any file names in the RobMcG form ~/abc.png and replace the ~ with the user's home directory. I'm RobMcG sure I could come up with something involving regular expressions and RobMcG system calls to do this in Linux, but even that might not be system RobMcG independent. So, I wanted to see if anyone knew of a native R solution RobMcG of converting ~ to its full path expansion. RobMcG Thanks, RobMcG Robert __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
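A sketch of the native-R workaround for Robert's ghostscript problem (the file name and device type below are just examples): expand the tilde with path.expand() before handing the path to bitmap().

```r
fname <- path.expand("~/abc.png")   # e.g. "/home/robert/abc.png" on Linux
fname                                # no leading "~" left for ghostscript

## then open the device with the expanded name, e.g. (needs ghostscript):
## bitmap(fname, type = "png16m"); plot(1:10); dev.off()
```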
Re: [R] windows to unix
Erin == Erin Hodgess [EMAIL PROTECTED] on Fri, 25 May 2007 06:10:10 -0500 writes: Erin Dear R People: Erin Is there any way to take a Windows version of R, compiled from source, Erin compress it, and put it on a Unix-like environment, please? Since nobody has answered yet, let a die-hard non-windows user try : Just 'zip' the corresponding directory and copy the zip file to your unix-like environment. I assume the only things this does not contain would be the

- registry entries (which used to be optional anyway; I'm not sure if that's still true)
- desktop links to R
- startup menu links to R

but the last two can easily be recreated after people copy the zip file and unpack it in their windows environment -- which I assume is the purpose of the whole procedure.. {Please reply to R-help, not me; I am *the* windows-non-expert ..} Martin Erin thanks in advance, Erin Sincerely, Erin Erin Hodgess Erin Associate Professor Erin Department of Computer and Mathematical Sciences Erin University of Houston - Downtown Erin mailto: [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Make check failure for R-2.4.1
UweL == Uwe Ligges [EMAIL PROTECTED] on Thu, 24 May 2007 17:34:16 +0200 writes: UweL Some of these test failures are expected from time to time, since they are using UweL random numbers. Just re-run. eehm, some of these, yes, but not the ones Adam mentioned, d-p-q-r-tests.R. Adam, if you want more info you should report to us the *end* (last dozen of lines) of your d-p-q-r-tests.Rout[.fail] file. UweL BTW: We do have R-2.5.0 these days. Indeed! And gcc 2.95.4 is also very old. Maybe you've recovered an old compiler / math-library bug from that antique compiler suite ? Martin UweL Uwe Ligges UweL Adam Witney wrote: I'm trying to install R-2.4.1, everything configure's and make's OK, but the make check fails: running code in 'd-p-q-r-tests.R' ...make[3]: *** [d-p-q-r-tests.Rout] Error 1 make[3]: Leaving directory `/usr/local/install/R-2.4.1/tests' make[2]: *** [test-Specific] Error 2 make[2]: Leaving directory `/usr/local/install/R-2.4.1/tests' make[1]: *** [test-all-basics] Error 1 make[1]: Leaving directory `/usr/local/install/R-2.4.1/tests' make: *** [check] Error 2 This is Debian, gcc 2.95.4. My previous version R-2.1.0 installed ok. Any idea why this is failing? I have googled the errors, but couldn't find any resolutions Thanks for any help Adam __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Reshape a sparse matrix
Scott == Scott Hyde [EMAIL PROTECTED] on Tue, 15 May 2007 17:03:13 -1000 (HST) writes: Scott Hi, Scott I'd like to reshape a sparse matrix generated from the Matrix package. I can't seem to do it with the command

Scott dim(A) <- c(6, 9)

Scott which works perfectly with the base package matrices, but with the sparse matrices it errors with

Scott Error in dim(A) <- c(6, 9) : dim<- : invalid first argument

This *does* work in the current version of Matrix (0.99875-1), actually already in version 0.99875-0 . In the next version of Matrix, it will not only work, but also work sparsely internally via the new class sparseVector and its daughter classes, on which I've been working during the last 10 days or so... Interesting that you bring the topic up right now ... Scott Manipulating the Dim attribute of the sparse Matrix does not produce the desired effect.

Scott A@Dim <- c(as.integer(9), as.integer(6))

Scott does not produce a column ordering result, which I am assuming is because the data is stored in a row (i) and column (j) format instead (class dgTMatrix) You should not manipulate slots of S4 classes in general. Some people say that you should not even access them directly. Scott Does a function for this exist? yes, as I said above, dim(.) <- .. works in the newest versions of Matrix. Regards, Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
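For illustration, a small sketch of the reshape that Martin says works from Matrix 0.99875-0 onwards (dimensions chosen arbitrarily; assumes the Matrix package is installed):

```r
library(Matrix)

A <- Matrix(0, 9, 6, sparse = TRUE)  # a 9 x 6 sparse matrix
A[1, 1] <- 1

dim(A) <- c(6, 9)   # reshape in place, column-major as for base matrices
dim(A)              # 6 9
```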
Re: [R] S-plus coding
Rolf == Rolf Turner [EMAIL PROTECTED] on Fri, 4 May 2007 08:02:21 -0300 (ADT) writes: Rolf T. Kounouni wrote: Hi, how can i use data to forecast next time period value, if data has been influenced by a change in legislation? thank you. Rolf Well, you could use chicken entrails. Rolf cheers, Rolf Rolf Turner Rolf [EMAIL PROTECTED] Rolf P. S. What has this question got to do with ``S-plus coding''? Yes, indeed. And even then, what would S-plus coding have to do with R-help? Please Mr. Kounouni, do read the posting guide *before* *any* further posting to R-help! Rolf PLEASE do read the posting guide Rolf http://www.R-project.org/posting-guide.html ^^^ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] looking for equivalent of matlab's medfilt1 function
Vladimir == Vladimir Eremeev [EMAIL PROTECTED] on Mon, 7 May 2007 07:58:48 -0700 (PDT) writes: Vladimir Dear all, Vladimir I have several files with Matlab code, which I am translating to R. Vladimir For the zero-level approach, I took the very old Vladimir shell script from R-help archives, which has done Vladimir some obvious leg-work such as replacement of '=' Vladimir with '<-'. Vladimir Now I am translating indexing, matrix operations and function calls using Vladimir this table Vladimir http://37mm.no/mpy/octave-r.html You should also look at the 'matlab' package which defines quite a few R functions, such as eye(), zeros(), repmat(), to behave as the Matlab functions do. Vladimir The problem is, I cannot find the R equivalent of the matlab's function Vladimir medfilt1, performing 1-dimensional median filtering of a vector. Its summary Vladimir is here http://www-ccs.ucsd.edu/matlab/toolbox/signal/medfilt1.html To statisticians, this has been known as running medians, thanks to John Tukey. The smooth() function contains smart variations of running median of 3, which seems to be the matlab default. For 'k > 3', I'd recommend the fast runmed(x, k) function which also has a bit more sophisticated end-point handling than Matlab's medfilt1() seems to provide. Martin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
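A small sketch of the two base-R alternatives Martin mentions (data invented for demonstration):

```r
x <- c(1, 5, 2, 8, 3, 9, 4, 20, 6)

runmed(x, k = 3)   # fast running median of 3; roughly medfilt1(x, 3)
runmed(x, k = 5)   # wider window, with Tukey-style end-point rules
smooth(x)          # smarter variations of running median of 3 (3RS3R etc.)
```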
Re: [R] Issue with the Matrix package
Tony == Tony Chiang [EMAIL PROTECTED] on Fri, 4 May 2007 00:07:04 +0100 writes: Tony Hi all, Tony I am wondering if this is a bug in the Matrix package Tony or if it something that I am just getting wrong... Tony here is an example: [..] It's a bug. A shorter example - w/o dimnames - and showing what the semantic really is for traditional matrices :

> a <- matrix(0, 4, 4); a[c(1,2,1), 2] <- 1:3; a
     [,1] [,2] [,3] [,4]
[1,]    0    3    0    0
[2,]    0    2    0    0
[3,]    0    0    0    0
[4,]    0    0    0    0
> A <- Matrix(0, 4, 4); A[c(1,2,1), 2] <- 1:3; A
4 x 4 sparse Matrix of class "dgCMatrix"
[1,] . 4 . .
[2,] . 2 . .
[3,] . . . .
[4,] . . . .

so we see that multiple assignments are supposed to happen consecutively such that the last wins. Tony The documentation reads: Tony Most of the time, the function works via a traditional (_full_) Tony 'matrix'. However, 'Matrix(0, nrow, ncol)' directly constructs an Tony empty sparseMatrix, as does 'Matrix(FALSE, *)'. Tony So is this where an exception comes in? Tony and if so can someone explain to me why we get the 2? Tony It would seem that it should just reassign the 1 to a Tony 1 not add the number of times it is assigning a 1. If you are interested: Things go via TsparseMatrix, i.e. the triplet representation, and there the convention is the following: an index pair (i_0, j_0) can appear more than once. If it does, it *means* that all the 'x' values are summed up. ?dgTMatrix-class does explain that, too. 
Here's an example - using the auxiliary function I had posted a while ago on R-help:

spMatrix <- function(nrow, ncol, i, j, x)
{
    ## Purpose: User friendly construction of sparse Matrix from triplet
    ## ------------------------------------------------------------------
    ## Arguments: (i,j,x): 2 integer and 1 numeric vector of the same length:
    ##
    ##   The matrix M will have
    ##     M[i[k], j[k]] == x[k] ,  for k = 1,2,..., length(i)
    ##   and M[i', j'] == 0  for all other pairs (i',j')
    ## ------------------------------------------------------------------
    ## Author: Martin Maechler, Date: 8 Jan 2007, 18:46
    dim <- c(as.integer(nrow), as.integer(ncol))
    ## The conformability of (i,j,x) with itself and with 'dim'
    ## will be checked automatically by an internal validObject()
    ## which is part of new(.):
    new("dgTMatrix", x = as.double(x), Dim = dim,
        ## our Tsparse Matrices use 0-based indices :
        i = as.integer(i - 1:1), j = as.integer(j - 1:1))
}

and now use this lower-level construction of Tsparse Matrices:

> (A <- spMatrix(3, 4, i = c(1:3, 1:2), j = c(2:4, 3:4), x = 1:5))
3 x 4 sparse Matrix of class "dgTMatrix"
[1,] . 1 4 .
[2,] . . 2 5
[3,] . . . 3
## ok
> (B <- spMatrix(3, 4, i = c(1:3, 1:2), j = c(2:4, 2:3), x = 1:5))
3 x 4 sparse Matrix of class "dgTMatrix"
[1,] . 5 . .
[2,] . . 7 .
[3,] . . . 3
## oops!
> str(B)
Formal class 'dgTMatrix' [package "Matrix"] with 6 slots
  ..@ i       : int [1:5] 0 1 2 0 1
  ..@ j       : int [1:5] 1 2 3 1 2
  ..@ Dim     : int [1:2] 3 4
  ..@ Dimnames:List of 2
  .. ..$ : NULL
  .. ..$ : NULL
  ..@ x       : num [1:5] 1 2 3 4 5
  ..@ factors : list()

## which shows that you have 5 entries in B, where those with an equal
## index pair are summed -- the convention mentioned above.

Thank you for the report, Tony. This will be fixed in the next release of Matrix. Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Analysis for Binary time series
Megh == Megh Dal [EMAIL PROTECTED] on Fri, 4 May 2007 00:12:25 -0700 (PDT) writes: Megh hi, good morning everyone. I have a time series with Megh binary outputs like: Megh 000100100.etc. Now I want to Megh forecast the future values of that. Can anyone please Megh tell me whether there are any tools in the literature Megh for dealing with this kind of binary observation? If Megh possible please provide me some good references on the net Megh as well. Since you're asking on R-help: The R package VLMC (variable length markov chains) works with such finite-alphabet time series and has been applied to such data successfully in the past. Do

    install.packages("VLMC")
    library(VLMC)
    ?vlmc

It contains the following references: Buhlmann P. and Wyner A. (1998) Variable Length Markov Chains. Annals of Statistics 27, 480--513. Mächler M. and Bühlmann P. (2004) Variable Length Markov Chains: Methodology, Computing, and Software. J. Computational and Graphical Statistics 2, 435--455. -- Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
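A minimal sketch of the suggestion above, assuming the VLMC package is installed; the simulated 0/1 series and the default tuning are my own choices, not from the post:

```r
## install.packages("VLMC")   # once, if not yet installed
library(VLMC)

set.seed(42)
x <- as.integer(runif(500) > 0.5)   # toy binary 0/1 series (made up)

fit <- vlmc(x)   # fit a variable length Markov chain to the series
fit              # prints a short summary of the fitted model
```

For a real application one would of course use the observed series instead of `x`, and look at `?vlmc` for the cutoff/tuning arguments.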
Re: [R] R Wiki down?
Talbot == Talbot Katz [EMAIL PROTECTED] on Thu, 03 May 2007 12:35:27 -0400 writes: Talbot Hi. I can't access the site Talbot http://wiki.r-project.org/. I didn't find any Talbot notice about this on http://www.r-project.org/. Talbot Does anyone have any more information about the R Talbot Wiki status? Thanks! Yes, it is currently down. Thanks for the note. Its maintainer has hereby (CC) been notified as well. Regards, Martin Maechler ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Write text in the
Matt == Matthew Neilson [EMAIL PROTECTED] on Fri, 27 Apr 2007 15:54:20 +0100 writes: Matt Hey Felix, Matt So basically what you want is a figure containing a block of four plots, with a main title for the figure? If that's the case then something like this should work:

    Matt # BEGIN CODE #
    Matt par(oma = c(0,0,1,0), mfrow = c(2,2))
    Matt for(i in 1:4){
    Matt   plot(NA, xlim = range(0,10), ylim = range(-5,5))
    Matt   title(paste("Plot ", i, sep = ""))
    Matt }
    Matt par(mfrow = c(1,1), oma = c(0,0,1,0))
    Matt mtext("Main Title", 3, outer = TRUE, cex = par("cex.main"))
    Matt # END CODE #

Matt You can play about with the margins, but I think that's the general idea. Is this what you're after? Yes, and since this is so often desired, with our sfsmisc package you can simply say

    sfsmisc::mult.fig(4, main = "Main Title")
    for(i in 1:4){
      plot(NA, xlim = range(0,10), ylim = range(-5,5))
      title(paste("Plot ", i, sep = ""))
    }

If you're a good R-citizen, you will want to be able to reset the graphics parameters, which would extend the above to

    op <- sfsmisc::mult.fig(4, main = "Main Title")$old.par
    for(i in 1:4){
      plot(NA, xlim = range(0,10), ylim = range(-5,5))
      title(paste("Plot ", i, sep = ""))
    }
    par(op)

-- Martin Maechler, ETH Zurich Matt On Fri Apr 27 15:34, Felix Wave [EMAIL PROTECTED] sent: Hello, I started a graphic device: par(oma = c(2,0,0,0), mfrow = c(2,2)); in the cols and rows are three images. Now I want to write a text in the device region; it's the main title of the graphic device. But when I use mtext() I can only write in the figure region of my four plots. Has anybody an idea? Thanks a lot.
Felix My R Code: --------------------------------

    par(oma = c(2,0,0,0), mfrow = c(2,2))
    mtext("Main title", side = 3, line = 0)
    image(zDIV)
    image(zMEDIAN)
    image(zMEAN)
    image(zSD)
    dev.off()

graphic: [ASCII sketch omitted: "MAIN TITLE" across the top of the device region, above a 2 x 2 grid of figure regions] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Drawing Tangent
Arun == Arun Kumar Saha [EMAIL PROTECTED] on Thu, 26 Apr 2007 23:44:03 +0530 writes: Arun Dear all R-users, Arun I would like to draw a tangent of a given function at a particular (given) Arun point. However, the straight line representing it should not cut any axis; it Arun should be a small line. Can anyone tell me how to do this? You will eventually call segments() to draw that short line. par("usr") {and maybe e.g. par("pin")} will probably be relevant when determining how long the line segment(s) should be. Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] pacf
tom == tom soyer [EMAIL PROTECTED] on Sat, 28 Apr 2007 08:15:39 -0500 writes: tom I wanted to understand exactly how acf and pacf work, tom so I tried to calculate ac and pac manually. For ac, I tom used the standard acf formula:

    acf(k) = sum((X(t) - Xbar) * (X(t-k) - Xbar)) / sum((X(t) - Xbar)^2)

But for tom pac, I could not figure out how to calculate it by tom hand. I understand that in both R and EVIEWS, it is tom done using the Durbin-Levinson algorithm by the tom computer. However, I don't understand exactly how the tom algorithm works just by looking at the algorithm. Does tom anyone know if there is a shortcut to calculate pac by tom hand (or in a spreadsheet), or is it too complex a tom procedure, such that a computer is absolutely necessary? It tom seems that there should be a natural relationship tom between ac and pac so that once ac is calculated, pac tom can be easily calculated based on ac. easily, yes, by the Durbin-Levinson algorithm ;-) (is this a homework problem?) Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
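For the curious: the Durbin-Levinson recursion is short enough to do "by hand" in a few lines of R. This is a sketch -- the function name and the comparison series are my own, not from the thread -- and it should agree with R's built-in pacf() up to rounding:

```r
## Partial autocorrelations from autocorrelations via Durbin-Levinson.
## rho[k] is the acf at lag k (lag 0, which equals 1, is excluded).
pacfFromAcf <- function(rho) {
  p <- length(rho)
  phi <- matrix(0, p, p)
  phi[1, 1] <- rho[1]
  if (p > 1) for (k in 2:p) {
    num <- rho[k] - sum(phi[k - 1, 1:(k - 1)] * rho[(k - 1):1])
    den <- 1 - sum(phi[k - 1, 1:(k - 1)] * rho[1:(k - 1)])
    phi[k, k] <- num / den                    # the lag-k partial acf
    for (j in 1:(k - 1))                      # update the lower-order coefs
      phi[k, j] <- phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
  }
  diag(phi)    # phi[k,k], k = 1..p
}

## Check against R's own pacf() on a built-in series:
x <- lynx
r <- acf(x, lag.max = 5, plot = FALSE)$acf[-1]   # sample acf, lags 1..5
cbind(durbin.levinson = pacfFromAcf(r),
      pacf.builtin    = drop(pacf(x, lag.max = 5, plot = FALSE)$acf))
```

So the "natural relationship between ac and pac" the poster suspected does exist; it just runs through this recursion rather than a closed-form formula.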
Re: [R] digest readability/ thunderbird email client/ R-help
Hi Sam, SamMc == Sam McClatchie [EMAIL PROTECTED] on Fri, 27 Apr 2007 11:21:56 -0700 writes: SamMc System: Linux kernel 2.6.15, Ubuntu dapper ... SamMc Has anyone figured out how to make the R-help digest SamMc more easily readable in the Thunderbird mail client? SamMc I think the digest used to be more readable than is SamMc currently the case, with all of the messages as SamMc attachments. SamMc I know the work-around is to get individual messages SamMc rather than the digest, use another mail client, or SamMc just look at the archives on the web... {and there are at least two alternatives to the standard archives, notably Gmane. BTW: I just found a URL that should list all R lists carried by them: http://dir.gmane.org/search.php?match=.r.} It has been pointed out more than once that Thunderbird unfortunately does not adhere to the standard(s) for such digests -- too bad it still does not. One alternative is to get the digest as a "plain" digest ((which is, BTW, another standard-complying digest format, one that e.g. emacs Rmail or VM can easily deal with)), which will most probably just appear as one long e-mail in Thunderbird, with a table of contents of all the subject lines, but nothing clickable. Regards, Martin Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Changing working directory
Petr == Petr Klasterecky [EMAIL PROTECTED] on Mon, 23 Apr 2007 14:27:03 +0200 writes: Petr Hi, Petr you seem to have mixed 2 different things: Petr 1) changing a working directory - see ?setwd, ?getwd Petr However, this will NOT load another .Rdata file. Petr 2) loading data - see ?load and ?save, ?save.image - loading a new data Petr image will erase all currently stored objects. Huuuh??? Not if you use the standard R function load()! Nothing is erased. If you load objects with the same names as objects in your workspace, then and only then are those in the workspace replaced by the newly loaded ones. For that reason, attach("some_file.rda") is sometimes preferred. But, as Uwe Ligges has already said: Working with .Rdata files is not recommended: You should work with script files (foo.R), source() and possibly even your own packages, and only save() {and load()/attach()} things that are expensive to rebuild. And for those, I strongly recommend using a filename different from ".Rdata". Martin Maechler, ETH Zurich Petr Petr Petr Walter Paczkowski napsal(a): Good morning, I keep copies of my .RData file in different directories for different projects on Windows XP. There is an icon on my desktop for each project, so all I have to do is click on the icon to open R for a specific project, i.e. a specific .RData file. How do I change to another .RData file from within R without first closing R? Thanks, Walt Paczkowski __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
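To see concretely that load() does not erase anything, here is a small self-contained sketch (the temp file and object names are mine, for illustration only):

```r
tmp <- tempfile(fileext = ".rda")   # a throw-away .rda file

a <- 1
b <- "keep me"
save(a, file = tmp)   # save only 'a'

a <- 999              # change 'a' locally
load(tmp)             # restores a == 1; 'b' is completely untouched

c(a = a, b = b)
```

Only the object whose name collides (`a`) is replaced by the loaded copy; everything else in the workspace survives, which is exactly the point made above.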
Re: [R] Problem with length of array while calling C from R
Sven == Sven Knüppel [EMAIL PROTECTED] on Tue, 24 Apr 2007 13:53:09 +0200 writes: Sven Hello, Sven my problem is that I don't know how long an array of double must be while calling C from R.

    Sven R code:
    Sven   array <- c(1,1,1)
    Sven   save <- .C("Ctest", a = array)

    Sven C code:
    Sven   void Ctest(double *array) {
    Sven     ...
    Sven     array = (double*) realloc(array, new_number * sizeof(double));
    Sven     ...
    Sven   }

Sven The length of the array will be computed in C. Sven At the end, save$a has a length of 3 and not the length of the array allocated in C. Sven What can I do? Either you learn to use .Call(), where you pass whole R objects and can use length(.) on them in your C code, or -- simpler in this case, but much less powerful in general -- you change your R code to

    ss <- .C("Ctest", a = myarray, n = length(myarray))

and the C code to

    void Ctest(double *array, int *n) { ...... }

and then make use of *n inside the C code. Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Do *NOT* repost ! {Re: about R square value}
Dear Nitish, Please do *NOT* resend your message several times to the R-help mailing list. This is considered very impolite. Nitish == Nitish Kumar Mishra [EMAIL PROTECTED] on Sun, 22 Apr 2007 16:39:03 +0530 (IST) writes: Nitish Hi, I am simply asking about the coefficient of Nitish determination (R square): is a value of more than 1 Nitish also possible? Because it ranges from 0-1. So I Nitish want to know whether R square may be more than one. If Nitish yes, what is its interpretation? Thanks to all of Nitish you (R help group). [] Nitish PLEASE do read the posting guide Nitish http://www.R-project.org/posting-guide.html and Nitish provide commented, minimal, self-contained, Nitish reproducible code. Please, DEFINITELY do read the posting guide before sending another message to R-help. Martin Maechler, ETH Zurich R mailing lists maintainer __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Filtering
Soare == Soare Marcian-Alin [EMAIL PROTECTED] on Fri, 20 Apr 2007 01:32:30 +0200 writes: Soare Hello Everybody, Soare How can I filter a dataset? Have you tried help(filter)? filter() is a standard function, and its help page mentions others. Further note that statisticians often talk of "smoothing" where engineers talk of "filtering". Soare I'm trying to filter the dataset EuStockMarkets monthly and quarterly. Soare data(EuStockMarkets) Soare ftse = EuStockMarkets[,4] Soare Alin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Partitioning around mediods (PAM)
You've asked the almost identical question yesterday. {that *de*creases the probability of getting help, in some cases!} If you take a little effort and follow the posting guide (keywords: reproducible example; not using HTML in e-mails), I (and many others) will gladly help you further. Martin Maechler, ETH Zurich nathaniel == nathaniel Grey [EMAIL PROTECTED] on Fri, 20 Apr 2007 17:09:09 +0200 (CEST) writes: nathaniel Hi, nathaniel I need some help understanding the output from PAM. When I look at the output it doesn't list the cluster number by the medoid values on each of the variables (like it does with k-means). Instead I have the following: nathaniel So I know for instance cluster 1 has a mean for variable1 of 33.33, however when I run PAM I get: nathaniel variable1 variable2 nathaniel 293212 nathaniel 9712 9 nathaniel 308 106 8 nathaniel 217 62 2 nathaniel Does 29 relate to cluster 1, and 97 to cluster 2, etc.? nathaniel Hope this makes sense, nathaniel Nathaniel nathaniel ___ nathaniel [[alternative HTML version deleted]] nathaniel __ nathaniel R-help@stat.math.ethz.ch mailing list nathaniel https://stat.ethz.ch/mailman/listinfo/r-help nathaniel PLEASE do read the posting guide http://www.R-project.org/posting-guide.html nathaniel and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] colSum() in Matrix objects
Hi Jose, Jose I'd like to simply add column-wise using Matrix objects (Csparse). Jose It looks like one can apply mostly any base function to these objects Jose (i.e., apply, colSums), but there is a nasty conversion to traditional Jose matrix objects if one does that. not in this case, see below. Jose Is there any workaround? I can see colSum listed in the help for Class colSums (final 's'!) Jose 'CsparseMatrix', but I wonder whether I'm using the default colSums() or Jose the one specific to CsparseMatrix...

    # example
    (z = Matrix(c(0,1,0,0), 10,10))
    zr = rowSums(z)
    class(zr)  # numeric; I'd like it to be a CsparseMatrix object

    selectMethod(colSums, class(z))  ## or
    showMethods(colSums)

both show you that you are using the class-specific one. However, why do you assume that colSums() should not return a numeric vector? From the idea that colSums() and rowSums() should be fast versions of apply(., marg, sum), it must return a numeric vector, as it also does for traditional matrices. Are your objects so huge that even a 1-row {or 1-column} sparse matrix would save a lot? Regards, Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
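A small sketch of the point above, plus one possible workaround of my own (not from the post) if a Matrix-classed result really is wanted:

```r
library(Matrix)

z  <- Matrix(c(0, 1, 0, 0), 10, 10)  # mostly zeros, so stored sparse
cs <- colSums(z)                     # the sparse-specific method ...
class(cs)                            # ... still returns a plain numeric,
                                     # just as for a traditional matrix

## If a *Matrix* result were really wanted, matrix multiplication by a
## row of ones stays in Matrix-land (my own workaround, not from the post):
rowVec <- Matrix(1, 1, nrow(z)) %*% z   # a 1 x 10 Matrix of column sums
stopifnot(all(as.numeric(rowVec) == cs))
```

Whether that is worth it depends, as Martin asks, on whether the objects are so huge that even a one-row representation matters.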
Re: [R] nls.control( ) has no influence on nls( ) !
Yuchen == Yuchen Luo [EMAIL PROTECTED] on Sun, 15 Apr 2007 12:18:23 -0700 writes: Yuchen Dear Friends. Yuchen I tried to use nls.control() to change the 'minFactor' in nls(), but it Yuchen does not seem to work. yes, you do not seem to use it correctly. No reason to jump to the conclusion that you give in the subject of this posting ... which is hilariously wrong! You don't really think that quality software like R would have had nls() and nls.control(), and that nls.control() would never have worked, for all those years and tens of thousands of users??? Please come to your senses, and first read the posting guide (*) - and then try again, so we can help you further! Regards, Martin Maechler, ETH Zurich (*) Link at the bottom of every R-help message -- and hence cited below Yuchen I used the nls() function and encountered the error message "step factor Yuchen 0.000488281 reduced below 'minFactor' of 0.000976563". I then tried the Yuchen following: Yuchen 1) Put nls.control(minFactor = 1/(4096*128)) inside the brackets of nls, Yuchen but the same error message shows up. Yuchen 2) Put nls.control(minFactor = 1/(4096*128)) as a separate command before Yuchen the command that uses the nls() function; again, the same thing happens, Yuchen although R responds to the nls.control() call immediately:

    Yuchen $maxiter
    Yuchen [1] 50
    Yuchen $tol
    Yuchen [1] 1e-05
    Yuchen $minFactor
    Yuchen [1] 1.907349e-06

Yuchen I am wondering how I may change the minFactor to a smaller value? The manual Yuchen that comes with the R software about nls() is very sketchy --- the only Yuchen relevant example I see is a separate command like 2). Yuchen A more relevant question might be: is lowering the 'minFactor' the only Yuchen solution to the problem? What are the other options?
Yuchen Best Wishes Yuchen Yuchen Luo Yuchen [[alternative HTML version deleted]] Yuchen __ Yuchen R-help@stat.math.ethz.ch mailing list Yuchen https://stat.ethz.ch/mailman/listinfo/r-help Yuchen PLEASE do read the posting guide http://www.R-project.org/posting-guide.html Yuchen and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
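For the record, the way to make the setting take effect is to hand the list built by nls.control() to nls() via its 'control' argument; calling nls.control() on its own only *builds* and prints the list, it changes nothing. A minimal sketch with made-up data:

```r
set.seed(1)
x <- 1:20
y <- 3 * exp(-0.2 * x) + rnorm(20, sd = 0.02)   # toy exponential decay

## nls.control() must be passed via 'control = ...', not run separately:
fit <- nls(y ~ a * exp(-b * x),
           start   = list(a = 1, b = 0.1),
           control = nls.control(minFactor = 1/(4096 * 128)))
coef(fit)   # close to the true (a, b) = (3, 0.2)
```

This particular toy problem converges without the smaller minFactor; the point is only the call syntax.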
[R] Netiquette etc {was nls.control() has no influence on nls()!}
Joerg == Joerg van den Hoff [EMAIL PROTECTED] on Mon, 16 Apr 2007 11:15:23 +0200 writes: Joerg On Mon, Apr 16, 2007 at 09:03:27AM +0200, Martin Maechler wrote: Yuchen == Yuchen Luo [EMAIL PROTECTED] [] [] Joerg a final remark (off-topic?), concerning the response Joerg of m. maechler (but I notice this attitude Joerg continuously on the list): this is supposed to be a Joerg help list, neither a boot camp nor a thought-police Joerg school, right? In general, Joerg, you are entirely right, and I do typically behave like that. Now, since you responded in public, I do too: What I did not like at all about Yuchen's posting was its subject, which claims obvious misfunctioning, using a '!', not a '?', to make the point. The subject of an e-mail is the first and typically most-read part, and I'd like users to be a bit more polite there. {see also below} Joerg concerning the question at hand, the Joerg one-word response Joerg `?nls' Joerg would have been justifiable (it usually _is_ in the Joerg manpage), if not overly helpful, probably. but the Joerg actual response is worded such that -- would it be Joerg directed at me -- I would judge it simply offensive: Joerg "come to your senses" is not what one should be told, Joerg simply because one did not grasp how to use the Joerg `control' argument and/or because in the subject Joerg (contrary to the body) of the mail 'has no influence' Joerg is used instead of 'seems to have no influence'. Joerg I really do think that if someone deems a certain Joerg question really stupid, it would be wise simply to Joerg keep 'radio silence' (i.e. ignore it), which probably Joerg is the best corrective anyway. often the most sensible, but then the users don't know why the silence is resounding... Joerg the 'how dare you' responses are annoying, to say the least. Joerg for heaven's sake... well.. I am (often) glad to help users who ask questions about their problems, but sometimes *we* feel a bit annoyed by users shouting (that's what '!' in a subject typically means) at the public and denouncing misfunctioning of something we've put quite a bit of work into. Martin __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Getting and using a function
Thomas == Thomas L Jones [EMAIL PROTECTED] on Thu, 12 Apr 2007 22:28:33 -0500 writes: Thomas I am trying to do what is perhaps the most basic procedure which can be done Thomas with the R software. Thomas Under Windows XP Home Edition, I want to get a copy of the function gam, Thomas then put it in and use it. I intentionally use informal terms, rather than Thomas technical terms whose exact meaning I might or might not know. Thomas I am finding this extremely frustrating. Every time Thomas I try to do anything, all I get is error Thomas messages. So, have you really read (and tried to understand) the "Introduction to R" that's available on the web and comes built into R? Thomas only three or four or five steps involved. Well, Thomas what are they? There is all this opaque Thomas terminology. There are libraries, and packages, and Thomas one downloads them, loads them, and installs them, Thomas but just what all this means is unclear. Thomas One example among many: I tell it Thomas library(gam) Thomas and all I get is an error message: Thomas Error in library(gam) : there is no package called 'gam' Thomas Well, does this mean what it says, or does it mean something different? For Thomas example, does it mean that such-and-such computer program has not yet been Thomas downloaded? Well, it means that for your version of R there is no package called 'gam' available, i.e. what it says -- or do you expect software to tell you something about the world at large, namely that there is no R package named 'gam' in the whole wide world? :-) [ BTW, for a more modern version of gam(), there's the recommended package 'mgcv' - which you don't have to install: library(mgcv); help(gam) ] Thomas I did download and install the 2.4.1 flavor Thomas (version?) of the gui. I infer, reading between the Thomas lines a bit, that there may be a sort of standard Thomas procedure for setting things up, perhaps downloading Thomas and installing the utils package, or something. Thomas The R software has much gold in it, but, as far as Thomas learnability/usability is concerned, I give it poor marks. Well, it is recommended that you learn it from a human rather than from itself. One way to learn from humans is by reading books and manuals, and for R there are quite a few, both freely available with R and on the web, and others in a bookstore near you ... A more comfortable way for some is to attend a course / class... I hope you can bear the somewhat flippant tone of my answer -- I'm just trying to counter your own humorous one ;-) :-) Regards, Martin Thomas Tom Jones Thomas DrJones at alum.MIT.edu Thomas __ Thomas [EMAIL PROTECTED] mailing list Thomas https://stat.ethz.ch/mailman/listinfo/r-help Thomas PLEASE do read the posting guide http://www.R-project.org/posting-guide.html that also may be a quite useful resource to read ... __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] median polishing
Sorn == Sorn Norng [EMAIL PROTECTED] on Tue, 10 Apr 2007 16:11:38 +1000 writes: Sorn Hi, In SPlus there is a function called twoway for Sorn median polishing gridded data. Is there an equivalent Sorn function in R? yes Sorn I have been searching for it in R help Sorn without much success. hmm... funny.. For me, help.search("median polish") pretty quickly gives the hint you need: medpolish() Sorn Your help is much appreciated. you're welcome. Martin Sorn [[alternative HTML version deleted]] Please do read the posting guide -- particularly about HTML-ified e-mails. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
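A quick taste of what that help.search() hit does -- medpolish() on a tiny, made-up, exactly additive table (so the residuals vanish):

```r
## A 3 x 3 table constructed as row effect + column effect (toy data):
m <- rbind(c(14, 13, 12),
           c(10,  9,  8),
           c( 7,  6,  5))

fit <- medpolish(m, trace.iter = FALSE)   # Tukey's median polish

fit$overall    # grand effect
fit$row        # row effects
fit$col        # column effects
fit$residuals  # all (near) zero for this additive table
```

On real gridded data the residuals are of course the interesting part; this just shows the decomposition medpolish() returns.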
Re: [R] data encapsulation with classes
BDR == Prof Brian Ripley [EMAIL PROTECTED] on Mon, 9 Apr 2007 08:24:32 +0100 (BST) writes: BDR On Sun, 8 Apr 2007, hadley wickham wrote: On 4/8/07, Roger Bivand [EMAIL PROTECTED] wrote: On Sun, 8 Apr 2007, Peter Williams wrote: Hi All, I'm new to R from a C and Octave/Matlab background. I am trying to construct some classes in R to which I want to attach pieces of data. First, is attr(obj, 'member name') <- data the accepted way of doing this? No, it isn't. You seem to be trying to deduce new-style classes from a representation used before R 2.4, BDR (actually, still used) but in any case it would not be sensible. Please consult John M. Chambers, Programming with Data, Springer, New York, 1998, and/or William N. Venables and Brian D. Ripley, S Programming, Springer, New York, 2000, or for a shorter online resource: http://www.stat.auckland.ac.nz/S-Workshop/Gentleman/Methods.pdf Unfortunately, all of those references are at least 4 years out of date when it comes to S4 methods. Is there any comprehensive reference of the current implementation of the S4 OO system apart from the source code? ?Methods {i.e. help(Methods)} is not about the *implementation* but still explains quite a bit about the method dispatch part of S4. It has had a URL for the "How S4 Methods Work" technical report by John Chambers; I've now also added a link to that from http://developer.R-project.org/ (will only be active in ~ a day). Martin BDR Not that I know of, and it is a moving target. (E.g. I BDR asked recently about some anomalies in the S4 bit BDR introduced for 2.4.0 and what the intended semantics BDR are.) I've said before that I believe we can only help BDR solve some of the efficiency issues with S4 if we have BDR a technical manual. BDR It is unfair to pick out S4 here, but the 'R Internals' BDR manual is an attempt to document important BDR implementation details (mainly by studying the code), BDR and that has only got most of the way through BDR src/main/*.c. BDR -- Brian D. Ripley, [EMAIL PROTECTED] Professor of BDR Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ BDR University of Oxford, Tel: +44 1865 272861 (self) 1 BDR South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, BDR UK Fax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Computing the rank of a matrix.
Ravi == Ravi Varadhan [EMAIL PROTECTED] on Fri, 6 Apr 2007 12:44:33 -0400 writes: Ravi Hi, qr(A)$rank will work, but just be wary of the Ravi tolerance parameter (default is 1e-07), since the Ravi rank computation could be sensitive to the tolerance Ravi chosen. Yes, indeed. The point is that rank(<Matrix>) is well defined in pure math (linear algebra), just as a "singular matrix" is. The same typically no longer makes sense as soon as you enter the real world: a matrix close to singular may have to be treated as if singular, depending on how close to singular it is {{ learn about the "condition number" of a matrix }}, and the same issues arise with rank(<matrix>). Of course, the matlab programmers know all this (and much more), and indeed, matlab's rank(A) really is rank(A, tol = tol.default(A)) and is based on the SVD instead of the QR decomposition, since the former is said to be more reliable (even though slightly slower). R's equivalent (with quite a bit of fool-proofing) would be the following function (assuming correct online documentation of matlab):

    rankMat <- function(A, tol = NULL, singValA = svd(A, 0, 0)$d)
    {
      ## Purpose: rank of a matrix ``as Matlab''
      ## ----------------------------------------------------------------
      ## Arguments: A: a numerical matrix, maybe non-square
      ##            tol: numerical tolerance (compared to singular values)
      ##            singValA: vector of non-increasing singular values of A
      ##                      (pass as argument if already known)
      ## ----------------------------------------------------------------
      ## Author: Martin Maechler, Date: 7 Apr 2007, 16:16
      d <- dim(A)
      stopifnot(length(d) == 2, length(singValA) == min(d),
                diff(singValA) <= 0)  # must be sorted decreasingly
      if(is.null(tol))
          tol <- max(d) * .Machine$double.eps * abs(singValA[1])
      else stopifnot(is.numeric(tol), tol >= 0)
      sum(singValA >= tol)
    }

A small-scale simulation with random matrices, i.e., things like

    ## ranks of random matrices; here will be 5 all the time:
    table(replicate(1000, rankMat(matrix(rnorm(5*12), 5, 12))))  # ~ 1 sec.
indicates that qr(.)$rank typically gives the same, where I assume one should really use

    qr(., tol = .Machine$double.eps, LAPACK = TRUE)$rank

to be closer to Matlab's default tolerance. Ok, who has time to investigate further? Research exercise: 1) Is there a fixed number, say t0 <- 1e-15, for which qr(A, tol = t0, LAPACK = TRUE)$rank is ``optimally close'' to rankMat(A)? 2) How easily do you get cases showing svd(.) to be more reliable than qr(., LAPACK = TRUE)? To solve this in an interesting way, you should probably investigate classes of almost rank-deficient matrices, and I'd also be interested whether you ever get random matrices A with rank(A) < min(dim(A)) - 1 (unless you construct some columns/rows exactly from earlier ones, or use all-0 ones). Martin Maechler, ETH Zurich Ravi Ravi. Ravi - Ravi Ravi Varadhan, Ph.D. Ravi Assistant Professor, The Center on Aging and Health ... Ravi http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html Ravi How about qr(A)$rank or perhaps qr(A, LAPACK=TRUE)$rank Cheers, Andy Hi! Maybe this is a silly question, but I need the column rank (http://en.wikipedia.org/wiki/Rank_matrix) of a matrix, and R's function 'rank()' only gives me the ordering of the elements of my matrix. How can I compute the column rank of a matrix? Is there not an R equivalent to Matlab's 'rank()'? I've been browsing for a time now and I can't find anything, so any help will be greatly appreciated. Best regards! -- -- Jose Luis Aznarte M. http://decsai.ugr.es/~jlaznarte Department of Computer Science and Artificial Intelligence Universidad de Granada Tel. +34 - 958 - 24 04 67 GRANADA (Spain) Fax: +34 - 958 - 24 00 79 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
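A small check of the rankMat() idea on a matrix that is rank-deficient by construction (the rank-4 example is my own; the compact rankMat below repeats the essentials of Martin's function so the snippet runs standalone):

```r
## rankMat, compact version of the function given above (standalone copy):
rankMat <- function(A, tol = NULL, singValA = svd(A, 0, 0)$d) {
  d <- dim(A)
  if (is.null(tol))
    tol <- max(d) * .Machine$double.eps * abs(singValA[1])
  sum(singValA >= tol)   # count singular values above the tolerance
}

set.seed(7)
A <- matrix(rnorm(5 * 12), 5, 12)
A[3, ] <- A[1, ] + A[2, ]   # row 3 is a linear combination: rank drops to 4

c(rankMat = rankMat(A), qr = qr(A)$rank)   # both should report 4 here
```

On such a cleanly deficient matrix the SVD-based and QR-based counts agree; the interesting (research-exercise) cases are the *almost* rank-deficient ones.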
Re: [R] Using split() several times in a row?
SteT == Stephen Tucker [EMAIL PROTECTED] on Fri, 30 Mar 2007 18:41:39 -0700 (PDT) writes: [..] SteT For dates, I usually store them as POSIXct classes SteT in data frames, but according to Gabor Grothendieck SteT and Thomas Petzoldt's R Help Desk article SteT http://cran.r-project.org/doc/Rnews/Rnews_2004-1.pdf, SteT I should probably be using chron dates and times... I don't think you should (and I doubt Gabor and Thomas would recommend this in every case): 'POSIXct' (and 'POSIXlt', 'POSIXt', 'Date') are part of standard R, and whereas they may not seem as convenient in all cases as chron etc., I'd rather recommend sticking to them in such a case. SteT Nonetheless, POSIXct classes are what I know, so I can SteT show you that to get the month out of your column SteT (replace "8.29.97" with your variable), you can do the SteT following:

    SteT month = format(strptime("8.29.97", format = "%m.%d.%y"), format = "%m")

SteT Or,

    SteT month = as.data.frame(strsplit("8.29.97", "\\."))[1,]

[..etc..very useful..advice] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] CLUSTER Package
It seems nobody else was willing to help here (when the original poster did not at all follow the posting guide). In the meantime, someone else has asked me about part of this, so let me answer in public: MM == Martin Maechler [EMAIL PROTECTED] on Mon, 12 Mar 2007 17:23:30 +0100 writes: MM Hi Vallejo, I'm pretty busy currently, and feel your MM question has much more to do with how to use R more MM generally than with using the functions from the cluster MM package. MM So you may get help from other R-help readers, but maybe MM only after you have followed the posting guide and give MM a reproducible example, as you're asked there. MM Regards, Martin Maechler VallejoR == Vallejo, Roger [EMAIL PROTECTED] on Mon, 12 Mar 2007 10:28:01 -0400 writes: VallejoR Hi Martin, In using the Cluster Package, I have VallejoR results for PAM and DIANA clustering algorithms VallejoR (below, the 'part' and 'hier' objects):

    VallejoR part <- pam(trout, bestk)    # PAM results
    VallejoR hier <- diana(trout)         # DIANA results
    VallejoR GeneNames <- show(RG$genes)  # Gene Names are in this object (RG is what?)

VallejoR But I would also like to know what genes (NAMES) VallejoR are included in each cluster. I tried VallejoR unsuccessfully to send these results to output VallejoR files (clusters with gene Names). This must be an VallejoR easy task for a good R programmer. I will VallejoR appreciate very much directions or R code on how VallejoR to send the PAM and DIANA results to output files, VallejoR including information on genes (Names) per each VallejoR cluster. For diana(), a *hierarchical* clustering {as agnes()}, you need to decide about the number of clusters yourself.
Then, as the example in help(diana.object) shows, you can use cutree() to get the grouping vector. Here's a reproducible example:
library(cluster)
data(votes.repub)
dv <- diana(votes.repub, metric = "manhattan", stand = TRUE)
print(dv)
plot(dv)
## Cut into 2 groups:
dv2 <- cutree(as.hclust(dv), k = 2)
table(dv2) # 8 and 42 group members
rownames(votes.repub)[dv2 == 1]
## For two groups, does the metric matter ?
dv0 <- diana(votes.repub, stand = TRUE) # default: Euclidean
dv.2 <- cutree(as.hclust(dv0), k = 2)
table(dv2 == dv.2) ## identical group assignments
For pam(), it's even simpler:
data(ruspini)
pr <- pam(ruspini, 4)
plot(pr) # Hit <Return> to see next plot:
str(pr) ## or summary(pr)
## .. shows you that there's a component 'clustering':
pr$clustering
## a grouping vector with case labels {your gene names}; here 1,2,..,75:
## and to get them ``visually'':
split(rownames(ruspini), pr$clustering)
## $`1`
## [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
## [16] 16 17 18 19 20
## $`2`
## [1] 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35
## [16] 36 37 38 39 40 41 42 43
## $`3`
## [1] 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58
## [16] 59 60
## $`4`
## [1] 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75
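One part of Roger's question not covered in the reply above is writing the groupings to output files. A hedged sketch using the same ruspini data (for gene data, replace rownames(ruspini) with your vector of gene names):

```r
## Sketch: write each observation's cluster assignment to a CSV file
library(cluster)
data(ruspini)
pr <- pam(ruspini, 4)
out <- data.frame(name = rownames(ruspini), cluster = pr$clustering)
write.csv(out, file = "pam_clusters.csv", row.names = FALSE)

## For a hierarchical fit such as diana(), cut into k groups first:
dr <- diana(ruspini)
out2 <- data.frame(name = rownames(ruspini),
                   cluster = cutree(as.hclust(dr), k = 4))
write.csv(out2, file = "diana_clusters.csv", row.names = FALSE)
```

The file names here are illustrative; split(out$name, out$cluster) gives the same grouping on screen without writing a file.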
Re: [R] Implementing trees in R
Bálint == Bálint Czúcz [EMAIL PROTECTED] on Wed, 21 Mar 2007 19:54:28 +0100 writes:
Bálint for an example of how this idea is implemented in a working package, have a look at the help page of the BinaryTree class in package party. The @tree node of such an object is a recursive list of this kind.
and the informal (i.e. S3) class "dendrogram" in the 'stats' package, i.e. part of every R, is of that kind as well {to be used for cluster dendrograms; it has attributes for nodes and edges, etc}. It features a print(), a (too?!) flexible plot(), a nice str() method, and more:
methods(class = "dendrogram")
[1] cophenetic.dendrogram* cut.dendrogram*        [[.dendrogram*
[4] labels.dendrogram*     plot.dendrogram*       print.dendrogram*
[7] reorder.dendrogram*    rev.dendrogram*        str.dendrogram*
Type example(dendrogram) to get a first impression.
Martin Maechler, ETH Zurich
Bálint On 3/16/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: Let me rephrase that. Lists do not support references, but they could be used to represent trees.
list(a = 1, b = list(2, 3, d = list(4, 5)), c = list(4, 5))
is a tree whose top nodes are a, b, c; b contains three nodes 2, 3 and d, and d contains 2 nodes. However, if you want to do it via references as requested, then lists are not appropriate.
On 3/16/07, Gabor Grothendieck [EMAIL PROTECTED] wrote: Lists are not good for this. There is an example in section 3.3 of the proto vignette of using proto objects for this. That section also references an S4 example, although it's pretty messy with S4. You might want to look at the graph, RBGL and Rgraphviz packages in Bioconductor and the dynamicGraph, mathgraph and sna packages on CRAN.
On 3/16/07, Yuk Lap Yip (Kevin) [EMAIL PROTECTED] wrote: Hi all, I am rather new to R. Recently I have been trying to implement some tree algorithms in R. I used lists to model tree nodes.
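To see the recursive-list structure Martin describes, a small sketch (any tiny data set will do; USArrests ships with R):

```r
## A "dendrogram" is a nested list whose nodes carry attributes
hc   <- hclust(dist(USArrests[1:5, ]))
dend <- as.dendrogram(hc)
str(dend)        # nested list; 'height', 'members', 'leaf' as attributes
plot(dend)
str(dend[[1]])   # each branch is itself a dendrogram
```

This is the same "recursive list with attributes" design that party's BinaryTree uses for its @tree slot.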
I thought something like this would work:
parent <- list(); child <- list();
parent$child1 <- child;
child$parent <- parent;
When I tried to check whether a node is its parent's first child using if (node$parent$child1 == node), it always returned false. Then I realized that it does not really work, because parent$child1 <- child actually makes a copy of child instead of referencing it. I think one possible fix is to keep a list of node objects, and make references using the positions in the list. For example, I think the following would work:
parent <- list(); child <- list();
nodes <- list(parent, child);
parent$child1 <- 2;
child$parent <- 1;
Then the first-child test can be rewritten as if (nodes[[nodes[[nodeId]]$parent]]$child1 == nodeId). However, I would prefer not to implement trees in this way, as it requires the inconvenient and error-prone manipulation of node IDs. May I know if there is a way to make object references to lists? Or are there other ways to implement tree data structures in R? BTW, I checked how hclust was implemented, and noticed that it calls an external Fortran program. I would want a solution not involving any external programs. Thanks. -- God bless. Kevin
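One alternative the thread does not pursue: R environments have reference semantics, so they can represent tree nodes with parent pointers without the copy-on-assign problem Kevin ran into (a sketch, not from the original replies):

```r
## Tree nodes as environments: assignment stores a reference, not a copy
new_node <- function() new.env(parent = emptyenv())

parent <- new_node()
child  <- new_node()
parent$child1 <- child   # stores a reference to 'child', no copy made
child$parent  <- parent

## identical() compares environments by reference, so the
## "first child" test that failed with lists now works:
is_first_child <- function(node) identical(node$parent$child1, node)
is_first_child(child)    # TRUE
```

The same idea underlies proto objects (as Gabor notes) and, in later R versions, Reference Classes.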
Re: [R] Ticks and labels on plots
Marc == Marc Schwartz [EMAIL PROTECTED] on Wed, 21 Mar 2007 14:23:12 -0500 writes:
Marc On Wed, 2007-03-21 at 14:40 -0400, Mike Prager wrote: Marc Schwartz [EMAIL PROTECTED] wrote: On Tue, 2007-03-20 at 18:04 -0400, Michael H. Prager wrote:
I am generating stacked barplots of the age composition of fish populations (Y) over time (X). As there are many years, not every bar is labeled. When looking at the plot, it becomes difficult to associate labels with their bars. We have improved this a bit by using axis() to add a tick mark below each bar. Can anyone suggest a way to draw ticks ONLY at bars where a tick label is drawn? Or to make such ticks longer than those where there is no label? This is going into a function, so I'm hoping for a method that doesn't require looking at the plot first.
# sample code (simplified) #
mp <- barplot(t(N.age), xlab = "Year", axisnames = FALSE)
axis(side = 1, at = mp, labels = rownames(N.age), tcl = -0.75)
Thanks! Mike Prager NOAA, Beaufort, NC
Mike, How about something like this:
mp <- barplot(1:50, axisnames = FALSE)
# Create short tick marks at each bar
axis(1, at = mp, labels = rep("", 50), tcl = -0.25)
# Create longer tick marks every 5 years with labels
axis(1, at = mp[seq(1, 50, 5)], labels = 1900 + seq(0, 45, 5), tcl = -0.75, las = 2, cex.axis = 0.75)
Just pick which labels you want shown (e.g. every 5 years) and synchronize those values with the 'at' argument in axis(). HTH, Marc Schwartz
Thanks, Marc, for this solution, and thanks equally to Jim Lemon for a similar idea. This seems promising. Since this is to go into a function (and should work without intervention), I'll need to devise an algorithm to decide at what interval the labels should be plotted. Clearly axis() has such an algorithm. Unfortunately, it reports its result only by placing the labels.
Mike
Marc Mike,
Marc To get a feel for how axis() creates the default tick positions when 'at' is the default NULL, see ?axTicks, which provides functionality similar to the internal C routine.
yes, partly not only similar but the same when it works (by default) with par("xaxp")
Marc You could also look at ?pretty
Yes. *However* there's one important thing which I think hasn't been mentioned yet. We have now been talking about how and where axis() {i.e. its internal code} chooses to place tick marks. The clue is that it draws labels at all tick marks, by default or explicitly with axis(*, at = ., labels = .), BUT the labels are not shown on your plot as soon as they would get too much crammed together. You can see this nicely, interactively, by the following: Use
graphics.off()
plot(1:11)
This will show ticks *and* labels at 2, 4, 6, 8, 10. Now use your mouse to drag the graphics window narrower (e.g. keeping height constant), in small steps, releasing the mouse again to let R redraw the graphic. For quite a while the labels remain; when there's not enough room, the 5 ticks remain, but only the labels 2, 4, 6, 8 are drawn; then 2, 6, 10; then 2, 6; then 2, 8 (a little surprise to me); then 2. You always see all ticks, but labels are only drawn when they don't get in each other's way. Of course, things like
plot(1:11, xaxt = "n")
axis(1, at = 1:11, labels = paste("Lab", 1:11))
show even more when the window is widened or narrowed, and yes, it depends on the device and on the fonts and their sizes, see e.g.,
plot(1:11, xaxt = "n")
axis(1, at = 1:11, labels = paste("Lab", 1:11), cex.axis = 0.5)
- -
If you don't like this --- almost always very desirable --- builtin smartness, you need to descend to a slightly lower level and use mtext(), e.g., the analogue of the above is
plot(1:11, xaxt = "n")
mtext(side = 1, at = 1:11, text = paste("Lab", 1:11), cex = 0.5)
or rather leave traditional graphics which really gets messy in such cases {par() ..
etc} and upgrade to using the grid package, or packages built on grid: lattice or ggplot. But in fact, someone else needs to tell you how to play similar games with these. Martin Maechler, ETH Zurich
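For Mike's original request --- a function that decides the label interval by itself --- one hedged sketch is to let pretty() pick "nice" year positions (as Marc suggested looking at) and label only the bars at those years:

```r
## Sketch: short ticks at every bar, longer labelled ticks only at
## positions chosen automatically by pretty()
years <- 1900:1949
mp <- barplot(seq_along(years), axisnames = FALSE)

lab.at <- pretty(years)                 # "nice" candidate years
lab.at <- lab.at[lab.at %in% years]     # keep only years actually present
i <- match(lab.at, years)               # bar indices to label

axis(1, at = mp,    labels = FALSE,  tcl = -0.25)  # short tick at each bar
axis(1, at = mp[i], labels = lab.at, tcl = -0.75,  # long, labelled ticks
     las = 2, cex.axis = 0.75)
```

Since pretty() adapts its output to the range of the data, this needs no manual choice of interval when wrapped in a function.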
Re: [R] problem with RCurl install on Unix
Steven == Steven McKinney [EMAIL PROTECTED] on Wed, 21 Mar 2007 19:29:43 -0700 writes:
Steven I get the same problem, and haven't figured it out yet.
Steven Is it a 32bit/64bit clash?
Well, I can install and run RCurl on both 32-bit and 64-bit (Redhat / FC6 Linux; with own compilers, extra libs, ...).
Steven (Similarly, I don't have RMySQL up and running cleanly.)
library(RCurl)
Steven Error in dyn.load(x, as.logical(local), as.logical(now)) :
Steven unable to load shared library '/share/apps/R/R-2.4.1/library/RCurl/libs/RCurl.so':
Steven libcurl.so.4: cannot open shared object file: No such file or directory
Steven Error: package/namespace load failed for 'RCurl'
You might need to set LD_LIBRARY_PATH correctly before starting R -- typically it would be set the same as it was when RCurl was installed (which includes a 'configure'!) on your system. Or RCurl's configure is not quite robust enough and did not check for the presence of a libcurl.so.4. I assume
ldd /share/apps/R/R-2.4.1/library/RCurl/libs/RCurl.so
also tells about the missing libcurl.so.4? Make sure you find that (in /usr/lib, /usr/local/lib, ... ?) and then set your LD_LIBRARY_PATH, or even run 'ldconfig' as root, to make sure that libcurl.so.4 ``is found''. IMO the latter {a correct ldconfig call / /etc/ld.so.conf setup} should have happened as part of the installation of the curl library. On the 64-bit architecture, note that
system(paste("ldd", dir(system.file("libs", package = "RCurl"), full = TRUE)))
finds all libraries in /lib64 and /usr/lib64.
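Concretely, the two fixes Martin mentions look like this (a sketch; the paths are illustrative and must be replaced by wherever libcurl was actually installed on the machine):

```sh
# 1. Check which shared libraries RCurl.so cannot resolve:
ldd /share/apps/R/R-2.4.1/library/RCurl/libs/RCurl.so | grep curl

# 2a. Per-user fix: point the dynamic loader at libcurl before starting R
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
R

# 2b. System-wide fix (as root): add the directory to /etc/ld.so.conf,
#     then refresh the loader cache
ldconfig
```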
Martin Maechler, ETH Zurich sessionInfo() Steven R version 2.4.1 (2006-12-18) Steven x86_64-unknown-linux-gnu Steven locale: Steven LC_CTYPE=en_US.iso885915;LC_NUMERIC=C;LC_TIME=en_US.iso885915;LC_COLLATE=en_US.iso885915;LC_MONETARY=en_US.iso885915;LC_MESSAGES=en_US.iso885915;LC_PAPER=en_US.iso885915;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.iso885915;LC_IDENTIFICATION=C Steven attached base packages: Steven [1] stats graphics grDevices utils datasets methods Steven [7] base Steven other attached packages: Steven DBI Steven 0.1-12 Steven Steven McKinney Steven Statistician Steven Molecular Oncology and Breast Cancer Program Steven British Columbia Cancer Research Centre Steven email: [EMAIL PROTECTED] Steven tel: 604-675-8000 x7561 Steven BCCRC Steven Molecular Oncology Steven 675 West 10th Ave, Floor 4 Steven Vancouver B.C. Steven V5Z 1L3 Steven Canada Steven -Original Message- Steven From: [EMAIL PROTECTED] on behalf of Mark W Kimpel Steven Sent: Wed 3/21/2007 3:09 PM Steven To: r-help@stat.math.ethz.ch Steven Subject: [R] problem with RCurl install on Unix Steven I am having trouble getting an install of RCurl to work properly on a Steven Unix server. The steps I have taken are: Steven 1. installed cUrl from source without difficulty Steven 2. installed RCurl from source using the command Steven ~/R_HOME/R-devel/bin/R CMD INSTALL -l ~/R_HOME/R-devel/library Steven ~/RCurl_0.8-0.tar.gz I received no errors during this install Steven 3. 
when I go back to R and require(RCurl), I get
require(RCurl)
Steven Loading required package: RCurl
Steven Error in dyn.load(x, as.logical(local), as.logical(now)) :
Steven unable to load shared library
Steven '/N/u/mkimpel/BigRed/R_HOME/R-devel/library/RCurl/libs/RCurl.so':
Steven libcurl.so.4: cannot open shared object file: No such file or directory
Steven [1] FALSE
Steven Outside of R I get
Steven mkimpel@BigRed:~/R_HOME/R-devel/library/RCurl/libs> ls
Steven RCurl.so
Steven I can cd into RCurl and I have even done chmod a+x RCurl.so in case there was a problem with permission for R to open the file.
Steven Below is my sessionInfo. Thanks, Mark
sessionInfo()
Steven R version 2.5.0 Under development (unstable) (2007-03-11 r40824)
Steven powerpc64-unknown-linux-gnu
Steven locale:
Steven LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
Steven attached base packages:
Steven [1] stats graphics grDevices datasets utils tools
Steven [7] methods base
Steven other attached packages:
Steven   limma    affy  affyio Biobase
Steven  2.9.13 1.13.14   1.3.3 1.13.39
Steven --
Steven Mark W. Kimpel MD
Steven Neuroinformatics
Steven Department of Psychiatry
Steven Indiana University School of Medicine
Re: [R] truehist bug?
Gad == Gad Abraham [EMAIL PROTECTED] on Tue, 20 Mar 2007 17:02:18 +1100 writes:
Gad Hi,
Gad Is this a bug in truehist()?
library(MASS)
x <- rep(1, 10)
truehist(x)
Gad Error in pretty(data, nbins) : invalid 'n' value
You get something similar, though slightly more helpful, from
hist(x, "scott")
which then uses the same method for determining the number of bins / classes for the histogram. I'd say the main bug is in
nclass.scott() [and also nclass.FD()]
which do not work when var(x) == 0, as in this case. One could argue that
1) truehist(x) should not use "scott" as default when var(x) == 0 {hence a buglet in truehist()}
and either
2) both hist() and truehist() should produce a better error message when "scott" (or "FD") is used explicitly and var(x) == 0
or, rather IMO,
3) nclass.scott(x) and nclass.FD(x) should be extended to return a non-negative integer even when var(x) == 0.
Martin Maechler, ETH Zurich
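A hedged workaround in the meantime: bin-count rules that do not depend on the sample variance still work for constant data, so the default histogram is fine even where "scott" fails:

```r
x <- rep(1, 10)
## hist(x, "scott") fails here because var(x) == 0, but Sturges' rule
## only uses the sample size, so the default still works:
hist(x)            # default breaks = "Sturges" -- draws a single bar
nclass.Sturges(x)  # ceiling(log2(10) + 1) = 5
```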
[R] nclass.scott() and nclass.FD() {Re: truehist bug?}
[This has become entirely a topic for 'R-devel'; hence I'm diverting to there, keeping R-help once; please follow up only to R-devel.]
MM == Martin Maechler [EMAIL PROTECTED] on Tue, 20 Mar 2007 08:49:16 +0100 writes:
Gad == Gad Abraham [EMAIL PROTECTED] on Tue, 20 Mar 2007 17:02:18 +1100 writes:
Gad Hi, Gad Is this a bug in truehist()? library(MASS) x <- rep(1, 10) truehist(x) Gad Error in pretty(data, nbins) : invalid 'n' value
MM You get something similar though slightly more helpful from
MM hist(x, "scott")
MM which then uses the same method for determining the number of bins / classes for the histogram.
MM I'd say the main bug is in
MM nclass.scott() [ and also nclass.FD() ]
MM which do not work when var(x) == 0 as in this case.
MM One could argue that
MM 1) truehist(x) should not use "scott" as default when var(x) == 0 {hence a buglet in truehist()}
MM and either
MM 2) both hist() and truehist() should produce a better error message when "scott" (or "FD") is used explicitly and var(x) == 0
MM or, rather IMO,
MM 3) nclass.scott(x) and nclass.FD(x) should be extended to return a non-negative integer even when var(x) == 0
after some further thought, I'm proposing to adopt '3)' {only; '1)' becomes unneeded} with the following new code, which is back-compatible for the case where 'h > 0' and does something reasonable for the case h == 0:
nclass.scott <- function(x)
{
    h <- 3.5 * sqrt(stats::var(x)) * length(x)^(-1/3)
    if(h > 0) ceiling(diff(range(x))/h) else 1L
}
nclass.FD <- function(x)
{
    h <- stats::IQR(x)
    if(h == 0) h <- stats::mad(x, constant = 2) # c=2: consistent with IQR
    if (h > 0) ceiling(diff(range(x))/(2 * h * length(x)^(-1/3))) else 1L
}
Martin
Re: [R] expm() within the Matrix package
Gad == Gad Abraham [EMAIL PROTECTED] on Mon, 19 Mar 2007 09:36:15 +1100 writes:
Gad If you convert to numeric, you can then assign it to Loglik:
Loglik[1] <- as.numeric(log(p %*% expm(Q * y[i]) %*% q))
Loglik[1]
Gad [1] 134.5565
Hmm, I don't think that's Laura's problem (and actually I don't know what her problem is): assignment of a 1 x 1 matrix to a vector is not a problem, and hence the as.numeric(.) above really has no effect:
ll <- 1:2
(m <- matrix(pi, 1, 1))
         [,1]
[1,] 3.141593
ll[1] <- m
ll
[1] 3.141593 2.000000
Gad Ah, but you're using 'matrix' while she's using 'Matrix'
Gad (AFAIK there is no expm for matrix):
Yes, of course, you are absolutely right (and I'm pretty embarrassed). Martin
library(Matrix)
Gad Loading required package: lattice
m <- Matrix(1, nrow=1, ncol=1)
m
Gad 1 x 1 diagonal matrix of class "ddiMatrix"
Gad      [,1]
Gad [1,]    1
Gad Warning message:
Gad Ambiguous method selection for "diag", target "ddiMatrix" (the first of the signatures shown will be used)
Gad     diagonalMatrix
Gad     ddenseMatrix
Gad in: .findInheritedMethods(classes, fdef, mtable)
a <- vector("numeric", 0)
a[1] <- m
Gad Error in a[1] <- m : incompatible types (from S4 to double) in subassignment type fix
a[1] <- as.numeric(m)
a
Gad [1] 1
[.]
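To summarize the distinction the thread arrives at (a sketch; requires the Matrix package):

```r
library(Matrix)

m1 <- matrix(pi, 1, 1)   # base "matrix": an ordinary S3 object
M1 <- Matrix(pi, 1, 1)   # "Matrix": an S4 object from package Matrix

v <- numeric(2)
v[1] <- m1               # fine: a base 1x1 matrix drops to a scalar
                         # on subassignment
## v[1] <- M1            # fails: S4 objects are not coerced implicitly
v[2] <- as.numeric(M1)   # explicit coercion to double works
v
```

So the fix Gad suggested (wrapping in as.numeric()) is exactly what is needed for 'Matrix' objects, even though it is a no-op for base matrices.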
Re: [R] question on table format
use
print(your.character.matrix, quote = FALSE)
HTH, Martin
Norbert == Norbert NEUWIRTH [EMAIL PROTECTED] on Mon, 19 Mar 2007 13:20:57 +0100 writes:
Norbert hi everybody,
Norbert i am hanging in a problem that should be solved quickly, but, in fact, i do not get further: building up a SUR/SES system myself, where the regressions are finally used to be poisson and/or negative-binomial type, i want to generate a simple table where i can quickly see the coefficients and their markers for significance.
Norbert so i constructed this final table (code output below). as the markers for significance are character-type, the content of each cell in the whole table appears as character with quotation marks. these quotation marks should be suppressed. how can i handle this?
Norbert thank you in advance
Norbert norbert
Norbert ---
Norbert code:
Norbert final.table3 <- as.matrix(cbind(
Norbert   round(coef.sur5.lm[,1], 4), sigc.sur5.lm[,1],
Norbert   round(coef.sur5.lm[,2], 4), sigc.sur5.lm[,2],
Norbert   round(coef.sur5.lm[,3], 4), sigc.sur5.lm[,3],
Norbert   round(coef.sur5.lm[,4], 4), sigc.sur5.lm[,4],
Norbert   round(coef.sur5.lm[,5], 4), sigc.sur5.lm[,5]
Norbert ))
Norbert colnames(final.table3) <- c("MW", "sig.",
Norbert   "HP", "sig.",
Norbert   "CC", "sig.",
Norbert   "AL", "sig.",
Norbert   "RC", "sig.")
Norbert ---
Norbert output:
Norbert             MW      sig. HP      sig. CC      sig. AL      sig. RC      sig.
Norbert (Intercept) 6.7757  ***  -2.2851 ***  -0.067       8.517   ***  11.0767 ***
Norbert FEMALE      -1.6267 ***  2.4045  ***  0.2892  ***  -1.0887 ***  0.0236
Norbert AGE         -0.0324 .    0.1923  ***  0.0158  ***  -0.0984 ***  -0.0778 ***
Norbert I(AGE^2)    1e-04        -0.0019 ***  -2e-04  ***  0.0011  ***  0.001   ***
Norbert C2.H        -0.2589 *    0.2051  **   0.5289  ***  -0.4537 ***  -0.0211
Norbert C2_3.H      -0.4278 **   0.118        0.4837  ***  -0.1301      -0.0422
Norbert ... ... ... ... ... ...
Norbert --
Norbert ---
Norbert Mag.
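A tiny self-contained illustration of Martin's answer (the numbers are made up):

```r
## A character matrix prints with quotation marks by default:
m <- cbind(est = c("6.7757", "-1.6267"), sig = c("***", "***"))
rownames(m) <- c("(Intercept)", "FEMALE")
m                        # cells shown in quotes
print(m, quote = FALSE)  # same table, quotes suppressed
noquote(m)               # or wrap once; then plain print() never quotes
```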
Norbert NEUWIRTH
Norbert Roubiczekgasse 2/23
Norbert A-1100 WIEN
Norbert mob: +43 699 1835 0704
Re: [R] expm() within the Matrix package
Gad == Gad Abraham [EMAIL PROTECTED] on Fri, 16 Mar 2007 09:48:52 +1100 writes: Gad Laura Hill wrote: Hi Could anybody give me a bit of advice on some code I'm having trouble with? I've been trying to calculate the loglikelihood of a function iterated over a data of time values and I seem to be experiencing difficulty when I use the function expm(). Here's an example of what I am trying to do y-c(5,10) #vector of 2 survival times p-Matrix(c(1,0),1,2) #1x2 matrix Q-Matrix(c(1,2,3,4),2,2) #2x2 matrix q-Matrix(c(1,2),2,1) #2x1 matrix Loglik-rep(0,length(y))#creating empty vector of same length as y for(i in 1:length(y)){ Loglik[i]-log((p %*% (expm(Q*y[i]))) %*% q) #calculating # Loglikelihood for each y[i] } The code works perfectly if I use exp(Q*y[i]) but not for expm() Gad You need to do a type conversion here, because you get the loglik as a Gad 1x1 Matrix, instead of a scalar: Gad (after running your code) log(p %*% expm(Q * y[i]) %*% q) Gad 1 x 1 Matrix of class dgeMatrix Gad [,1] Gad [1,] 134.5565 Gad If you convert to numeric, you can then assign it to Loglik: Loglik[1] - as.numeric(log(p %*% expm(Q * y[i]) %*% q)) Loglik[1] Gad [1] 134.5565 Hmm, I don't think that's Laura's problem (and actually I don't know what her problem is) : Assignment of a 1 x 1 matrix to a vector is not a problem, and hence the as.numeric(.) above really has no effect : ll - 1:2 (m - matrix(pi, 1,1)) [,1] [1,] 3.141593 ll[1] - m ll [1] 3.141593 2.00 Martin Maechler, ETH Zurich __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Off topic:Spam on R-help increase?
DB == Douglas Bates [EMAIL PROTECTED] on Tue, 6 Mar 2007 11:57:28 -0600 writes:
DB On 3/6/07, Bert Gunter [EMAIL PROTECTED] wrote: In the past 2 days I have seen a large increase of spam getting into R-help. Are others experiencing this problem? If so, has there been some change to the spam filters on the R servers? If not, is the problem on my end?
DB There has indeed been an increase in the amount of spam making it through to the list. We apologize for the inconvenience. Regrettably, we will not be able to do much about it until the beginning of next week.
DB Martin Maechler is on vacation at present and I am administering the lists until he returns. Most of the time this works even though the mail servers are in Zurich, Switzerland and I am in Madison, WI, USA. However, in the last two days we have had a surge in spam and quite a bit of it is getting through the filters.
DB The filters are catching some of the spam. I think the main difference in the last two days has been that the level of spam to the lists has increased, but it could be that something has happened to the filters too.
I've been back today, well relaxed and tanned from the nice vacation; thanks to all of you for taking such an interest in it :-) ;-) With a work backlog of almost 4 weeks, I hadn't dared to look into my R-lists inbox of 2400 messages until about an hour ago. Fortunately it's not the spammers that have become smarter (well, they have, or their hired geeks, but already a few months ago, not just now). The main problem has ``just'' been disk-server, then network and file-mount problems on the mail server that were unfortunately not seen at first by our IT staff. As a consequence, there had also been enormous (> 24 hours) delays in mail delivery, maybe less visible on the mailing-list side of it.
As far as I can see/guess now, the spam problem should have lasted only about one to two days --- too long of course for you, but at least not till I had returned to work. Yes indeed, we are sorry for this, but no, we cannot promise it won't happen again :-\
Martin
DB All the lists except R-help only allow postings from subscribers, so there should be very little spam on the other lists.
DB This subscriber-only policy can be difficult for people like me who receive email at one address but send it from another. Either the sender must remember to use the account that is registered for the list or the list administrator must manually approve the posting.
DB Even worse, such a policy dissuades new useRs from posting because they get a response that their message has been held pending manual approval by the administrator. Sometimes they react by reposting the message, then re-reposting, then ...
DB We have avoided instituting such a policy on R-help because of the level of administrative work that would be involved and our desire not to dissuade new useRs from posting to the list.
DB However, if this keeps up we may need to reconsider.
DB I would ask for the list subscribers to bear with us until Martin returns and can check on whether something has gone wrong with the filters.
Re: [R] XML and str
DTL == Duncan Temple Lang [EMAIL PROTECTED] on Sat, 10 Feb 2007 07:18:30 -0800 writes:
DTL Martin Maechler wrote:
Ashley == Ashley Ford [EMAIL PROTECTED] on Wed, 07 Feb 2007 17:18:56 + writes:
Ashley If I read in an .xml file, e.g. with
xeg <- xmlTreeParse(system.file("exampleData", "test.xml", package = "XML"))
Ashley It appears to be OK; however, examining it with str() gives an apparent error:
str(xeg, 2)
Ashley List of 2
Ashley $ doc:List of 3
Ashley ..$ file: list()
Ashley .. ..- attr(*, "class")= chr [1:2] "XMLComment" "XMLNode"
Ashley ..$ version :List of 4
Ashley .. ..- attr(*, "class")= chr "XMLNode"
Ashley ..$ children:Error in obj$children[[...]] : subscript out of bounds
Ashley I am unsure if this is a feature or a bug and, if the latter, whether it is in XML or str; it is not causing a problem but I would like to understand what is happening. Any ideas?
Yes -- thank you for providing a well-reproducible example. After setting options(error = recover) I do
obj <- xeg$doc
mode(obj) # "list"
[1] "list"
is.list(obj) # TRUE
[1] TRUE
length(obj) # 3
[1] 3
obj[[3]] # --- the error you see above.
Error in obj$children[[...]] : subscript out of bounds
Enter a frame number, or 0 to exit
1: obj[[3]]
2: `[[.XMLDocumentContent`(obj, 3)
Selection: 0
obj$children # works, should be identical to obj[[3]]
$comment
<!--A comment-->
$foo
<foo x="1"> <element attrib1="my value"/> ..
This shows that the XML package implements the [[ method wrongly IMHO, and also inconsistently with the $ method. From a strict OOP view, the XML author could argue that this is not a bug in XML but rather in str(), which assumes that x[[length(x)]] works for objects of mode "list" even when they are not of *class* "list", but I hope he would still rather consider changing [[.XMLDocumentContent ...
DTL More likely, the appropriate fix is to have length() return the relevant value.
Hmm.
library(XML) xeg <- xmlTreeParse(system.file("exampleData", "test.xml", package = "XML")) obj <- xeg$doc mode(obj) # "list" [1] "list" is.list(obj) # TRUE [1] TRUE length(obj) # 3 [1] 3 obj[[3]] # --- the error you see above. Error in obj$children[[...]] : subscript out of bounds names(obj) [1] "file" "version" "children" class(obj) [1] "XMLDocumentContent" methods(class=class(obj)) [1] xmlApply.XMLDocumentContent* [[.XMLDocumentContent* [3] xmlRoot.XMLDocumentContent* xmlSApply.XMLDocumentContent* XML:::`[[.XMLDocumentContent` function (obj, ...) { obj$children[[...]] } <environment: namespace:XML> so length(obj) is 3 and obj is a simple S3 object which is just a list with 3 named components. Do you really want to define length(.) to also return the length of obj$children instead of the length() of the list itself? With that you'd have your XMLDocumentContent objects ``look'' like lists with three named components on one hand (and help(xmlTreeParse) does mention these components) but behave in other contexts as if each was just its own component 'obj$children'. Of course you then should also define print.XMLDocumentContent() and str.XMLDocumentContent() accordingly, so users would barely know about the 'file' and 'version' components of 'obj'. But is this really desirable? With the above [[.XMLDoc... you break the basic S-language premise that [[ and $ behave consistently. You could solve everything elegantly if you used S4 instead of S3 classes, since there's no defined correspondence between slot access and [[ -- and yes, then (with S4), I'd agree that setMethod(length, "XMLDocumentContent", function(x) length(x@children)) would be needed too -- and fine. Martin DTL I even recall considering this at the time of writing DTL the package initially. But that was back in 1999/2000 DTL and S4 and R/S-Plus compatibility were not what they DTL are now. It could be changed. Not certain when I will DTL get a chance.
Ashley examining components, e.g. str(xeg$doc$children, 2) Ashley List of 2 Ashley $ comment: list() Ashley ..- attr(*, "class")= chr [1:2] "XMLComment" "XMLNode" Ashley etc. Ashley is OK. Ashley XML Version 1.4-1, Ashley same behaviour on Windows and Linux, R version 2.4.1 (2006-12-18)
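The premise Martin invokes can be made concrete with a toy S3 object (my own illustration, not code from the XML package; the class name "myDoc" is hypothetical): for a plain list-based S3 class with no `[[` method of its own, `[[` and `$` necessarily address the same components, and `x[[length(x)]]` -- the access str() performs -- just works:

```r
# A hypothetical list-backed S3 object with the same three component
# names as an XMLDocumentContent; the default [[ is used, so [[ and $
# necessarily agree.
x <- structure(list(file = "f", version = "1.0",
                    children = list(a = 1, b = 2)),
               class = "myDoc")
stopifnot(identical(x[[3]], x$children),          # [[ and $ agree
          identical(x[[length(x)]], x$children))  # what str() relies on
```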
Re: [R] R vs Matlab {R in Industry}
Ravi == Ravi Varadhan [EMAIL PROTECTED] on Thu, 8 Feb 2007 14:39:41 -0500 writes: Ravi Here is a function to create a Toeplitz matrix of any size, and an example Ravi of a 220 x 220 Toeplitz matrix, which was created in almost no time: Thanks Ravi, but note two things - ?toeplitz tells you that R already has a fast (R-code-only) toeplitz() function - The point of that benchmark is not to measure how fast you can build a Toeplitz matrix but simply to exercise a double (actually triple) for loop. {and the benchmark R script says so as a comment in the code} BTW {not to Ravi, but on the subject}: 1) When comparing this (the for-loop benchmark) with Matlab, I would want to be sure that Matlab is not simply using an internal shortcut, since the benchmark is maybe too simplistic: for(i in 1:N) for(j in 1:N) b[i,j] <- abs(i - j) {and it may be interesting to see if R's experimental byte compiler would speed that up} 2) The above is very fast (IMO) and I cannot see why this could be too slow in a realistic situation. 3) The tables I've seen said that Matlab was about a factor of 2 faster for the above loop benchmark. That's scarcely a reason for downgrading (from R to Matlab). Martin
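A quick check (my own snippet, not from the thread) that base R's vectorized toeplitz() produces exactly the matrix the benchmark loop builds:

```r
# Build the |i - j| matrix two ways: vectorized toeplitz() vs. the
# benchmark's deliberate double loop.
N <- 220
b1 <- toeplitz(0:(N - 1))   # vectorized, essentially instantaneous
b2 <- matrix(0, N, N)
for (i in 1:N)              # the loop the benchmark exercises
  for (j in 1:N)
    b2[i, j] <- abs(i - j)
stopifnot(all(b1 == b2))    # same values; the loop is just slower
```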
Re: [R] Timings of function execution in R [was Re: R in Industry]
Ravi == Ravi Varadhan [EMAIL PROTECTED] on Thu, 8 Feb 2007 18:41:38 -0500 writes: Ravi Hi, Ravi greaterOf is indeed an interesting function. It is much faster than the Ravi equivalent R function, pmax, because pmax does a lot of checking for Ravi missing data and for recycling. Tom Lumley suggested a simple function to Ravi replace pmax, without these checks, that is analogous to greaterOf, which I Ravi call fast.pmax. Ravi fast.pmax <- function(x,y) {i <- x<y; x[i] <- y[i]; x} Ravi Interestingly, greaterOf is even faster than fast.pmax, although you have to Ravi be dealing with very large vectors (O(10^6)) to see any real difference. Yes. Indeed, I have a file, first version dated from 1992, where I explore the slowness of pmin() and pmax() (in S-plus 3.2 then). I have since added quite a few experiments and versions to that file. As a consequence, in the robustbase CRAN package (which is only a bit more than a year old though), there's a file, available as https://svn.r-project.org/R-packages/robustbase/R/Auxiliaries.R with the very simple content {note line 3 !}: - ### Fast versions of pmin() and pmax() for 2 arguments only: ### FIXME: should rather add these to R pmin2 <- function(k,x) (x+k - abs(x-k))/2 pmax2 <- function(k,x) (x+k + abs(x-k))/2 - {the funny argument name 'k' comes from the use of these to compute Huber's psi() fast: psiHuber <- function(x,k) pmin2(k, pmax2(-k, x)) curve(psiHuber(x, 1.35), -3, 3, asp = 1) } One point *is* that I think proper function names would be pmin2() and pmax2(), since they work with exactly 2 arguments, whereas IIRC the feature of working with '...' is exactly the reason that pmax() and pmin() are so much slower. I haven't checked if Gabor's pmax2.G <- function(x,y) {z <- x > y; z * (x-y) + y} is even faster than the abs()-using one. It may have the advantage of giving *identical* results (to the last bit!) to pmax(), which my version does not --- IIRC the only reason I did not follow my own 'FIXME' above.
I had then planned to implement pmin2() and pmax2() in C code, trivially, and hence get behavior identical (to the last bit!) to pmin()/pmax(); but I now tend to think that the proper approach is to code pmin() and pmax() via .Internal() and hence C code ... [Not before DSC and my vacations though!!] Martin Maechler, ETH Zurich
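The abs() trick above can be checked against base R directly; here is a small verification harness (mine, not from the posts). Note all.equal() rather than identical(), since, as Martin says, the arithmetic versions can differ from pmin()/pmax() in the last bit:

```r
# Two-argument pmin/pmax via the abs() identities from Auxiliaries.R.
pmin2 <- function(k, x) (x + k - abs(x - k)) / 2
pmax2 <- function(k, x) (x + k + abs(x - k)) / 2
set.seed(1)
x <- rnorm(1e5); k <- 1.35
stopifnot(all.equal(pmin2(k, x), pmin(k, x)),
          all.equal(pmax2(-k, x), pmax(-k, x)))
# Huber's psi, the use case that motivated the argument name 'k':
psiHuber <- function(x, k) pmin2(k, pmax2(-k, x))
stopifnot(all.equal(psiHuber(x, k), pmin(k, pmax(-k, x))))
```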
Re: [R] Data.frame columns in R console
Petr == Petr Pikal [EMAIL PROTECTED] on Fri, 09 Feb 2007 09:42:13 +0100 writes: Petr Hi Petr On 9 Feb 2007 at 10:17, Lauri Nikkinen wrote: Thank you for your answer. When I set options(width=250) I still get the same result when I print the data.frame on my Rgui console (R 2.4.1, Windows XP). Columns end up underneath each other. I also get an error (?) message: [ reached getOption("max.print") -- omitted 3462 rows ]. As Petr explains below (and Brian Ripley), you *really* should use different means here --- but I think this is the first time that the relatively new option 'max.print' has hit R-help, hence one other hint, maybe useful to the public: Note that the 'max.print' option was introduced exactly for the purpose of **protecting** the inadvertent user from a flood of output spilling into his console/gui/.. (and apparently locking up R completely; we have even seen crashes when people wanted to print data frames/matrices/arrays with millions of entries). So, given the above message (yes, not an error), why did you not try to read help(getOption) and look for the word 'max.print' there? -- if you really really don't want to follow the advice of Brian and Petr, then say something like options(max.print = 1e6) Martin Maechler, ETH Zurich For example, if I have a data.frame with 4000 rows and 200 columns I would like to be able to use scroll bars in Rconsole to investigate the whole data.frame. Petr I am not sure if it is the best idea. You shall probably use other Petr means for checking your data frame. Petr Try ?summary, ?str or, if you really want to check all values in the data Petr frame, you can use Petr invisible(edit(test)) Petr to open a spreadsheet-like editor. Petr HTH Petr Petr
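A small sketch (mine, not from the thread) of both routes: raising 'max.print' as Martin describes, versus the compact inspection tools Petr recommends. The data frame here is synthetic:

```r
# A data.frame of the size discussed in the thread (4000 x 200).
big <- data.frame(matrix(0, nrow = 4000, ncol = 200))
str(big, list.len = 3)          # compact overview instead of printing it all
head(big[, 1:4])                # just the first rows of a few columns
op <- options(max.print = 1e6)  # raise the cap, keeping the old value...
options(op)                     # ...and restore it afterwards
```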
Re: [R] Data.frame columns in R console
Lauri == Lauri Nikkinen [EMAIL PROTECTED] on Fri, 9 Feb 2007 14:21:26 +0200 writes: Lauri This still does not solve the issue that when I print in the R console I get Lauri columns that don't fit in the window underneath each other. Thanks anyway! But Brian did give you all you needed (even more, I'd say) to solve that!?!? Please excuse me if I use somewhat frank language, but when using R, you *really* are expected to read the documentation, which is written pretty carefully {probably that's what some people don't like about it and call confusing ??}. Specifically, Brian said BDR 200 columns will take far more than 250 characters. The help says and then pointed you to the documentation for options(width = .). I think you need to reread that paragraph, particularly the word 'character', and then you will understand that your original approach of using options(width = 250) can *not* be what you want if your dataframe has 200 columns. Martin Lauri 2007/2/9, Martin Maechler [EMAIL PROTECTED]: Petr == Petr Pikal [EMAIL PROTECTED] on Fri, 09 Feb 2007 09:42:13 +0100 writes: Petr Hi Petr On 9 Feb 2007 at 10:17, Lauri Nikkinen wrote: Thank you for your answer. When I set options(width=250) I still get the same result when I print the data.frame on my Rgui console (R 2.4.1, Windows XP). Columns end up underneath each other. I also get an error (?) message: [ reached getOption("max.print") -- omitted 3462 rows ]. As Petr explains below (and Brian Ripley), you *really* should use different means here --- but I think this is the first time that the relatively new option 'max.print' has hit R-help, hence one other hint, maybe useful to the public: Note that the 'max.print' option was introduced exactly for the purpose of **protecting** the inadvertent user from a flood of output spilling into his console/gui/.. (and apparently locking up R completely; we have even seen crashes when people wanted to print data frames/matrices/arrays with millions of entries).
So, given the above message (yes, not an error), why did you not try to read help(getOption) and look for the word 'max.print' there? -- if you really really don't want to follow the advice of Brian and Petr, then say something like options(max.print = 1e6) Martin Maechler, ETH Zurich For example, if I have a data.frame with 4000 rows and 200 columns I would like to be able to use scroll bars in Rconsole to investigate the whole data.frame. Petr I am not sure if it is the best idea. You shall probably use other Petr means for checking your data frame. Petr Try ?summary, ?str or, if you really want to check all values in the data Petr frame, you can use Petr invisible(edit(test)) Petr to open a spreadsheet-like editor. Petr HTH Petr Petr
Re: [R] Timings of function execution in R [was Re: R in Industry]
TL == Thomas Lumley [EMAIL PROTECTED] on Fri, 9 Feb 2007 08:13:54 -0800 (PST) writes: TL On 2/9/07, Prof Brian Ripley [EMAIL PROTECTED] wrote: The other reason why pmin/pmax are preferable to your functions is that they are fully generic. It is not easy to write C code which takes into account that <, [, [<- and is.na are all generic. That is not to say that it is not worth having faster restricted alternatives, as indeed we do with rep.int and seq.int. Anything that uses arithmetic is making strong assumptions about the inputs. It ought to be possible to write a fast C version that worked for atomic vectors (logical, integer, real and character), but is there any evidence of profiled real problems where speed is an issue? TL I had an example just last month of an MCMC calculation where profiling showed that pmax(x,0) was taking about 30% of the total time. I used TL function(x) {z <- x<0; x[z] <- 0; x} TL which was significantly faster. I didn't try the TL arithmetic solution. I did - eons ago, as mentioned in my message earlier in this thread. I can assure you that those (also mentioned) pmin2 <- function(k,x) (x+k - abs(x-k))/2 pmax2 <- function(k,x) (x+k + abs(x-k))/2 are faster still, particularly if you hardcode the special case of k=0! {that's how I came about these: pmax(x,0) is also denoted x_+, and x_+ := (x + |x|)/2 x_- := (x - |x|)/2 } TL Also, I didn't check if a solution like this would still TL be faster when both arguments are vectors (but there was TL a recent mailing list thread where someone else did). indeed, and they are faster. Martin
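For concreteness, here is my own comparison (not from the original emails) of the two approaches to pmax(x, 0) discussed above; the arithmetic version exploits the identity x_+ = (x + |x|)/2:

```r
pm      <- function(x) { z <- x < 0; x[z] <- 0; x }  # Thomas' subset/assign
pospart <- function(x) (x + abs(x)) / 2              # arithmetic identity
set.seed(42)
x <- rnorm(1e5)
stopifnot(all.equal(pm(x), pmax(x, 0)),
          all.equal(pospart(x), pmax(x, 0)))
```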
Re: [R] Timings of function execution in R [was Re: R in Industry]
Duncan == Duncan Murdoch [EMAIL PROTECTED] on Fri, 09 Feb 2007 13:52:25 -0500 writes: Duncan On 2/9/2007 1:33 PM, Prof Brian Ripley wrote: x <- rnorm(1) system.time(for(i in 1:1000) pmax(x, 0)) user system elapsed 4.43 0.05 4.54 pmax2 <- function(k,x) (x+k + abs(x-k))/2 system.time(for(i in 1:1000) pmax2(x, 0)) user system elapsed 0.64 0.03 0.67 pm <- function(x) {z <- x<0; x[z] <- 0; x} system.time(for(i in 1:1000) pm(x)) user system elapsed 0.59 0.00 0.59 system.time(for(i in 1:1000) pmax.int(x, 0)) user system elapsed 0.36 0.00 0.36 So at least on one system Thomas' solution is a little faster, but a C-level n-args solution handling NAs is quite a lot faster. Duncan For this special case we can do a lot better using Duncan pospart <- function(x) (x + abs(x))/2 Indeed, that's what I meant when I talked about doing the special case 'k = 0' explicitly -- and also what my timings were based on. Thank you Duncan -- and Brian, for looking into providing an even faster and more general C-internal version! Martin Duncan The less specialized function Duncan pmax2 <- function(x,y) { Duncan diff <- x - y Duncan y + (diff + abs(diff))/2 Duncan } Duncan is faster on my system than pm, but not as fast as pospart: system.time(for(i in 1:1000) pm(x)) Duncan [1] 0.77 0.01 0.78 NA NA system.time(for(i in 1:1000) pospart(x)) Duncan [1] 0.27 0.02 0.28 NA NA system.time(for(i in 1:1000) pmax2(x,0)) Duncan [1] 0.47 0.00 0.47 NA NA Duncan Duncan Murdoch On Fri, 9 Feb 2007, Martin Maechler wrote: TL == Thomas Lumley [EMAIL PROTECTED] on Fri, 9 Feb 2007 08:13:54 -0800 (PST) writes: TL On 2/9/07, Prof Brian Ripley [EMAIL PROTECTED] wrote: The other reason why pmin/pmax are preferable to your functions is that they are fully generic. It is not easy to write C code which takes into account that <, [, [<- and is.na are all generic. That is not to say that it is not worth having faster restricted alternatives, as indeed we do with rep.int and seq.int.
Anything that uses arithmetic is making strong assumptions about the inputs. It ought to be possible to write a fast C version that worked for atomic vectors (logical, integer, real and character), but is there any evidence of profiled real problems where speed is an issue? TL I had an example just last month of an MCMC calculation where profiling showed that pmax(x,0) was taking about 30% of the total time. I used TL function(x) {z <- x<0; x[z] <- 0; x} TL which was significantly faster. I didn't try the TL arithmetic solution. I did - eons ago, as mentioned in my message earlier in this thread. I can assure you that those (also mentioned) pmin2 <- function(k,x) (x+k - abs(x-k))/2 pmax2 <- function(k,x) (x+k + abs(x-k))/2 are faster still, particularly if you hardcode the special case of k=0! {that's how I came about these: pmax(x,0) is also denoted x_+, and x_+ := (x + |x|)/2 x_- := (x - |x|)/2 } TL Also, I didn't check if a solution like this would still TL be faster when both arguments are vectors (but there was TL a recent mailing list thread where someone else did). indeed, and they are faster. Martin
[R] Attachments to R-help postings (Re: Singular Gradient)
DB == Douglas Bates [EMAIL PROTECTED] on Wed, 7 Feb 2007 17:24:40 -0600 writes: DB On 2/7/07, This Wiederkehr [EMAIL PROTECTED] wrote: I tried to fit data with the following function: fit <- nls(y ~ Is*(1-exp(-l*x)) + Iph, start=list(Is=-2e-5, l=2.3, Iph=-0.3), control=list(maxiter=500, minFactor=1/1, tol=10e-05), trace=TRUE) But I get only a singular gradient warning... DB Did you get any trace output at all? It is not clear if you got the DB singular gradient warning before the first iteration completed, which DB means there is a problem at the starting estimates, or after a few DB iterations. Without the data it is difficult to decide. The data can be found attached (there are two samples of data, cols 1/2 and 3/4). DB Thanks for offering to include the data. My copy of your message did DB not have the data enclosed. Did you perhaps forget to attach the DB file? More probably he did not attach it with mime type text/plain. Many e-mail clients nowadays attach everything, notably text, as unspecified binary (application/octet-stream). For security (and anti-spam) reasons, such attachments are eliminated from postings. I've now slightly modified this content-filtering option for R-help, such that (I think) such e-mails will be *rejected* instead of just having the attachment removed -- I'm just trying that now, attaching a 2-line text file Martin I tried to fix it by changing the start parameters but that didn't solve the problem. Would it be a possibility to use the self-start model? How? DB Yes. Try SSasymp. I believe that model is equivalent to your model DB but in a different parameterization.
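A hedged sketch of Doug's SSasymp suggestion on synthetic data (the poster's data are not available, so the constants below are made up): SSasymp fits y = Asym + (R0 - Asym)*exp(-exp(lrc)*x), which is the same family as Is*(1-exp(-l*x)) + Iph with Asym = Is + Iph, R0 = Iph and l = exp(lrc), and it needs no hand-picked starting values.

```r
# Self-starting asymptotic regression on made-up data
# (Is = 5, l = 1.2, Iph = 1, small Gaussian noise).
set.seed(7)
x <- seq(0, 4, length.out = 60)
y <- 5 * (1 - exp(-1.2 * x)) + 1 + rnorm(60, sd = 0.05)
fit <- nls(y ~ SSasymp(x, Asym, R0, lrc))  # no 'start=' needed
coef(fit)  # Asym ~ Is + Iph = 6, R0 ~ Iph = 1, exp(lrc) ~ l = 1.2
```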
Re: [R] the plotting position of theoretical quantile for qqnorm
LizF == fengfeng [EMAIL PROTECTED] on Wed, 7 Feb 2007 23:08:31 -0600 writes: LizF Hello, LizF I have a doubt about the plotting position of the theoretical quantile for LizF the qqnorm LizF command in R. LizF Let F be the theoretical distribution of Y; we observed a sample of size n, LizF y1, y2, ..., LizF yn. We then sort it and compare these empirical quantiles to the expected LizF ones LizF from F. For the plotting position, there are several options: LizF 1. i/(n+1) LizF 2. (i-.375)/(n+.25) LizF 3. (i-.3175)/(n+.365) LizF etc. yes, particularly etc ;-) LizF Which one does qqnorm use? It's right in front of you if you read help(qqnorm) carefully: See Also: 'ppoints', used by 'qqnorm' to generate approximations to expected order statistics for a normal distribution. So it uses ppoints(), and help(ppoints) tells you what's going on: the formula used depends on n (whether n <= 10) -- see that help page. LizF Thx a lot! You're welcome, Martin Maechler, ETH Zurich LizF Liz
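This can be verified directly (my own check, not part of the original exchange): ppoints(n) uses (i - a)/(n + 1 - 2a) with a = 3/8 for n <= 10 and a = 1/2 otherwise, i.e. exactly the poster's option 2 for small samples and the classical (i - 1/2)/n for larger ones:

```r
n <- 8                                   # small sample: a = 3/8
stopifnot(all.equal(ppoints(n), ((1:n) - 3/8) / (n + 1/4)))
n <- 50                                  # larger sample: a = 1/2
stopifnot(all.equal(ppoints(n), ((1:n) - 1/2) / n))
```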
[R] R vs Matlab {Re: R in Industry}
Albr == Albrecht, Dr Stefan (AZ Private Equity Partner) [EMAIL PROTECTED] on Thu, 8 Feb 2007 16:38:18 +0100 writes: Albr Dear all, Albr I was reading with great interest your comments about the use of R in Albr the industry. Personally, I use R as a scripting language in the financial Albr industry, not so much for its statistical capabilities (which are Albr great), but more for programming. I once switched from S-Plus to R, Albr because I liked R more, it had a better and easier to use documentation Albr and it is faster (especially with loops). Albr Now some colleagues of mine are (finally) eager to join me in my Albr quantitative efforts, but they feel that they are more at ease with Albr Matlab. I can understand this. Matlab has a real IDE with symbolic Albr debugger, integrated editor and profiling, etc. The help files are Albr great, very comprehensive and coherent. It also could be easier to Albr learn. Albr And, I was very astonished to realise, Matlab is very, very much faster Albr with simple for loops, which would speed up simulations considerably. Can you give some evidence for this statement, please? At the moment, I'd bet that you forgot to pre-allocate a result array in R and do something like the notorious horrible (:-) 1-dimensional r <- NULL for(i in 1:100000) { r[i] <- verycomplicatedsimulation(i) } instead of the correct r <- numeric(100000) for(i in 1:100000) { r[i] <- verycomplicatedsimulation(i) } If r is a matrix or even higher array, and you are using rbind() or cbind() inside the loop to build up the result, the problem will become even worse. Albr So I have trouble to argue for a use of R (which I like) instead of Albr Matlab. The price of Matlab is high, but certainly not prohibitive. R is Albr great and free, but maybe less comfortable to use than Matlab. Albr Finally, after all, I have the impression that in many job offerings in Albr the financial industry R is much less often mentioned than Matlab.
Albr I would very much appreciate any comments on my above remarks. I know Albr there have been some discussions of R vs. Matlab on R-help, but these Albr could be somewhat out-dated, since both languages are evolving quite Albr quickly. Albr With many thanks and best regards, Albr Stefan Albrecht
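Martin's pre-allocation point above can be illustrated with a toy loop (my own sketch; verycomplicatedsimulation() is replaced by a trivial computation so the snippet is self-contained):

```r
# Growing r inside the loop forces repeated re-allocation;
# pre-allocating with numeric(n) avoids it. Both give the same result.
slowFill <- function(n) { r <- NULL;       for (i in 1:n) r[i] <- i^2; r }
fastFill <- function(n) { r <- numeric(n); for (i in 1:n) r[i] <- i^2; r }
stopifnot(identical(slowFill(100), fastFill(100)))
system.time(slowFill(2e5))   # growing version
system.time(fastFill(2e5))   # pre-allocated version, typically faster
```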
Re: [R] R Search
The official R search place has been http://search.R-project.org/ for quite a while now. It does mention others, including 'rseek' below. BTW: Its main fault for me is that it does not include the R-devel mailing list archives (hint hint :-) Martin Maechler, ETH Zurich IM == İbrahim Mutlay [EMAIL PROTECTED] on Wed, 7 Feb 2007 02:44:24 -0500 writes: IM I know that two search engines for R are available: IM http://www.rseek.org/ IM http://www.dangoldstein.com/search_r.html IM -- İbrahim Mutlay
Re: [R] R in Industry
Frank == Frank E Harrell [EMAIL PROTECTED] on Tue, 06 Feb 2007 21:59:45 -0600 writes: Frank Matthew Keller wrote: Bob, Far from flaming you, I think you made a good point - one that I imagine most people who use R have come across. The name R is a big impediment to effective online searches. As a check, I entered R software, SAS software, SPSS software, and S+ software into google. The R 'hit rate' was only ten out of the first 20 results (I didn't look any further). For the other three software packages, the hit rates were all 100% (20/20). I do wonder if anything can/should be done about this. I generally search using the term CRAN but of course, that omits lots of stuff relevant to R. Any ideas about how to do effective online searches for R related materials? I don't think we (the R foundation) will ever change away from R.. Matt Frank I just googled for R and www.r-project.org was the Frank first hit. Don't see a problem at present. We are getting really off-topic, but that's interesting: We all know that Google is helping the Chinese government to censor their own people, so searches there can lead to completely different results. But even here in Zurich Switzerland, I get quite a different hitlist : 1) stat.ethz.ch/~statsoft/stat.programme/R.html [in German] 2) Our local CRAN mirror: stat.ethz.ch/CRAN/ 3) R - (German-language) Wikipedia about letter R: de.wikipedia.org/wiki/R 4) DVD-R - (German-language) Wikipedia de.wikipedia.org/wiki/DVD-R 5) The R Project for Statistical Computing http://www.r-project.org/ So 3/5 are related to R which sounds good, but actually these 3 are all from the first twenty: 3/20. Martin On 2/6/07, Wensui Liu [EMAIL PROTECTED] wrote: I've been looking for job that allows me to use R/S+ since I got out of graduate school 2 years ago but with no success. I am wondering if there is something that can be done to promote the use of R in industry. 
It's been very frustrating to see people doing statistics using Excel/SPSS, and even more frustrating to see people paying $$$ for something much inferior to R. On 2/6/07, Doran, Harold [EMAIL PROTECTED] wrote: The other day, CNN had a story on working at Google. Out of curiosity, I went to the Google employment web site (I'm not looking, but just curious). In perusing their job posts for statisticians, preference is given to those who use R and python. Other languages, S-Plus and something called SAS, were listed as lower priorities. When I started using Python, I noted they have a portion of the web site with job postings. CRAN does not have something similar, but I think it might be useful. I think R is becoming more widely used in industry, and to help it move along a bit, the maintainer of CRAN could create a section of the web site devoted to jobs where R is a requirement. Hence, we could have our own little monster.com kind of thing going on. Of the multitude of ways the gospel can be spread, this is small. But, I think every small step forward is good. Anyone think this is useful? Harold -- WenSui Liu A lousy statistician who happens to know a little programming (http://spaces.msn.com/statcompute/blog)
Frank -- Frank E Harrell Jr Professor and Chair School of Frank Medicine Department of Biostatistics Vanderbilt Frank University
Re: [R] XML and str
Ashley == Ashley Ford [EMAIL PROTECTED] on Wed, 07 Feb 2007 17:18:56 + writes: Ashley If I read in an .xml file, e.g. with xeg <- xmlTreeParse(system.file("exampleData", "test.xml", package="XML")) Ashley it appears to be OK; however, examining it with str() gives an apparent Ashley error: str(xeg, 2) Ashley List of 2 Ashley $ doc:List of 3 Ashley ..$ file: list() Ashley .. ..- attr(*, "class")= chr [1:2] "XMLComment" "XMLNode" Ashley ..$ version :List of 4 Ashley .. ..- attr(*, "class")= chr "XMLNode" Ashley ..$ children:Error in obj$children[[...]] : subscript out of bounds Ashley I am unsure if this is a feature or a bug, and if the latter, whether it Ashley is in XML or str; it is not causing a problem but I would like to Ashley understand what is happening. Any ideas? Yes - thank you for providing a well-reproducible example. After setting options(error = recover) I do obj <- xeg$doc mode(obj) # "list" [1] "list" is.list(obj) # TRUE [1] TRUE length(obj) # 3 [1] 3 obj[[3]] # --- the error you see above. Error in obj$children[[...]] : subscript out of bounds Enter a frame number, or 0 to exit 1: obj[[3]] 2: `[[.XMLDocumentContent`(obj, 3) Selection: 0 obj$children # works; should be identical to obj[[3]] $comment <!--A comment--> $foo <foo x="1"> <element attrib1="my value"/> .. This shows that the XML package implements the [[ method wrongly IMHO, and also inconsistently with the $ method. From a strict OOP view, the XML author could argue that this is not a bug in XML but rather in str(), which assumes that x[[length(x)]] works for objects of mode list even when they are not of *class* list, but I hope he would still rather consider changing [[.XMLDocumentContent ... Martin Ashley examining components, e.g. str(xeg$doc$children, 2) Ashley List of 2 Ashley $ comment: list() Ashley ..- attr(*, "class")= chr [1:2] "XMLComment" "XMLNode" Ashley etc. Ashley is OK.
Ashley XML Version 1.4-1, Ashley same behaviour on Windows and Linux, R version 2.4.1 (2006-12-18)
Re: [R] Download stock prices
Hi Matthew, Matthew == Matthew Keller [EMAIL PROTECTED] on Sun, 4 Feb 2007 23:06:46 -0500 writes: Matthew Hi Mihai, You might check out the Rmetrics bundle, Matthew available on the CRAN website. I've used its Matthew fBasics library it's the fBasics *package*; a "library" is something (actually more than one thing!) different! Matthew to download stock prices. Try the Matthew yahooImport() function and the keystats() function Matthew for downloading specific stock prices. I had to Matthew fiddle with the keystats function to get it to work Matthew properly, but I wrote to the writer of the library you mean the maintainer of the *package* Matthew and it may have been fixed by now. The function is (now) called keystatsImport() and is part of 'fCalendar' -- which is automatically required from package 'fBasics'. help(keystatsImport) contains several examples, unfortunately explicitly not available through example(), but I can successfully execute all of them -- and they do work, including the yahooImport() one. Martin Maechler, ETH Zurich Matthew Best of luck, Matthew Matt Matthew On 2/4/07, Mihai Nica [EMAIL PROTECTED] wrote: Greetings: Is there any way to download a (or a sample of a) cross-section of stock market prices? Or is it possible to use get.hist.quote with a *wild card*? Thanks, mihai Mihai Nica 170 East Griffith St. G5 Jackson, MS 39201 601-914-0361 Matthew -- Matthew C Keller Postdoctoral Fellow Virginia Matthew Institute for Psychiatric and Behavioral Genetics
Re: [R] can't find mailing list commands
Hmm, I thought that learning to read was pretty wide-spread in your country... Richard == Richard Müller [EMAIL PROTECTED] on Wed, 31 Jan 2007 10:01:34 +0100 writes: Richard Sorry for the inconvenience - but since I can't find Richard any hints on this list at stat.ethz.ch, I'll ask here: Richard How do I unsubscribe from this list? Richard __ Richard R-help@stat.math.ethz.ch mailing list Richard https://stat.ethz.ch/mailman/listinfo/r-help why do you think this does *not* mention 'unsubscribe' ??? Richard PLEASE do read the posting guide ... yes, indeed. Spamming 3000 readers of R-help with your problem is rather impolite. I apologize for doing the same with my answer, but maybe it prevents similar questions, at least for the very near future. Martin Maechler, ETH Zurich
Re: [R] %*% in Matrix objects
Jose == Jose Quesada [EMAIL PROTECTED] on Sat, 27 Jan 2007 23:42:34 +0100 writes: Jose Hi Martin, Thanks for your detailed answer. Jose x <- Matrix(1:12, 3,4, sparse = TRUE) I hope that you are aware of the fact that it's not efficient at all to store a dense matrix (it has *no* 0 entry) as a sparse one.. and your posting is indeed an incentive for the Matrix developers to improve that part ... ;-) Jose Yes, the toy example is not sparse but the actual data Jose is, and very large; I'm aware that coercing a dense Jose matrix into the sparse format is not leading to any Jose saving (on the contrary). I'm talking about a real Jose application with large sparse matrices; from now on, Jose I'll post small examples using sparse matrices as well Jose to avoid confusion. ok. Jose so I tried Jose x = matrix(1:12, 3,4) Jose x = as(x, "CsparseMatrix") Jose xnorms = sqrt(colSums(x^2)) Jose xnorms = as(xnorms, "CsparseMatrix") Jose (xnormed = t(x) * (1/xnorms)) Jose But now, instead of a warning I get Jose Error: Matrices must have same dimensions in t(x) * (1/xnorms) yes. And the same happens with traditional matrices -- and well so: For arithmetic with matrices (traditional or Matrices), A o B (o in {+, -, *, /, ^, ...}) does require that matrices A and B are ``conformable'', i.e., have exactly the same dimensions. Only when one of A or B is *not* a matrix are the usual S-language recycling rules applied, and that's what you were using in your first example (Matrix * numeric) above. Jose Right. So this means that the * operator is not Jose overloaded in Matrix (that is, if I use it, I'll get Jose my Matrix coerced to matrix). Is that correct? no. The * is overloaded (read on) Jose Does this mean that there is no easy way to do element-by-element Jose multiplication without leaving the sparse Matrix format? No. There is an easy way: If you multiply (or add or ..) two sparse matrices of matching dim(), the result will be sparse.
Also if you use a scalar (length-1 vector) with a Matrix, the result remains sparse (where appropriate):

(x <- Matrix(c(0,1,0,0), 3,3))
3 x 3 sparse Matrix of class "dtCMatrix"
[1,] . . .
[2,] 1 . .
[3,] . 1 .
Warning message: data length [4] is not a sub-multiple or multiple of the number of rows [3] in matrix

(2 * x) + t(x)
3 x 3 sparse Matrix of class "dgCMatrix"
[1,] . 1 .
[2,] 2 . 1
[3,] . 2 .

((2 * x) + t(x)) * t(x)
3 x 3 sparse Matrix of class "dgCMatrix"
[1,] . 1 .
[2,] . . 1
[3,] . . .

What you tried to do, sparse * vector-of-length > 1, will only result in a sparse matrix in the next version of the Matrix package. Jose I suspect I'm facing the drop=T issue as before... why?? Jose Because when I get a row out of a Matrix object, the Jose resulting vector is not of class Matrix but numeric, Jose and then (Matrix * numeric) is applied. Jose Last, I shouldn't consider myself the most standard Jose user of the Matrix package, since my linear algebra is Jose really basic. But in any case, you should know that Jose your package is being enormously useful for me. Keep Jose up the good work. And if I can help by posting my very Jose basic questions, I'm glad to help. Ok, thanks for the flowers :-) Martin
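Martin's point about element-wise arithmetic staying sparse can be checked directly; a small sketch with the Matrix package (the exact sparse class reported in the output may differ between Matrix versions):

```r
## Element-wise arithmetic on two sparse matrices of matching
## dimensions keeps the result in a sparse class.
library(Matrix)

a <- Matrix(c(0, 1, 0, 0, 0, 2, 0, 0, 3), 3, 3, sparse = TRUE)
b <- t(a)

s <- a * b + a          # element-wise ops, dims match: stays sparse
is(s, "sparseMatrix")   # TRUE
```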
Re: [R] %*% in Matrix objects
Jose == Jose Quesada [EMAIL PROTECTED] on Fri, 26 Jan 2007 05:24:12 +0100 writes: Jose Dear R users, Jose I need to normalize a bunch of row vectors. At a certain point I need to divide a matrix by a vector of norms. I find that the behavior of Matrix objects differs from normal matrix objects. You are showing evidence for that; it's still true, but will be less so in the next release of the Matrix package. Jose Example: the following code examples differ only in Jose xnormed changing from a normal to a Matrix object: Jose x = matrix(1:12,3,4) Jose x = as(x, "CsparseMatrix") or directly x <- Matrix(1:12, 3,4, sparse = TRUE) I hope that you are aware of the fact that it's not efficient at all to store a dense matrix (it has *no* 0 entry) as a sparse one.. Jose xnorms = sqrt(colSums(x^2)) Jose (xnormed = t(x) * (1/xnorms)) Jose This produces a warning: "coercing sparse to dense matrix for arithmetic" Jose in: t(x) * 1/xnorms, but gets the result (a 4 x 3 matrix). Jose I want to stay in sparse format anyway Jose (if it helps!) what should it help for? Are you talking about a real application with a proper sparse matrix as opposed to the toy example here? In that case I agree, and as a matter of fact, the source code of the Matrix package leading to the above warning is the following } else { ## FIXME: maybe far from optimal: warning("coercing sparse to dense matrix for arithmetic") callGeneric(as(e1, "dgeMatrix"), e2) } and your posting is indeed an incentive for the Matrix developers to improve that part ... ;-) Jose so I tried Jose x = matrix(1:12,3,4) Jose x = as(x, "CsparseMatrix") Jose xnorms = sqrt(colSums(x^2)) Jose xnorms = as(xnorms, "CsparseMatrix") Jose (xnormed = t(x) * (1/xnorms)) Jose But now, instead of a warning I get Jose Error: Matrices must have same dimensions in t(x) * (1/xnorms) yes.
And the same happens with traditional matrices -- and well so: For arithmetic with matrices (traditional or Matrices), A o B (o in {+, -, *, /, ^, ...}) does require that matrices A and B are ``conformable'', i.e., have exactly the same dimensions. Only when one of A or B is *not* a matrix are the usual S-language recycling rules applied, and that's what you were using in your first example (Matrix * numeric) above. Jose If I transpose the norms, the error disappears, but the result is 1 x 4 (not 3 x 4 as before). That would be a bug of *not* giving an error... but I can't see it. Can you please give an exact example -- as you well gave otherwise; and BTW, do you use a sensible sparse matrix such as (x <- Matrix(c(0,0,1,0), 3,4)) Jose I suspect I'm facing the drop=T issue as before... why?? Jose Also, it seems that in normal matrix objects %*% Jose behaves the same as *, not at all!! If that were the case, the inventors of the S language would never have introduced '%*%'! Jose but in Matrix objects that is not the case. Jose What am I missing? I guess several things, see above, notably how basic matrix and vector arithmetic is defined in the S language. Regards, Martin Maechler, ETH Zurich
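The normalization Jose is after can be written without relying on recycling at all; a small sketch with base matrices, where sweep() makes the intended direction of the division explicit:

```r
## Normalize the columns of a plain matrix to unit Euclidean norm.
x <- matrix(1:12, 3, 4)
xnorms <- sqrt(colSums(x^2))

## Divide each column by its norm; sweep() states the margin
## explicitly instead of depending on S recycling rules.
xnormed <- sweep(x, 2, xnorms, `/`)

colSums(xnormed^2)   # each column now has squared norm 1
```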
[R] Matrix subsetting {was ... vectorized nested loop...}
Hi Jose, I'm answering your second batch of questions, since Chuck Berry has already well done so with the first one. Jose == Jose Quesada [EMAIL PROTECTED] on Tue, 23 Jan 2007 21:46:27 +0100 writes: [] Jose # example Jose library(Matrix) Jose x = as(x, "CsparseMatrix") [..] Jose Also, I have noticed that getting a row from a Matrix Jose object produces a normal array (i.e., it does not Jose inherit Matrix class). This is very much on purpose, following the principle of least surprise, so I'm surprised you're surprised.. : The 'Matrix' behavior has been modelled to follow the more than 20 years old 'matrix' behavior:

matrix(1:9, 3)[,2]
[1] 4 5 6

matrix(1:9, 3)[,2, drop=FALSE]
     [,1]
[1,]    4
[2,]    5
[3,]    6

library(Matrix)
Loading required package: lattice

Matrix(1:9, 3)[,2]
[1] 4 5 6

Matrix(1:9, 3)[,2, drop = FALSE]
3 x 1 Matrix of class "dgeMatrix"
     [,1]
[1,]    4
[2,]    5
[3,]    6

But then I should not be surprised, because there has been the R FAQ 7.5 "Why do my matrices lose dimensions?" for quite a while. *And* I think that there is only one thing in the S language about which every knowledgeable one agrees that it's a design bug, and that's the fact that 'drop = TRUE' is the default, and not 'drop = FALSE' {but it's not possible to change now, please don't start that discussion!} Given what I say above, I wonder if our (new-style) 'Matrix' objects should not behave differently from (old-style) 'matrix' and indeed use a default 'drop = FALSE'. This might break some Matrix-based code though, but then 'Matrix' is young enough, and working Matrix indexing is much younger, and there are only about 4 CRAN/Bioconductor packages depending on 'Matrix'. -- This discussion (about changing this behavior in the Matrix package) should definitely be led on the R-devel mailing list -- CC'ing to R-devel {hence one (but please *only* one!) cross-post} Jose However, selecting more than one row Jose does produce a same-class matrix.
If I convert with Jose as() the output of selecting one row, am I losing Jose performance? Is there any way to make the resulting Jose vector be a 1-D Matrix object? Yes: use ', drop = FALSE', see above. Martin
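The advice above in runnable form; a minimal sketch (the exact sparse class of the extracted row may differ between Matrix versions):

```r
## Row extraction from a sparse Matrix: drop = FALSE keeps the
## result a (1 x n) Matrix instead of collapsing to a numeric vector.
library(Matrix)

x <- Matrix(c(0, 0, 1, 0, 2, 0), 2, 3, sparse = TRUE)

r1 <- x[1, ]                 # plain numeric vector -- Matrix class dropped
r2 <- x[1, , drop = FALSE]   # still a 1 x 3 Matrix

is.numeric(r1)       # TRUE
is(r2, "Matrix")     # TRUE
dim(r2)              # 1 3
```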
Re: [R] as.numeric(.1)
NOEL == NOEL Yvonnick [EMAIL PROTECTED] on Wed, 24 Jan 2007 17:29:14 +0100 writes: NOEL Hello, NOEL I noticed the following strange behavior under R-2.4.0 (Linux Mandriva NOEL 2007): options("OutDec") NOEL $OutDec NOEL [1] "." as.numeric(".1") NOEL [1] NA NOEL Warning message: NOEL NAs introduits lors de la conversion automatique [i.e., "NAs introduced by coercion"] as.numeric(",1") NOEL [1] 0,1 Oops! Should not happen, given your getOption("OutDec"). NOEL So I need to use the comma as the decimal separator, NOEL at least as input. Moreover, the last output also uses NOEL a comma, though the OutDec option was set to NOEL ".". Basic arithmetic ops on the command line work OK NOEL with decimal dots. NOEL I am pretty sure as.numeric(".1") used to work under older versions of NOEL R. Could it be a localization problem? maybe / probably. We cannot easily reproduce. Instead of the output below, can you please give the full sessionInfo() output? NOEL I would like to use the dot as the decimal separator NOEL both for input and output. Well definitely! (Using "," is craziness in my eyes!!) NOEL Any suggestion? NOEL Thank you very much in advance, NOEL Yvonnick Noel NOEL Dept of Psychology NOEL U. of Rennes NOEL France NOEL platform i686-pc-linux-gnu NOEL arch i686 NOEL os linux-gnu NOEL system i686, linux-gnu NOEL status NOEL major 2 NOEL minor 4.0 NOEL year 2006 NOEL month 10 NOEL day 03 NOEL svn rev 39566 NOEL language R
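The usual culprit for this symptom is the LC_NUMERIC locale category, which governs as.numeric() conversion, while options("OutDec") only affects printing. A minimal diagnostic sketch (R normally keeps LC_NUMERIC at "C", and locale names vary by platform):

```r
## Diagnose decimal-separator trouble: as.numeric() obeys the
## LC_NUMERIC locale, not options("OutDec").
Sys.getlocale("LC_NUMERIC")   # should be "C" in a normal R session

as.numeric(".1")              # 0.1 when LC_NUMERIC is "C"

## printing is what options(OutDec) controls:
old <- options(OutDec = ",")
print(0.1)                    # printed with a comma: 0,1
options(old)
```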
Re: [R] Robust PCA?
BertG == Bert Gunter [EMAIL PROTECTED] on Thu, 18 Jan 2007 15:28:47 -0800 writes: BertG You seem not to have received a reply. You can use BertG cov.rob in MASS or cov.Mcd in robustbase or BertG undoubtedly others to obtain a robust covariance BertG matrix and then use that for PCA. BertG Bert Gunter Nonclinical Statistics Indeed. Thank you Bert. BTW (for the archives), do note that there is an R special interest group (=: R-SIG) on robust statistics, and a mailing list R-SIG-robust (https://stat.ethz.ch/mailman/listinfo/r-sig-robust, also for archives) with precisely the goal of fostering coordinated programming and porting of robust statistics functionality in R. Expect to see more on this topic there, within the next few days. Martin Maechler, ETH Zurich -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Talbot Katz Sent: Thursday, January 18, 2007 11:44 AM To: r-help@stat.math.ethz.ch Subject: [R] Robust PCA? Hi. I'm checking into robust methods for principal components analysis. There seem to be several floating around. I'm currently focusing my attention on a method of Hubert, Rousseeuw, and Vanden Branden (http://wis.kuleuven.be/stat/Papers/robpca.pdf), mainly because I'm familiar with other work by Rousseeuw and Hubert in robust methodologies. Of course, I'd like to obtain code for this method, or another good robust PCA method, if there's one out there. I haven't noticed the existence on CRAN of a package for robust PCA (the authors of the ROBPCA method do provide MATLAB code). -- TMK -- 212-460-5430 home 917-656-5351 cell
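Bert's suggestion in code form; a minimal sketch using MASS::cov.rob() and princomp()'s covmat argument (the built-in stackloss data is just a convenient example, not from the thread):

```r
## Robust PCA via a robust covariance estimate:
## feed the MCD covariance matrix to princomp(covmat = ...).
library(MASS)

rc <- cov.rob(stackloss, method = "mcd")   # robust location/scatter
pc <- princomp(covmat = rc$cov)            # PCA on the robust covariance

summary(pc)      # variance explained by each robust component
loadings(pc)
```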
Re: [R] Where can I get the latex format manual?
ronggui == ronggui [EMAIL PROTECTED] on Sat, 20 Jan 2007 00:26:23 +0800 writes: ronggui On www.r-project.org I can find html and pdf formats, and the ronggui R-x.x-x/doc/manual has texinfo format only. ronggui I cannot find a latex format manual. Am I missing something? Thanks for your hints. Hint: Apart from the Reference manual (all the help pages), there is none, so it's understandable you can't find it.. Texinfo is a (very simple) dialect of TeX; LaTeX is a different (much more sophisticated) dialect. BTW, I still read (and search!) these manuals in the *info* format via Emacs [C-h i ..], which (for me) is faster than anything else.. Martin Maechler, ETH Zurich
Re: [R] Problem with C extension
Markus == Markus Schmidberger [EMAIL PROTECTED] on Fri, 19 Jan 2007 15:02:31 +0100 writes: Markus Hello, Markus I am trying to write an extension in C, to get faster functions. Markus Therefore I have to add an element (vector) to a vector. The command in Markus R is very simple: x = c(x,a) aka x <- c(x,a) See, that's why you'd probably rather stick with R a bit longer, and profile [see help(Rprof)] your code and try to speed it up quite a bit before thinking about using C ... Markus But in C I have the problem of reallocating my vector to get more Markus space. Everything I tried, I get a segmentation fault. Markus So, how can I combine two vectors in C and give the result back to R Markus (return(x))? And you have a copy of "Writing R Extensions" right in front of you, electronically at least, I mean? To return the new vector via C's return() you'd definitely need to work with .Call() {which is a good thing, but really not for C beginners}, and that needs a bit of time reading the above manual from cover to cover. Note that you probably should start with (5.8) on .Call. I'm also tending to recommend even starting to peek into R's own C source, notably the header files (src/include/...), and maybe src/main/* in order to see how R objects (SEXPs) are handled, how you add names(), dimnames() etc. internally.. Hoping that helps, Martin Maechler, ETH Zurich Markus Thanks Markus Markus Schmidberger Markus -- Markus Dipl.-Tech. Math. Markus Schmidberger
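Martin's profile-first advice pays off for exactly this pattern: much of the cost is usually in growing the vector itself, which can be fixed in pure R. A minimal sketch contrasting growth via c() with pre-allocation (timings will vary by machine):

```r
## Growing a vector with x <- c(x, a) copies it on every step;
## pre-allocating and filling by index avoids the quadratic cost.
n <- 1e4

grow <- function(n) {
    x <- numeric(0)
    for (i in seq_len(n)) x <- c(x, i)   # repeated reallocation
    x
}

fill <- function(n) {
    x <- numeric(n)                      # allocate once
    for (i in seq_len(n)) x[i] <- i
    x
}

stopifnot(identical(grow(n), fill(n)))   # same result either way
system.time(grow(n))   # noticeably slower ...
system.time(fill(n))   # ... than this
```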