Re: [Rd] Suggestion: help()

2005-06-07 Thread Martin Maechler
> "Duncan" == Duncan Murdoch <[EMAIL PROTECTED]>
> on Tue, 07 Jun 2005 12:12:57 -0400 writes:

  .

>>> The current .Rd files don't just document functions, they also document 
>>> data objects and classes.
>>> 
>>> But the main point here is that it's not good to have multiple 
>>> disconnected sets of documentation for a package.  Users should be able 
>>> to say the equivalent of "give me help on foo", and get help on foo, 
>>> whether it's a function, a data object, a package, a method, a class, or 
>>> whatever.  It's a bad design to force them to ask for the same sort of 
>>> thing in different ways depending on the type of thing they're asking for.
... On 6/7/2005 11:59 AM, Robert Gentleman wrote:

>> 
>> Hi Duncan and others,
>> I think they are linked. There are tools available both in R and in 
>> Bioconductor and some pop things up and some don't. It doesn't take much 
>> work to add vignettes to the windows menu bar - as we have done in BioC 
>> for some time now - it would be nice if this was part of R, but no one 
>> seems to have been interested in achieving that. Fixing the help system 
>> to deal with more diverse kinds of help would be nice as well - but 
>> taking one part of it and saying, "now everyone must do it this way" is 
>> not that helpful.

>> I respectfully disagree about the main point. My main point is, I 
>> don't want more things imposed on me; dealing with  R CMD check is 
>> enough of a burden in its current version, without someone deciding that 
>> it would be nice to have a whole bunch more requirements. Folks should 
>> feel entirely free to do what they want - but a little less free to tell 
>> me what I should be doing.

Duncan> And I disagree pretty strenuously about that.  One
Duncan> of the strengths of R is that it does impose
Duncan> standards on contributed packages, and these make
Duncan> them easier to use, less likely to conflict with
Duncan> each other, and so on.

Duncan> We shouldn't impose things lightly, but if they do
Duncan> make packages better, we should feel no reason not
Duncan> to tell you what you should be doing.

As Kurt mentioned early in this thread, we currently have
the auto-generated information from
either

help(package = )# or (equivalently!)
library(help = )

which shows  
  DESCRIPTION + 
  (user-written/auto-generated) INDEX +
  mentions vignettes and other contents in inst/doc/

Now if Duncan would write some R code that produces a   man/.Rd
file from the above information -- and, as he mentioned, also
added some of that functionality to package.skeleton() --
I think everyone could become "happy", i.e.,
we could improve the system in the future with only a very light
burden on the maintainers of currently existing packages: you'd
have to run the new R function only once for every package you
maintain.

Also, the use of a user-written INDEX file could eventually be
abandoned completely in favor of maintaining
man/.Rd, which is much nicer;
I'd welcome such a direction quite a bit.

And as much as I do like (and read) the vignettes that are
available, I also agree that writing one more *.Rd file is
easier for many new package authors than writing a
vignette -- the package author already had to learn *.Rd syntax
anyway -- and it's nice to be able to produce something where
hyperlinks to the other existing reference material (i.e., help
pages) just work out of the box.
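For concreteness, such a package-overview *.Rd page might look roughly as follows; this is only a sketch (the package name 'mypkg' and the linked topic 'foo' are made up, and the exact conventions were still under discussion at the time):

```rd
\name{mypkg-package}
\alias{mypkg-package}
\docType{package}
\title{A One-Line Overview of the (Hypothetical) mypkg Package}
\description{
  A short overview of what the package does, with hyperlinks to the
  key help pages, e.g. \code{\link{foo}}, and pointers to any
  vignettes shipped in \file{inst/doc}.
}
\keyword{package}
```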

OTOH, we should still keep in mind that it's worth trying to
get bi-directional linking between (PDF) vignettes and help
files (assuming all relevant files are installed by R CMD
INSTALL, of course).

Martin

Duncan> Currently R has 3 types of help: the .Rd files in
Duncan> the man directory (which are converted into plain
Duncan> text, HTML, compiled HTML, LaTex, DVI, PDF, etc),
Duncan> the vignettes, and unstructured files in inst/doc.
Duncan> We currently require .Rd files for every function
Duncan> and data object.  Adding a requirement to also
Duncan> document the package that way is not all that much
Duncan> of a burden, and will automatically give all those
Duncan> output formats I listed above.  It will help to
Duncan> solve the often-complained about problem of packages
Duncan> that contain no overview at all.  (Requiring a
Duncan> vignette and giving a way to display it would also
Duncan> do that, but I think requiring a .Rd file is less of
Duncan> a burden, and for anyone who has gone to the trouble
Duncan> of creating a vignette, gives a natural place for a
Duncan> link to it.  Vignettes aren't used as much as they
Duncan> should be, because they are hidden away where users
Duncan> don't see them.)

Duncan> Duncan Murdoch

>> 
>> Best wishes,
>> Robert
>> 
>> 

Re: [Rd] Re: [R] p-value > 1 in fisher.test()

2005-06-04 Thread Martin Maechler
>>>>> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
>>>>> on Sat, 04 Jun 2005 11:43:34 +0200 writes:

UweL> (Ted Harding) wrote:
>> On 03-Jun-05 Ted Harding wrote:
>> 
>>> And on mine
>>> 
>>> (A: PII, Red Hat 9, R-1.8.0):
>>> 
>>> ff <- c(0,10,250,5000); dim(ff) <- c(2,2);
>>> 
>>> 1-fisher.test(ff)$p.value
>>> [1] 1.268219e-11
>>> 
>>> (B: PIII, SuSE 7.2, R-2.1.0beta):
>>> 
>>> ff <- c(0,10,250,5000); dim(ff) <- c(2,2);
>>> 
>>> 1-fisher.test(ff)$p.value
>>> [1] -1.384892e-12
>> 
>> 
>> I have a suggestion (maybe it should also go to R-devel).
>> 
>> There are many functions in R whose designated purpose is
>> to return the value of a probability (or a probability
>> density). This designated purpose is in the mind of the
>> person who has coded the function, and is implicit in its
>> usage.
>> 
>> Therefore I suggest that every such function should have
>> a built-in internal check that no probability should be
>> less than 0 (and if the primary computation yields such
>> a value then the function should set it exactly to zero),
>> and should not exceed 1 (in which case the function should
>> set it exactly to 1). [And, in view of recent exchanges,
>> I would suggest exactly +0, not -0!]
>> 
>> Similar for any attempts to return a negative probability
>> density; while of course a positive value can be allowed
>> to be anything.
>> 
>> All probabilities would then be guaranteed to be "clean"
>> and issues like the Fisher exact test above would no longer
>> be even a tiny problem.
>> 
>> Implementing this in the possibly many cases where it is
>> not already present is no doubt a long-term (and tedious)
>> project.
>> 
>> Meanwhile, people who encounter problems due to its absence
>> can carry out their own checks and adjustments!

UweL> [moved to R-devel]

UweL> Ted, my (naive?) objection:
UweL> Many errors in the underlying code have been detected by a function
UweL> returning a nonsensical value, but if the probability is silently
UweL> set to 0 or 1 ...
UweL> Hence I would agree to do so in special cases where it makes sense
UweL> because of numerical issues, but please not globally.

I agree very much with Uwe's point.

Further to fisher.test(): this whole thread is
re-hashing a fairly recent bug report on fisher.test()
{ "negative p-values from fisher's test (PR#7801)", April '05 }.
I think that it is only *because* of the obviously wrong P-values
that we found and confirmed that the refereed and published code
underlying fisher.test() is bogus.  Such knowledge would have been
much harder to gain if the P-values had been clipped into [0,1].
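To make the point concrete, Ted's clamping proposal amounts to something like the following sketch (the helper name clampProb is hypothetical):

```r
## Hypothetical helper: clamp a computed probability into [0, 1],
## returning exactly +0 / 1 at the ends, as Ted suggested.
clampProb <- function(p) pmin(1, pmax(0, p))

## Applied to the fisher.test() value above, the impossible negative
## p-value that revealed the bug would be silently replaced by 0:
clampProb(-1.384892e-12)  # exactly 0 -- and the bug stays hidden
```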

Martin Maechler

UweL> Uwe Ligges

>> Best wishes to all,
>> Ted.
>> 
>> 
>> 
>> E-Mail: (Ted Harding) <[EMAIL PROTECTED]>
>> Fax-to-email: +44 (0)870 094 0861
>> Date: 04-Jun-05   Time: 00:02:32
>> -- XFMail --

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] 1/tan(-0) != 1/tan(0)

2005-06-01 Thread Martin Maechler
Testing the code that Morten Welinder suggested for improving the
extreme tail behavior of  qcauchy(),
I found what you can read in the subject line:
with the tan() + floating-point implementation on all
four versions of Redhat Linux I have access to (on i686 and amd64
architectures),

> 1/tan(c(-0,0))

gives

-Inf  Inf

And of course, that can well be considered a feature, since
after all, the tan() function does jump from -Inf to +Inf at 0.
I was still surprised that this even happens at the R level,
and I wonder if this distinction between "-0" and "0" shouldn't be
mentioned in some place(s) in the R documentation.
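A small session fragment illustrating the point (assuming an IEEE 754 platform, as on the Linux machines mentioned above):

```r
z <- c(-0, 0)  # the parser produces a genuine IEEE negative zero for -0
z == 0 # TRUE TRUE : the two zeros compare equal ...
1/z# -Inf  Inf : ... but division exposes their signs
1/tan(z)   # -Inf  Inf : tan() preserves the sign of zero
```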

For the real problem, in the R source (in C), it's simple
to work around the fact that
qcauchy(0, log=TRUE)
with Morten's proposed code gives -Inf instead of +Inf.

Martin


>>>>> "MM" == Martin Maechler <[EMAIL PROTECTED]>
>>>>> on Wed,  1 Jun 2005 08:57:18 +0200 (CEST) writes:

>>>>> "Morten" == Morten Welinder <[EMAIL PROTECTED]>
>>>>> on Fri, 27 May 2005 20:24:36 +0200 (CEST) writes:

  .

Morten> Now that pcauchy has been fixed, it is becoming
Morten> clear that qcauchy suffers from the same problems.

Morten> 
Morten> qcauchy(pcauchy(1e100,0,1,FALSE,TRUE),0,1,FALSE,TRUE)

Morten> should yield 1e100 back, but I get 1.633178e+16.
Morten> The code below does much better.  Notes:

Morten> 1. p need not be finite.  -Inf is ok in the log_p
Morten> case and R_Q_P01_check already checks things.

MM> yes

Morten> 2. No need to disallow scale=0 and infinite
Morten> location.

MM> yes

Morten> 3. The code below uses isnan and finite directly.
Morten> It needs to be adapted to the R way of doing that.

MM> I've done this, and started testing the new code; a version will
MM> be put into the next version of R.

MM> Thank you for the suggestions.

>>> double
>>> qcauchy (double p, double location, double scale, int lower_tail, int log_p)
>>> {
>>> if (isnan(p) || isnan(location) || isnan(scale))
>>> return p + location + scale;

>>> R_Q_P01_check(p);
>>> if (scale < 0 || !finite(scale)) ML_ERR_return_NAN;

>>> if (log_p) {
>>> if (p > -1)
>>> lower_tail = !lower_tail, p = -expm1 (p);
>>> else
>>> p = exp (p);
>>> }
>>> if (lower_tail) scale = -scale;
>>> return location + scale / tan(M_PI * p);
>>> }



Re: [Rd] Rout for library/base/R-ex/Extract.data.frame.R

2005-05-25 Thread Martin Maechler
>>>>> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
>>>>> on Wed, 25 May 2005 11:08:18 +0200 writes:

UweL> Vadim Ogranovich wrote:
>> Hi,
>> 
>> I am writing a light-weight data frame class and want to
>> borrow the test cases from the standard data frame. I
>> found the test cases in
>> library/base/R-ex/Extract.data.frame.R, but surprisingly
>> no corresponding .Rout files. In fact there is no *.Rout
>> file in the entire tarball. Not that I can't generate
>> them, but I am just curious why they are not there? How
>> does the base package get tested?
>> 
>> Thanks, Vadim

UweL> The base packages have their test cases in ...R/tests
UweL> rather than R/src/library/packagename

yes, and the *examples* from the help pages are just run, and
not compared to prespecified output in *.Rout.save (sic!) files.
In an *installed* (not the source!) version of R or an R package 
you find the R code for all the examples from the help pages
in /R-ex/*.R.   
That's the same for all R packages, not just the standard
packages.
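For example, the extracted example code can be executed from within R via example(); a sketch (using the help topic mentioned above):

```r
## Runs the examples from the "Extract.data.frame" help page -- the same
## code that is extracted into R-ex/Extract.data.frame.R on installation:
example("[.data.frame")
```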

Martin Maechler



Re: [Rd] R-exts.texi: nuke-trailing-whitespace has changed name (PR#7888)

2005-05-25 Thread Martin Maechler
Thank you,
Bjørn-Helge,

> "BjøHM" == Bjørn-Helge Mevik <[EMAIL PROTECTED]>
> on Sun, 22 May 2005 22:56:49 +0200 (CEST) writes:

  ..

BjøHM> In Appendix B R coding standards of the Writing R
BjøHM> Extensions manual, Emacs/ESS users are encouraged to
BjøHM> use

BjøHM> ..

BjøHM> However, as of ess-5.2.4 (current is 5.2.8)
BjøHM> `nuke-trailing-whitespace' has changed name to
BjøHM> `ess-nuke-trailing-whitespace'.

BjøHM> In addition: by default, ess-nuke-trailing-whitespace
BjøHM> is a no-op (as was nuke-trailing-whitespace).  To
BjøHM> `activate' it one must set
BjøHM> ess-nuke-trailing-whitespace-p to t or 'ask (default
BjøHM> is nil), e.g.  (setq ess-nuke-trailing-whitespace-p t)

Thank you. I've now changed it to

-

.
.

(add-hook 'local-write-file-hooks
  (lambda ()
(ess-nuke-trailing-whitespace)))
(setq ess-nuke-trailing-whitespace-p 'ask)
;; or even
;; (setq ess-nuke-trailing-whitespace-p t)



Re: Fwd: Re: [Rd] Implementation of the names attribute of attribute lists

2005-05-11 Thread Martin Maechler
>>>>> "Gabriel" == Gabriel Baud-Bovy <[EMAIL PROTECTED]>
>>>>> on Tue, 10 May 2005 19:00:53 +0200 writes:

Gabriel> Hi Martin,
Gabriel> Thanks for your reply. I am responding on r-devel to
Gabriel> provide some examples of output from the function that
Gabriel> I had listed in the postscript of my previous
Gabriel> email (BTW, did my post go through to the list? I
Gabriel> subscribed only after mailing it).

Gabriel> You wrote:

>> Just to ask the obvious:
>> 
>> Why is using  str() not sufficient for you and instead,
>> you use  'print.object' {not a good name, BTW, since it looks like a
>> print() S3 method but isn't one} ?

Gabriel> Would printObject or printSEXP be a better name?

Definitely better, because it doesn't interfere with the S3 pseudo-OO
convention...  Still not to my taste, though:
  every R object is an object (:-) -- and a SEXP internally --
  and we don't use 'fooObject' for other function names even
  though their arguments are R objects.
My taste would rather lead to something like
 'displayStructure' (or 'dissectInternal' ;-) or a shorter
version of those.

>> The very few cases I found it was insufficient,
>> certainly  dput()  was, possibly even using it as
>> dput(. , control = ).

Gabriel> As I wrote in my email, I might have reinvented
Gabriel> the wheel. I did not know str! 

(amazingly ... ;-)

Gabriel> The output of str and print.object is quite similar
Gabriel> for atomic and list objects. I might look at this
Gabriel> function to change the argument names of the
Gabriel> print.object function.

Gabriel> However, the output of str is quite different
Gabriel> for language expressions and does not show their
Gabriel> list-like structure as well, since it respects
Gabriel> the superficial C-like syntax of the R language
Gabriel> (at the textual level).

Ok, thanks for clarifying this aspect, and the difference to
both str() and dput() here.
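A small sketch of that difference: for a language object, str() and dput() both follow the surface syntax, while the list-like structure Gabriel refers to shows up via as.list():

```r
e <- quote(f(x, y + 1))  # a call object
dput(e)# prints the surface syntax:  f(x, y + 1)
str(e) # " language f(x, y + 1)" -- also syntax-oriented
as.list(e) # the call dissected: the function name, then each argument
```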

< much omitted ...>

Martin Maechler,
ETH Zurich



Re: [Rd] Re: [R] Unbundling gregmisc (was: loading gap package)

2005-05-04 Thread Martin Maechler
> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]>
> on Wed, 4 May 2005 16:29:33 +0100 (BST) writes:

...


BDR> .. we need some education about how to use the
BDR> power of *.packages (and we need to get the MacOS
BDR> versions in place).

and maybe we should even consider adding regression tests for
the *.packages() functions, so that the chances increase that they
will work on all platforms.

Martin



Re: [Rd] Enhanced version of plot.lm()

2005-04-27 Thread Martin Maechler
>>>>> "PD" == Peter Dalgaard <[EMAIL PROTECTED]>
>>>>> on 27 Apr 2005 16:54:02 +0200 writes:

PD> Martin Maechler <[EMAIL PROTECTED]> writes:
>> I'm about to commit the current proposal(s) to R-devel,
>> **INCLUDING** changing the default from 
>> 'which = 1:4' to 'which = c(1:3, 5)'
>> 
>> and elicit feedback starting from there.
>> 
>> One thing I think I would like is to use color for the Cook's
>> contours in the new 4th plot.

PD> Hmm. First try running example(plot.lm) with the modified function and
PD> tell me which observation has the largest Cook's D. With the suggested
PD> new 4th plot it is very hard to tell whether obs #49 is potentially or
PD> actually influential. Plots #1 and #3 are very close to conveying the
PD> same information though...

I shouldn't be teaching here, and I know that I'm getting into contested
territory (regression diagnostics; robustness; "The" Truth, etc., etc.),
but I believe there is no unique way to define "actually influential"
(hence I don't believe that it's extremely useful to know
exactly which Cook's D is largest).

Partly because there are many statistics that can be derived from a
multiple regression fit, all of which are influenced in some way.
AFAIK, all observation-influence measures g(i) are functions of
(r_i, h_{ii}), and the latter are the quantities that "regression
users" should really know {without consulting a text book}; they are
also generalizable {e.g., to "linear smoothers" such as
gam()s (for a "non-estimated" smoothing parameter)}.
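For lm() fits this can be made concrete; e.g., Cook's distance is exactly such a function of (r_i, h_ii).  A sketch, using the LifeCycleSavings fit from example(plot.lm):

```r
fit <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)
h <- hatvalues(fit)  # the leverages h_{ii}
r <- rstandard(fit)  # the standardized residuals r_i
p <- length(coef(fit))
## Cook's distance:  D_i = r_i^2 * h_ii / (p * (1 - h_ii))
all.equal(cooks.distance(fit), r^2 * h / (p * (1 - h)))  # TRUE
```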

Martin



Re: [Rd] Enhanced version of plot.lm()

2005-04-27 Thread Martin Maechler
>>>>> "MM" == Martin Maechler <[EMAIL PROTECTED]>
>>>>> on Tue, 26 Apr 2005 12:13:38 +0200 writes:

>>>>> "JMd" == John Maindonald <[EMAIL PROTECTED]>
>>>>> on Tue, 26 Apr 2005 15:44:26 +1000 writes:

JMd> The web page http://wwwmaths.anu.edu.au/~johnm/r/plot-lm/
JMd> now includes files:
JMd> plot.lm.RData: Image file for plot6.lm, a version of plot.lm
JMd> in which David Firth's Cook's distance vs leverage/(1-leverage)
JMd> plot is plot 6.
JMd> The tick labels are in units of leverage, and the contour labels are
JMd> in units of absolute values of the standardized residual.

JMd> plot6.lm.Rd file: A matching help file

JMd> Comments will be welcome.

MM> Thank you John!

MM> The *.Rd has the new references and a new example but
MM> is not quite complete: the \usage{} has only 4 captions,
MM> \arguments{ .. \item{which} ..}  only mentions '1:5' --- but
MM> never mind.

MM> One of the new examples is

MM> ## Replace Cook's distance plot by Residual-Leverage plot
MM> plot(lm.SR, which=c(1:3, 5))

MM> and -- conceptually I'd really like to change the default from
MM> 'which = 1:4' to the above
MM> 'which = c(1:3, 5)'

MM> This would be non-compatible though for all those that have
MM> always used the current default 1:4. 
MM> OTOH, "MASS" or Peter Dalgaard's book don't mention  plot( )
MM> or at least don't show its result.

MM> What do others think?
MM> How problematic would a change be in the default plots that
MM> plot.lm() produces?


JMd> Another issue, discussed recently on r-help, is that when the model
JMd> formula is long, the default sub.caption=deparse(x$call) is broken
JMd> into multiple text elements and overwrites.  
MM> good point!

JMd> The only clean and simple way that I can see to handle
JMd> this is to set a default that tests whether the formula is
JMd> broken into multiple text elements, and if it is then
JMd> omit it.  Users can then use their own imaginative
JMd> skills, and such suggestions as have been made on
JMd> r-help, to construct whatever form of labeling best
JMd> suits their case, their imaginative skills and their
JMd> coding skills.

MM> Hmm, yes, but I think we (R programmers) could try a bit harder
MM> to provide a reasonable default, e.g., something along
 
MM> cap <- deparse(x$call, width.cutoff = 500)[1]
MM> if((nc <- nchar(cap)) > 53) 
MM>   cap <- paste(substr(cap, 1, 50), "", substr(cap, nc-2, nc))

MM> {untested;  some of the details will differ;
MM> and the '53', '50' could depend on par("..") measures}

In the mean time, I came to quite a nice way of doing this:

if(is.null(sub.caption)) { ## construct a default:
cal <- x$call
if (!is.na(m.f <- match("formula", names(cal)))) {
cal <- cal[c(1, m.f)]
names(cal)[2] <- "" # drop  " formula = "
}
cc <- deparse(cal, 80)
nc <- nchar(cc[1])
abbr <- length(cc) > 1 || nc > 75
sub.caption <-
if(abbr) paste(substr(cc[1], 1, min(75,nc)), "...") else cc[1]
}


I'm about to commit the current proposal(s) to R-devel,
**INCLUDING** changing the default from
  'which = 1:4' to 'which = c(1:3, 5)'

and elicit feedback starting from there.

One thing I think I would like is to use color for the Cook's
contours in the new 4th plot.

Martin


<.. lots deleted ..>



[Rd] smooth.spline(): residuals(), fitted(),...

2005-04-27 Thread Martin Maechler
It has bothered me for quite some time that a smoothing spline
fit doesn't allow access to residuals or fitted values in
general, since after
  fit <- smooth.spline(x,y, *)

the resulting fit$x is really equal to the unique (up to 1e-6
precision) sorted original x values, and fit$yin (and $y) match accordingly.

There are several possible ways to implement the missing
feature.  My current implementation would add a new argument
'keep.data' which when set to TRUE would make sure that the
original (x, y, w) are kept such that fitted values and (weighted
or unweighted) residuals are sensibly available from the result.

My main RFC (:= request for comments) is about the
acceptance of the new behavior to become the *default*
(i.e. 'keep.data = TRUE' would be default) such that by default
residuals(smooth.spline(...)) will work.

The drawback of the new default behavior would be that
potentially a 'fit' can become quite a bit larger than previously, e.g.
in the following extremely artificial example

  x0 <- seq(0,1, by = 0.1)
  x <- sort(sample(x0, 1000, replace = TRUE))
  ff <- function(x) 10*(x-1/4)^2 + sin(7*pi*x)
  y <- ff(x) + rnorm(x) / 2
  fit <- smooth.spline(x,y)

but typically the size increase will be less than about 40%.
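Under the proposed default (keep.data = TRUE; still only a suggestion at this point), the usual accessors would then just work on the fit above:

```r
## With the proposed 'keep.data = TRUE', the original (x, y, w) are
## kept in the result, so that e.g.:
fit <- smooth.spline(x, y, keep.data = TRUE)
r <- residuals(fit)# y minus the fit evaluated at the original x
length(r) == length(y) # one residual per data point, not per unique x
```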

Comments are welcome.

Martin Maechler, ETH Zurich



Re: [Rd] Enhanced version of plot.lm()

2005-04-26 Thread Martin Maechler
> "JMd" == John Maindonald <[EMAIL PROTECTED]>
> on Tue, 26 Apr 2005 15:44:26 +1000 writes:

JMd> The web page http://wwwmaths.anu.edu.au/~johnm/r/plot-lm/
JMd> now includes files:
JMd> plot.lm.RData: Image file for plot6.lm, a version of plot.lm
JMd> in which David Firth's Cook's distance vs leverage/(1-leverage)
JMd> plot is plot 6.
JMd> The tick labels are in units of leverage, and the contour labels are
JMd> in units of absolute values of the standardized residual.

JMd> plot6.lm.Rd file: A matching help file

JMd> Comments will be welcome.

Thank you John!

The *.Rd has the new references and a new example but
is not quite complete: the \usage{} has only 4 captions,
\arguments{ .. \item{which} ..}  only mentions '1:5' --- but
never mind.

One of the new examples is

## Replace Cook's distance plot by Residual-Leverage plot
plot(lm.SR, which=c(1:3, 5))

and -- conceptually I'd really like to change the default from
'which = 1:4' to the above
'which = c(1:3, 5)'

This would be non-compatible though for all those that have
always used the current default 1:4. 
OTOH, "MASS" or Peter Dalgaard's book don't mention  plot( )
or at least don't show its result.

What do others think?
How problematic would a change be in the default plots that
plot.lm() produces?


JMd> Another issue, discussed recently on r-help, is that when the model
JMd> formula is long, the default sub.caption=deparse(x$call) is broken
JMd> into multiple text elements and overwrites.  
good point!

JMd> The only clean and simple way that I can see to handle
JMd> this is to set a default that tests whether the formula is
JMd> broken into multiple text elements, and if it is then
JMd> omit it.  Users can then use their own imaginative
JMd> skills, and such suggestions as have been made on
JMd> r-help, to construct whatever form of labeling best
JMd> suits their case, their imaginative skills and their
JMd> coding skills.

Hmm, yes, but I think we (R programmers) could try a bit harder
to provide a reasonable default, e.g., something along
 
 cap <- deparse(x$call, width.cutoff = 500)[1]
 if((nc <- nchar(cap)) > 53)
 cap <- paste(substr(cap, 1, 50), "", substr(cap, nc-2, nc))

{untested;  some of the details will differ;
 and the '53', '50' could depend on par("..") measures}


JMd> John Maindonald.


JMd> On 25 Apr 2005, at 8:00 PM, David Firth wrote:

>> From: David Firth <[EMAIL PROTECTED]>
>> Date: 24 April 2005 10:23:51 PM
>> To: John Maindonald <[EMAIL PROTECTED]>
>> Cc: r-devel@stat.math.ethz.ch
>> Subject: Re: [Rd] Enhanced version of plot.lm()
>> 
>> 
>> On 24 Apr 2005, at 05:37, John Maindonald wrote:
>> 
>>> I'd not like to lose the signs of the residuals. Also, as
>>> plots 1-3 focus on residuals, there is less of a mental
>>> leap in moving to residuals vs leverage; residuals vs
>>> leverage/(1-leverage) would also be in the same spirit.
>> 
>> Yes, I know what you mean.  Mental leaps are a matter of 
>> taste...pitfalls, etc, come to mind.
>> 
>>> 
>>> Maybe, one way or another, both plots (residuals vs
>>> a function of leverage, and the plot from Hinkley et al)
>>> should go in.  The easiest way to do this is to add a
>>> further which=6.  I will do this if the consensus is that
>>> this is the right way to go.  In any case, I'll add the
>>> Hinkley et al reference (author of the contribution that
>>> includes p.74?) to the draft help page.
>> 
>> Sorry, I should have given the full reference, which (in BibTeX
>> format from CIS) is
>> 
>> @inproceedings{Firt:gene:1991,
>> author = {Firth, D.},
>> title = {Generalized Linear Models},
>> year = {1991},
>> booktitle = {Statistical Theory and Modelling. In Honour of Sir 
>> David Cox, FRS},
>> editor = {Hinkley, D. V. and Reid, N. and Snell, E. J.},
>> publisher = {Chapman \& Hall Ltd},
>> pages = {55--82},
>> keywords = {Analysis of deviance; Likelihood}
>> }
>> 
>> David
>> 
JMd> John Maindonald email: [EMAIL PROTECTED]
JMd> phone : +61 2 (6125)3473fax  : +61 2(6125)5549
JMd> Centre for Bioinformation Science, Room 1194,
JMd> John Dedman Mathematical Sciences Building (Building 27)
JMd> Australian National University, Canberra ACT 0200.



Re: [Rd] Speeding up library loading

2005-04-25 Thread Martin Maechler
> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
> on Mon, 25 Apr 2005 18:51:50 +0200 writes:

UweL> Ali - wrote:
>> (1) When R tries to load a library, does it load 'everything' in the 
>> library at once?

UweL> No, see ?lazyLoad

Are you sure Ali is talking about *package*s?
He did use the word "library", though, and most of us (including
Uwe!) know the difference...

>> (2) Is there any options to 'load as you go'?

UweL> Well, this is the way R does it

for packages yes, because of lazyloading, as Uwe mentioned above.

For libraries, (you know: the things you get from compiling and
linking C code ..), it may be a bit different.

What do you really mean, packages or libraries,
Ali?



Re: [Rd] Overloading methods in R

2005-04-20 Thread Martin Maechler
>>>>> "Ali" == Ali - <[EMAIL PROTECTED]>
>>>>> on Wed, 20 Apr 2005 15:45:09 + writes:

Ali> Thanks a lot Tony. I am trying to apply the overloading
Ali> to the methods created by R.oo package and,
Ali> unfortunately, R.oo uses S3-style classes; so I cannot
Ali> use the features of S4 methods as you described. On the
Ali> other hand, I couldn't find a decent OO package which
Ali> is based on S4 AND comes with the official release of
Ali> R.

Ali, maybe we R-core members are not decent enough.
But we strongly believe that we don't want to advocate yet
another object system in addition to the S3 and S4 ones,
and several of us have given talks and classes, even written
books, on how to do "decent" object-oriented programming
`just' with the S3 and/or S4 object system.

No need for an additional "oo" in our eyes.
Your main problem is that you assume what "oo" means {which may
well be true}, but *additionally* you also assume that OO has to
be done the same way you know it from Python, C++, or Java.

Since you are new, please try to learn the S4 way,
where methods belong to (generic) functions more than
to classes, particularly if you compare with other
OO systems where methods belong entirely to classes.
This is NOT true for R (and S-PLUS), and we don't want this to
change {and yes, we do know about C++, Python, Java, ... and
their way of doing OO}.
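A minimal S4 sketch of that point (the class and generic names here are made up): the method is registered with the generic function, not stored inside the class definition:

```r
setClass("Account", representation(balance = "numeric"))

## The generic function owns the dispatch; the class only describes data:
setGeneric("balanceOf", function(object) standardGeneric("balanceOf"))
setMethod("balanceOf", "Account", function(object) object@balance)

balanceOf(new("Account", balance = 10))  # dispatches on the class: 10
```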

Please also read in more details the good advice given by Tony
Plate and Sean Davis.

Martin Maechler,
ETH Zurich



Re: [Rd] RFC: hexadecimal constants and decimal points

2005-04-18 Thread Martin Maechler
>>>>> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]>
>>>>> on Sun, 17 Apr 2005 12:38:10 +0100 (BST) writes:

BDR> These are some points stimulated by reading about C history (and 
BDR> related in their implementation).

<.>


BDR> 2) R does not have integer constants.  It would be
BDR> convenient if it did, and I can see no difficulty in
BDR> allowing the same conversions when parsing as when
BDR> coercing.  This would have the side effect that 100
BDR> would be integer (but the coercion rules would come
BDR> into play) but 20 would be double.  And
BDR> x <- 0xce80 would be valid.

Hmm, I'm not sure if this (mainly a parser change) is worth the
potential problems.  Of course you (Brian) know better than
anyone here: when that change was implemented for S-plus, I think
MathSoft (the predecessor of 'Insightful') also changed all
their legacy S code and translated all '' to '.', just in
order to make sure that things stayed backward compatible.
And, IIRC, they recommended that users do the same with their
own S source files.  I found this extremely ugly at the time,
but it was mandated by the fact that they didn't want to break
existing code which in some places assumed that e.g. '0' was
a double but became an integer in the new version of S-plus
{and e.g., as.double(.) became absolutely mandatory before passing
 things to C  --- of course, using as.double(.) ``everywhere''
 before passing to C has been recommended for a long time, which
 didn't prevent people from relying on the current behavior (in R)
 that almost all numbers are double}.

We (or rather the less sophisticated members of the R community)
may get into similar problems when, e.g.,
matrix(0, 3,4)  suddenly produces an integer matrix instead of a
double precision one.
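For reference, the current behavior under discussion: numeric literals always parse as double, and integerness arises only through coercion:

```r
typeof(100)  # "double" : the parser has no integer constants
typeof(as.integer(100))  # "integer": only coercion produces them
typeof(matrix(0, 3, 4))  # "double" : the case that might silently change
```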


BDR> 3) We do allow setting LC_NUMERIC, but it partially breaks R if the
BDR> decimal point is not ".".  (I know of no locale in which it is not
BDR> "." or ",", and we cannot allow "," as part of numeric constants
BDR> when parsing.)  E.g.:

>> Sys.setlocale("LC_NUMERIC", "fr_FR")
BDR> [1] "fr_FR"
BDR> Warning message:
BDR> setting 'LC_NUMERIC' may cause R to function strangely in: 
BDR> setlocale(category, locale)
>> x <- 3.12
>> x
BDR> [1] 3
>> as.numeric("3,12")
BDR> [1] 3,12
>> as.numeric("3.12")
BDR> [1] NA
BDR> Warning message:
BDR> NAs introduced by coercion

BDR> We could do better by insisting that "." was the decimal point in
BDR> all internal conversions _to_ numeric.  Then the effect of setting
BDR> LC_NUMERIC would primarily be on conversions _from_ numeric,
BDR> especially printing and graphical output.  (One issue would be what
BDR> to do with scan(), which has a `dec' argument but is implemented
BDR> assuming LC_NUMERIC=C.  I would hope to continue to have `dec' but
BDR> perhaps with a locale-dependent default.)  The resulting asymmetry
BDR> (R would not be able to parse its own output) would be unhappy, but
BDR> seems inevitable.  (This could be implemented easily by having a
BDR> `dec' arg to EncodeReal and EncodeComplex, and using LC_NUMERIC to
BDR> control that rather than actually setting the locale category.  For
BDR> example, deparsing needs to be done in LC_NUMERIC=C.)

Yes, I like this quite a bit:

 -  Only allow "." as the decimal point in conversions to numeric.

 -  Allowing "," (or other locale settings, if there are any) for
    conversions _from_ numeric will be very attractive to some
    (not to me) and will make the use of R's ``reporting
    facility'' much more natural to them.

  The asymmetry is a bit unhappy -- but that will be a good reason
  to advocate (to the user community) that using "," for the decimal
  point may be a bad idea in general.
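
The proposed asymmetry can be illustrated outside R; here is a minimal
Python sketch (the function names are invented for illustration) of a
strict-"." input side paired with a locale-aware output side:

```python
def format_numeric(x, dec=","):
    # output side: substitute the locale's decimal mark, in the way a
    # 'dec' argument to EncodeReal would
    return repr(x).replace(".", dec)

def parse_numeric(s):
    # input side: insist on "." regardless of locale; float() never
    # accepts "," as a decimal point
    return float(s)

print(format_numeric(3.12))   # 3,12
print(parse_numeric("3.12"))  # 3.12
try:
    # the asymmetry: the formatted output cannot be parsed back
    parse_numeric(format_numeric(3.12))
except ValueError:
    print("cannot parse own output")
```

The round trip failing in the last few lines is precisely the "R would
not be able to parse its own output" problem discussed above.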

Martin Maechler
ETH Zurich

BDR> All of these could be implemented by customized versions of 
BDR> strtod/strtol.

BDR> -- 
BDR> Brian D. Ripley,  [EMAIL PROTECTED]
BDR> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
BDR> University of Oxford, Tel:  +44 1865 272861 (self)
BDR> 1 South Parks Road, +44 1865 272866 (PA)
BDR> Oxford OX1 3TG, UK  Fax:  +44 1865 272595

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RFC: hexadecimal constants and decimal points

2005-04-18 Thread Martin Maechler
> "Duncan" == Duncan Murdoch <[EMAIL PROTECTED]>
> on Mon, 18 Apr 2005 03:33:42 -0400 (EDT) writes:

>> On Sun, 17 Apr 2005, Jan T. Kim wrote:
>> 
>>> On Sun, Apr 17, 2005 at 12:38:10PM +0100, Prof Brian Ripley wrote:
 These are some points stimulated by reading about C history (and
 related in their implementation).
 
 
 1) On some platforms
 
> as.integer("0xA")
 [1] 10
 
 but not all (not on Solaris nor Windows).  We do not define what is
 allowed, and rely on the OS's implementation of strtod (yes, not
 strtol).
 It seems that glibc does allow hex: C99 mandates it but C89 seems
 not to allow it.
 
 I think that was a mistake, and strtol should have been used.  Then C89
 does mandate the handling of hex constants and also octal ones.  So
 changing to strtol would change the meaning of as.integer("011").
>>> 
>>> I think interpretation of a leading "0" as a prefix indicating an octal
>>> representation should indeed be avoided. People not familiar with C will
>>> have a hard time understanding and getting used to this concept, and
>>> in addition, it happens way too often that numeric data are provided
>>> left-padded with zeros.

Duncan> I agree with this:  011 should be 11, it should not be 9.

I agree (with Duncan and Jan).

I'm sure the current (decimal) behavior is implicitly relied upon in
many places in people's code that reads text files and
manipulates them.
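
As a side note (not R code): Python 3's int() faced the same octal
ambiguity with its base-0, infer-from-the-prefix mode, and resolved it
by rejecting a bare leading zero outright rather than returning the
octal value:

```python
# base 0 infers the base from the prefix, like C's strtol(s, end, 0)
print(int("0xA", 0))   # 10: the hex prefix is honoured
print(int("11", 0))    # 11: plain decimal
print(int("011", 10))  # 11: explicit base 10, the reading Jan argues for
try:
    int("011", 0)      # Python 3 refuses C-style octal with base 0 ...
except ValueError:
    print("leading zero rejected")  # ... rather than silently returning 9
```

So "011" is never silently 9: with an explicit base it is 11, and with
prefix inference it is an error.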

Martin

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] ESS 5.2.7 released

2005-04-18 Thread Martin Maechler
Dear ESS users, {BCC'ed to RPM and Debian maintainers of ESS}

We have now released ESS 5.2.7.  This is a bug-fix release against 5.2.6,
where - the new UTF-8 "support" gave problems for XEmacs, and
  - 'auto-fill-mode' was accidentally activated for *.R buffers,
with a few new features, see "New Features" below, notably some
extended Sweave support, originally contributed by David Whiting.

I'm crossposting to R-devel just to make you aware that R 2.1.0,
bound to be released today, comes with UTF-8 (unicode) support,
which doesn't work correctly in ESS versions prior to 5.2.6.

Downloads from the ESS site http://ESS.R-project.org/ or
directly http://ess.r-project.org/downloads/ess/ as *.zip and
*.tar.gz files. Hopefully, *.deb and *.rpm will also be made
available in due time.

For the ESS core team,
Martin Maechler, ETH Zurich.

--- ANNOUNCE --


ANNOUNCING ESS
**************

   The ESS Developers proudly announce the release of ESS

   5.2.7

   Emacs Speaks Statistics (ESS) provides an intelligent, consistent
interface between the user and the software.  ESS interfaces with
S-PLUS, R, SAS, BUGS and other statistical analysis packages under the
Unix, Microsoft Windows, and Apple Mac OS operating systems.  ESS is a
package for the GNU Emacs and XEmacs text editors whose features ESS
uses to streamline the creation and use of statistical software.  ESS
knows the syntax and grammar of statistical analysis packages and
provides consistent display and editing features based on that
knowledge.  ESS assists in interactive and batch execution of
statements written in these statistical analysis languages.

   ESS is freely available under the GNU General Public License (GPL).
Please read the file COPYING which comes with the distribution, for
more information about the license. For more detailed information,
please read the README files that come with ESS.

Getting the Latest Version
==========================

   The latest released version of ESS is always available on the web at:
ESS web page (http://ess.r-project.org) or StatLib
(http://lib.stat.cmu.edu/general/ESS/)

   The latest development version of ESS is available via
`https://svn.R-project.org/ESS/', the ESS Subversion repository.  If
you have a Subversion client (see `http://subversion.tigris.org/'), you
can download the sources using:
 % svn checkout https://svn.r-project.org/ESS/trunk PATH

which will put the ESS files into directory PATH.  Later, within that
directory, `svn update' will bring that directory up to date.
Windows-based tools such as TortoiseSVN are also available for
downloading the files.  Alternatively, you can browse the sources with a
web browser at: ESS SVN site (https://svn.r-project.org/ESS/trunk).
However, please use a subversion client instead to minimize the load
when retrieving.

   If you remove other versions of ESS from your emacs load-path, you
can then use the development version by adding the following to .emacs:

 (load "/path/to/ess-svn/lisp/ess-site.el")

   Note that https is required, and that the SSL certificate for the
Subversion server of the R project is

 Certificate information:
  - Hostname: svn.r-project.org
  - Valid: from Jul 16 08:10:01 2004 GMT until Jul 14 08:10:01 2014 GMT
  - Issuer: Department of Mathematics, ETH Zurich, Zurich, Switzerland, CH
  - Fingerprint: c9:5d:eb:f9:f2:56:d1:04:ba:44:61:f8:64:6b:d9:33:3f:93:6e:ad

(currently, there is no "trusted certificate").  You can accept this
certificate permanently and will not be asked about it anymore.

Current Features


   * Languages Supported:
* S family (S 3/4, S-PLUS 3.x/4.x/5.x/6.x/7.x, and R)

* SAS

* BUGS

* Stata

* XLispStat including Arc and ViSta

   * Editing source code (S family, SAS, BUGS, XLispStat)
* Syntactic indentation and highlighting of source code

* Partial evaluation of code

* Loading and error-checking of code

* Source code revision maintenance

* Batch execution (SAS, BUGS)

* Use of imenu to provide links to appropriate functions

   * Interacting with the process (S family, SAS, XLispStat)
* Command-line editing

* Searchable Command history

* Command-line completion of S family object names and file
  names

* Quick access to object lists and search lists

* Transcript recording

* Interface to the help system

   * Transcript manipulation (S family, XLispStat)
* Recording and saving transcript files

* Manipulating and editing saved transcripts

* Re-evaluating commands from transcript files

   * Help File Editing (R)
* Syntactic indentation and highlighting of source code.

* Sending Examples to running ESS process.

* Previewing

Requirement

[Rd] recent spam on mailing lists

2005-04-17 Thread Martin Maechler
Probably not many of you have noticed, since I assume you
have your own active spam filters:
But we have recently (first time on Friday April 15) had problems
with spam filtering on our mail server.  The spamassassin daemon
(spamd) has "died" for no apparent reason, and hence the mail
has been passed through without spam filtering.

We have had a look at the log files and haven't got a real clue
about the reason.  As a stop-gap measure we now have a "nanny
script" that checks whether 'spamd' is alive, and restarts it in
case it isn't there anymore.
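
A minimal sketch of the nanny idea (illustrative Python, not the actual
script; the real one presumably shells out to check for the spamd
process and restart the daemon):

```python
def nanny_pass(is_alive, restart):
    # one pass of the nanny loop: if the daemon has died, bring it back
    if not is_alive():
        restart()
        return True      # a restart was performed
    return False         # nothing to do

# stand-in callables; a real script would probe and (re)start spamd
log = []
nanny_pass(lambda: False, lambda: log.append("restarted spamd"))
print(log)  # ['restarted spamd']
```

Run periodically (e.g. from cron), one such pass is enough to keep a
flaky daemon up.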

Martin Maechler
ETH Zurich

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] exists("loadings.default") ...

2005-04-11 Thread Martin Maechler
Paul Gilbert asked me the following, about a topic that was
dealt with here (on R-devel) a few weeks ago (~ March 21):

> "PaulG" == Paul Gilbert <[EMAIL PROTECTED]>
> on Mon, 11 Apr 2005 10:35:03 -0400 writes:

PaulG> Martin, a while ago you suggested:

>> For S3, it's a bit uglier, but I think you could still do -- in your
>> package --

>> if(!exists("loadings.default", mode="function")) {
>>   loadings.default <- loadings
>>   loadings <- function(x, ...) UseMethod("loadings")
>> }

PaulG> I don't think exists works properly here if namespaces are used and
PaulG> loadings.default is not exported. (i.e. it always gives false.) I can
PaulG> redefine loadings and loadings.default, but I can't guard against the
PaulG> possibility that those might actually be defined in stats someday.

Yes, you are correct, one cannot easily use exists() for this
when namespaces are involved.

For S3 methods, instead of exists(), I think one should use
something like

  > !is.null(getS3method("loadings", "default", optional = TRUE))
  [1] FALSE
  > !is.null(getS3method("predict", "ppr", optional = TRUE))
  [1] TRUE

Apart from the need to mention something along this line on the
'exists' help page, I wonder if we shouldn't even consider providing
an existsS3method() wrapper, or alternatively, analogously to
getAnywhere(), an existsAnywhere() function.
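
For what it's worth, the same look-up-without-failing pattern exists in
other languages; a Python sketch using getattr with a default (purely
illustrative -- it does not model R's namespace semantics, and the class
name is invented):

```python
class WithDefault:
    def loadings(self):
        return "default loadings"

def get_method(obj, name):
    # return the method if it exists, None otherwise -- no exception,
    # the moral equivalent of getS3method(..., optional = TRUE)
    return getattr(obj, name, None)

print(get_method(WithDefault(), "loadings") is not None)  # True
print(get_method(WithDefault(), "predict") is not None)   # False
```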

Martin

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] strange error with rw2010dev

2005-04-11 Thread Martin Maechler
> "PD" == Peter Dalgaard <[EMAIL PROTECTED]>
> on 11 Apr 2005 09:46:11 +0200 writes:

 .

 MM> Thanks again for the report; this should be fixable
 MM> before release.

PD> Preferably before code freeze! (today)

PD> I think we (Thomas L.?) got it analysed once before: The
PD> issue is that summary.matrix is passing
PD> data.frame(object) back to summary.data.frame without
PD> removing the AsIs class.

PD> I don't think a simple unclass() will do here.
   
or, a bit more cautiously,

summary.matrix <- function(object, ...)
    summary.data.frame(data.frame(if(inherits(object, "AsIs")) unclass(object)
                                  else object), ...)

That does cure the problem in Kjetil's example and the equivalent

 ## short 1-liner:
 summary(df <- data.frame(mat = I(matrix(1:8, 2))))


I'm currently make-checking the above.
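
The shape of the fix -- strip the wrapper class before re-dispatching,
so the same method is not entered again -- can be sketched abstractly
(illustrative Python; the class and function names are invented, and
R's AsIs mechanics differ):

```python
class AsIs:
    # a thin wrapper; if dispatch does not strip it, summarize()
    # would re-enter itself with the same wrapped object forever
    def __init__(self, value):
        self.value = value

def summarize(obj):
    if isinstance(obj, AsIs):
        obj = obj.value   # the analogue of unclass(object) in the fix
    return "summary of %r" % (obj,)

print(summarize(AsIs([1, 2, 3])))  # summary of [1, 2, 3]
```

One unwrap breaks the cycle; without it the recursion above is exactly
the repeating summary.data.frame/summary.matrix pattern in the traceback.
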
Martin

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] strange error with rw2010dev

2005-04-10 Thread Martin Maechler
> "Kjetil" == Kjetil Brinchmann Halvorsen <[EMAIL PROTECTED]>
> on Sun, 10 Apr 2005 14:00:52 -0400 writes:

Kjetil> The error reported below still occurs in todays
Kjetil> (2005-04-08) rw2010beta, should I file a formal bug
Kjetil> report?

Thank you, Kjetil.

It seems nobody has found time to look at this in the meantime.
However, I can confirm the bug on quite a different platform
(Linux Red Hat, 64-bit on AMD64).
The problem is infinite recursion, which you see more easily
when you set something like options(expressions=500).

Further note that the bug is not new; it also happens in
previous versions of R (i.e. no reason to stop using "R 2.1.0 beta"!).

Here's a ``pure script''

testmat <- matrix(1:80, 20,4)
dim(testmat)
#
testframe <- data.frame(testmat=I(testmat),
x=rnorm(20), y=rnorm(20), z=sample(1:20))
str(testframe)

options(expressions=100)
summary(testframe)
##> Error: evaluation nested too deeply: infinite recursion / options(expressions=)?
## -- or --
##> Error: protect(): protection stack overflow


### In the second case, I at least get a useful trace back:

traceback() ## longish output, shows the infinite recursion:

..
...

17: summary.data.frame(data.frame(object), ...)
16: summary.matrix(object, digits = digits, ...)
15: summary.default(X[[1]], ...)
14: FUN(X[[1]], ...)
13: lapply(as.list(object), summary, maxsum = maxsum, digits = 12, 
...)
12: summary.data.frame(data.frame(object), ...)
11: summary.matrix(object, digits = digits, ...)
10: summary.default(X[[1]], ...)
9: FUN(X[[1]], ...)
8: lapply(as.list(object), summary, maxsum = maxsum, digits = 12, 
   ...)
7: summary.data.frame(data.frame(object), ...)
6: summary.matrix(object, digits = digits, ...)
5: summary.default(X[[1]], ...)
4: FUN(X[[1]], ...)
3: lapply(as.list(object), summary, maxsum = maxsum, digits = 12, 
   ...)
2: summary.data.frame(testframe)
1: summary(testframe)



Thanks again for the report;
this should be fixable before release.

Martin

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] orphaning CRAN packages

2005-04-09 Thread Martin Maechler
> "Ted" == Ted Harding <[EMAIL PROTECTED]>
> on Sat, 09 Apr 2005 13:02:22 +0100 (BST) writes:

Ted> On 09-Apr-05 Uwe Ligges wrote:
>> [EMAIL PROTECTED] wrote:
>> 
>>> Dear R Developers,
>>> 
>>> the following CRAN packages do not cleanly pass R CMD
>>> check for quite some time now and did not have any
>>> updates since the time given. Several attempts by the
>>> CRAN admins to contact the package maintainers had no
>>> success.
>>> 
>>> norm, 1.0-9, 2002-05-07, WARN

Ted> It would be serious if 'norm' were to lapse, since it
Ted> is part of the 'norm+cat+mix+pan' family, and people
Ted> using any of these are likely to have occasion to use
Ted> the others.

Indeed!  I had a very similar thought but could not have afforded to
make your offer (below) myself, so thanks a lot!

Ted> I'd offer to try to clean up 'norm' myself if only I
Ted> were up-to-date on R itself (I'm waiting for 2.1.0 to
Ted> come out, which I understand is scheduled to happen
Ted> soon, yes?).

yes, as Uwe has already confirmed.

Since R 2.1.0 is now in beta testing, we consider it very
stable, with fewer bugs than any other version of R, so
please ("everyone!") follow Uwe's advice and install R 2.1.0 "beta".

Martin

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] NaN and linear algebra

2005-03-23 Thread Martin Maechler
>>>>> "Bill" == Bill Northcott <[EMAIL PROTECTED]>
>>>>> on Wed, 23 Mar 2005 10:19:22 +1100 writes:

Bill> On 23/03/2005, at 12:55 AM, Simon Urbanek wrote:
>>> As I see it, the MacOS X behaviour is not IEEE-754 compliant.
>>> 
>>> I had a quick look at the IEEE web site and it seems quite clear that 
>>> NaNs should not cause errors, but propagate through calculations to 
>>> be tested at some appropriate (not too frequent) point.
>> 
>> This is not quite correct and in fact irrelevant to the problem you 
>> describe. NaNs may or may not signal, depending on how they are used. 
>> Certain operations on NaN must signal by the IEEE-754 standard. The 
>> error you get is not a trap, and it's not a result of a signal check, 
>> either. The whole problem is that depending on which algorithm is 
>> used, the NaNs will be used different ways and thus may or may not use 
>> signaling operations.

Bill> It may not violate the letter of IEEE-754 because matrix calculations
Bill> are not covered, but it certainly violates the spirit that arithmetic
Bill> should be robust and programs should not halt on these sorts of
Bill> errors.
>> 
>> I don't consider the `solve' error a bug - in fact I would rather get 
>> an error telling me that something is wrong (although I agree that the 
>> error is misleading - the error given in Linux is a bit more helpful) 
>> than getting a wrong result.

Bill> You may prefer the error, but it is not in the spirit of robust
Bill> arithmetic, i.e.
>> d<-matrix(NaN,3,3)
>> f<-solve(d)
Bill> Error in solve.default(d) : Lapack routine dgesv: system is exactly 
Bill> singular
>> f
Bill> Error: Object "f" not found

>> If I would mark something in your example as a bug that would be 
>> det(m)=0, because it should return NaN (remember, NaN==NaN is FALSE; 
>> furthermore if det was calculated inefficiently using Laplace 
>> expansion, the result would be NaN according to IEEE rules). det=0 is 
>> consistent with the error given, though. Should we check this in R 
>> before calling Lapack - if the vector contains NaNs, det/determinant 
>> should return NaN right away?

Bill> Clearly det(d) returning 0 is wrong.  As a result based on a
Bill> computation including a NaN, it should return NaN.  The spirit of
Bill> IEEE-754 is that the programmer should choose the appropriate point
Bill> at which to check for NaNs.  I would interpret this to mean the R
Bill> programmer, not the R library developer.  Surely that is why R
Bill> provides the is.nan function.

>> d
Bill> [,1] [,2] [,3]
Bill> [1,]  NaN  NaN  NaN
Bill> [2,]  NaN  NaN  NaN
Bill> [3,]  NaN  NaN  NaN
>> is.nan(solve(d))
Bill> Error in solve.default(d) : Lapack routine dgesv: system is exactly 
Bill> singular

Bill> This is against the spirit of IEEE-754 because it halts the program.

>> is.nan(det(d))
Bill> [1] FALSE

Bill> That is plain wrong.

>> 
>> Many functions in R will actually bark at NaN inputs (e.g. qr, eigen, 
>> ...) - maybe you're saying that we should check for NaNs in solve 
>> before proceeding and raising an error?

Bill> However, this problem is in the Apple library not R.

Bill> Bill Northcott

Indeed!

I pretty much entirely agree with your points, Bill, and would
tend to declare that this Apple library is ``broken''
for building a correctly running R.
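
For reference, the propagate-then-check discipline Bill describes is
exactly how plain IEEE-754 scalar arithmetic behaves, e.g. in Python:

```python
import math

nan = float("nan")
x = (nan * 2.0 + 1.0) / 3.0    # NaN propagates silently through arithmetic
print(x != x)                  # True: a NaN never compares equal to itself
print(math.isnan(x))           # True: the explicit check, like is.nan() in R
# no operation above raised an error -- the check point is the caller's choice
```

Nothing halts; the programmer decides where to test, which is the
behaviour the Lapack call in solve() fails to preserve here.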

Let me ask one question I've been wondering about now for a
while:

  Did you run "make check" after building R,
  and did "make check" run to completion without an error?

If yes (which I doubt quite a bit), there *is* a bug in R's
quality control / quality assurance tools -- and I would want to
add a check for the misbehavior you've mentioned.

Martin Maechler, ETH Zurich

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] loadings generic?

2005-03-21 Thread Martin Maechler
> "PaulG" == Paul Gilbert <[EMAIL PROTECTED]>
> on Sun, 20 Mar 2005 10:37:29 -0500 writes:

PaulG> Can loadings in stats be made generic?

It becomes an (S4) generic automagically when you define an S4 method
for it.  ;-)

{yes, I know this isn't the answer you wanted to hear;
 but really, maybe it's worth considering the use of S4 classes and
 methods?}

For S3, it's a bit uglier, but I think you could still do -- in your
package --

if(!exists("loadings.default", mode="function")) {
  loadings.default <- loadings
  loadings <- function(x, ...) UseMethod("loadings")
}

loadings. <- function(x, ...) {
   . 
}

and S3-export these.
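
As an aside, the default-plus-registered-methods structure that
UseMethod() gives you has a stdlib counterpart in Python's
functools.singledispatch (shown only as an analogy to the S3 pattern
above, with invented return values):

```python
from functools import singledispatch

@singledispatch
def loadings(x):
    return "default loadings"           # plays the role of loadings.default

@loadings.register(list)
def _(x):
    # a class-specific method, analogous to loadings.<yourclass>
    return "loadings for a list of length %d" % len(x)

print(loadings(3.14))     # default loadings
print(loadings([1, 2]))   # loadings for a list of length 2
```

Dispatch is on the class of the first argument, just as with S3.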

Martin

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Buglet in install.packages warning message

2005-03-21 Thread Martin Maechler
> "Seth" == Seth Falcon <[EMAIL PROTECTED]>
> on Sun, 20 Mar 2005 18:34:13 -0800 writes:

Seth> I've been experimenting with install.packages and its
Seth> new ability to track down dependencies from a list of
Seth> repositories and encountered this:

Seth> install.packages(c("foo", "bar"),
Seth> repos="http://cran.r-project.org",
Seth> dependencies=c("Depends", "Suggests"))

Seth> dependencies 'foo' are not availabledependencies 'bar'
Seth> are not available 

Seth> With the following change (see below) I get what I
Seth> suspect is the intended warning message:

Seth>dependencies 'foo', 'bar' are not available

Indeed.
Thank you, Seth! - I've committed your change
to be in the ``R-alpha of 2005-03-22''.

Apropos: Please, all users of R-2.1.0 (alpha)  {aka "R-devel"}:
  ``keep your eyes open'' for not quite correctly
  formatted error messages, or even other problems in
  error and warning messages.

The large amount of work that was put in (mostly by Prof Brian
Ripley) rationalizing these messages in order to make them
more consistent (for translation, e.g.!) may have led to a few
typos that are unavoidable when changing those thousands of
lines of code efficiently.

Martin

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] package.skeleton

2005-03-19 Thread Martin Maechler
Thanks a lot, Jim,

yes, I can confirm the behavior;
clearly a bug in R-devel (only!)

Martin Maechler

>>>>> "JimMcD" == James MacDonald <[EMAIL PROTECTED]>
>>>>> on Fri, 18 Mar 2005 13:28:18 -0500 writes:

>> R.version.string
JimMcD> [1] "R version 2.1.0, 2005-03-17"

JimMcD> I don't see anything in either
JimMcD> https://svn.r-project.org/R/trunk/NEWS or in the
JimMcD> Changes file for R-2.1.0 about changes in
JimMcD> package.skeleton() (nor in the help page), but when
JimMcD> I run this function, all the .Rd files produced are
JimMcD> of the data format even if all I have in my
JimMcD> .GlobalEnv are functions.

JimMcD> A trivial example is to run the examples from the
JimMcD> package.skeleton() help page. I believe there should
JimMcD> be two data type and two function type .Rd files,
JimMcD> but instead they are all of the data type.

JimMcD> Best,

JimMcD> Jim

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Small suggestion for stripchart help

2005-03-15 Thread Martin Maechler
Thank you, Kevin.

I've just committed your improvement (to
 /src/library/graphics/man/stripchart.Rd).

Martin

> "KevinW" == Kevin Wright <[EMAIL PROTECTED]>
> on Mon, 14 Mar 2005 12:39:50 -0800 (PST) writes:

KevinW> I needed to look up information about the 'jitter'
KevinW> argument of stripchart.  When I opened the help page
KevinW> for jitter I found:

KevinW> jitter  when jittering is used, jitter gives the amount of
KevinW> jittering applied.

KevinW> which is slightly confusing/self-referential if you
KevinW> are lazy and don't read the entire page.

KevinW> It might be clearer to say

KevinW> jitter when \code{method="jitter"} is used, jitter
KevinW> gives the amount of jittering applied.

KevinW> Just my opinion.  Thanks for listening.

KevinW> Kevin Wright

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Use of htest class for different tests

2005-03-14 Thread Martin Maechler
>>>>> "Torsten" == Torsten Hothorn <[EMAIL PROTECTED]>
>>>>> on Mon, 14 Mar 2005 13:43:32 +0100 (CET) writes:

Torsten> On Sun, 13 Mar 2005, Gorjanc Gregor wrote:
>> Hello!
>> 
>> First of all I must appologize if this has been raised
>> previously, but search provided by Robert King at the
>> University of Newcastle seems to be down these
>> days. Additionally let me know if such a question should
>> be sent to R-help.
>> 
>> I did a contribution to function hwe.hardy in package
>> 'gap' during the weekend. That function performs the
>> Hardy-Weinberg equilibrium test using MCMC. The return value of
>> the function does not have the classical components of the htest
>> class, so I was of course not successful in using
>> it. However, I managed to copy and modify some parts of
>> print.htest to accomplish the same task.
>> 
>> Now my question is what to do in such cases? Just copy
>> parts of print.htest and modify them for each test, or anything
>> else? Are such cases rare? If yes, then the mentioned
>> approach is probably the easiest.
>> 

Torsten> you can use print.htest directly for the components
Torsten> which _are_ elements of objects of class `htest'
Torsten> and provide your own print method for all
Torsten> others. If your class `foo' (essentially) extends
Torsten> `htest', a simple version of `print.foo' could be
   Torsten>  print.foo <- function(x, ...) {
   Torsten>  
   Torsten> # generate an object of class `htest'
   Torsten> y <- x
   Torsten> class(y) <- "htest"
   Torsten> # maybe modify some things like y$method
   Torsten> ...
   Torsten> # print y using `print.htest' without copying code
   Torsten> print(y)
   Torsten>  
   Torsten> # and now print additional information
   Torsten> cat(x$whatsoever)
   Torsten>  
   Torsten>  }

and if you want to really `comply to standards'
you should end your print method with

invisible(x)
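
Torsten's delegate-then-append pattern is the classic "call the parent's
formatter, then add to it" move; a Python analogue for comparison (class
names invented):

```python
class HTest:
    def __init__(self, method):
        self.method = method
    def __str__(self):
        return "method: " + self.method

class FooTest(HTest):
    def __init__(self, method, extra):
        super().__init__(method)
        self.extra = extra
    def __str__(self):
        # reuse the parent's formatting, then append -- the analogue of
        # printing via print.htest and then cat()ing x$whatsoever
        return super().__str__() + "\nextra: " + self.extra

print(FooTest("HWE test (MCMC)", "chain diagnostics"))
```

No formatting code is copied; the subclass only adds what is new.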

Martin Maechler, ETH Zurich

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Re: Packages and Libraries (was: Re: lme4 "package" etc ..)

2005-02-08 Thread Martin Maechler
>>>>> "tony" == A J Rossini <[EMAIL PROTECTED]>
>>>>> on Tue, 8 Feb 2005 13:33:23 +0100 writes:

tony> For OBVIOUS reasons, is there any chance that we could introduce
tony> "package()" and deprecate "library()"?

This idea is not new {as you must surely have guessed}. In fact,
there's a much longer-standing proposition of  "usePackage()"
(IIRC, or "use.package()" ?).  However, we (R-core) had always
wanted to also provide a ``proper'' class named "package"
along with this, but for several reasons didn't get around to it ... yet.

-- I've diverted to R-devel now that we are really talking about
   desired future behavior of R

tony> (well, I'll also ask if we could deprecate "=" for assignment, but
tony> that's hopeless).
:-)


tony> On Tue, 8 Feb 2005 11:49:39 +0100, Martin Maechler
tony> <[EMAIL PROTECTED]> wrote:
>> >>>>> "Pavel" == Pavel Khomski <[EMAIL PROTECTED]>
>> >>>>> on Tue, 08 Feb 2005 10:20:03 +0100 writes:
>> 
Pavel> this is a question, how can i specify the random part
Pavel> in the GLMM-call (of the lme4 library) for compound
Pavel> matrices just in the the same way as they defined in
Pavel> the lme-Call (of the nlme library).
>> 
>> ``twice in such a short paragraph -- yikes !!'' ... I'm getting
>> convulsive...
>> 
>> There is NO lme4 library nor an nlme one !
>> There's the lme4 *PACKAGE* and the nlme *PACKAGE* -- please --
>> 
>> 

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel daily snapshots

2005-01-26 Thread Martin Maechler
> "Kurt" == Kurt Hornik <[EMAIL PROTECTED]>
> on Tue, 25 Jan 2005 21:57:32 +0100 writes:

> apjaworski  writes:
>> I just noticed that as of January 22, the daily snapshots
>> of the R-devel tree (in
>> ftp://ftp.stat.math.ethz.ch/Software/R/) are only about
>> 1Mb (instead of about 10Mb).  When the January 25 file is
>> downloaded and uncompressed, it seems to be missing the
>> src directory.

Kurt> We are working on this.  Building the daily snapshot
Kurt> for R-devel now requires Makeinfo 4.7, and the system
Kurt> creating the tarball currently only has 4.5 installed.

There's now a new one in
  ftp://ftp.stat.math.ethz.ch/Software/R/

Martin

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] p.adjust(s), was "Re: [BioC] limma and p-values"

2005-01-18 Thread Martin Maechler
>>>>> "MM" == Martin Maechler <[EMAIL PROTECTED]>
>>>>> on Mon, 17 Jan 2005 22:02:39 +0100 writes:

 >>>>> "GS" == Gordon Smyth <[EMAIL PROTECTED]>
 >>>>> on Sun, 16 Jan 2005 19:55:35 +1100 writes:
 
  <..>

 GS> 7. The 'n' argument is removed. Setting this argument
 GS> for any methods other than "none" or "bonferroni" make
 GS> the p-values indeterminate, and the argument seems to be
 GS> seldom used.
 GS>  (It isn't used in the R default distribution.) 

that's only an indication that it *might* be seldom used...
we really have to *know*, because not allowing it anymore will
break all code calling p.adjust(p, meth, n = *).

 GS> I think trying to combine this argument with NAs would get you
 GS> into lots of hot water. For example, what does
 GS> p.adjust(c(NA,NA,0.05),n=2) mean?  Which 2 values
 GS> should be adjusted?

The case where n < length(p) should simply give an error which
should bring you into cool water...


MM> I agree that I don't see a good reason to allow specifying 'n'
MM> as argument unless e.g. for "bonferroni".
MM> What do other think ?

no reaction yet.

I've thought a bit more in the mean time:
Assume someone has 10 P-values and knows that they
only want to adjust the smallest ones.
Then, passing only the ones to adjust and setting 'n = 10'
can be useful and will certainly work for "bonferroni", but
I think it can't work in general for any other method.

In sum, I still tend to agree that the argument 'n' should be
dropped -- but maybe with "deprecation" -- i.e. still allow it
for 2.1.x giving a deprecation warning.
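
The reason 'n' is safe for "bonferroni" but not for the step-up or
step-down methods is visible from the arithmetic: Bonferroni adjusts
each p-value independently of the others, so the p-values you didn't
pass never enter the calculation. A sketch (not R's implementation):

```python
def bonferroni(pvals, n=None):
    # each p-value is adjusted on its own, so n may legitimately exceed
    # len(pvals) when only the smallest p-values are passed in
    if n is None:
        n = len(pvals)
    if n < len(pvals):
        # the "cool water" case: n smaller than length(p) is an error
        raise ValueError("n must be at least the number of p-values")
    return [min(1.0, p * n) for p in pvals]

print(bonferroni([0.001, 0.004], n=10))
```

A step-up method like Holm or BH needs the ranks of *all* n p-values,
which is exactly the information that is missing here.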

Martin

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] p.adjust(s), was "Re: [BioC] limma and p-values"

2005-01-17 Thread Martin Maechler
> "GS" == Gordon Smyth <[EMAIL PROTECTED]>
> on Sun, 16 Jan 2005 19:55:35 +1100 writes:

GS> I append below a suggested update for p.adjust().  

thank you.

GS> 1. A new method "yh" for control of FDR is included which is
GS> valid for any dependency structure. Reference is
GS> Benjamini, Y., and Yekutieli, D. (2001).  The control of
GS> the false discovery rate in multiple testing under
GS> dependency. Annals of Statistics 29, 1165-1188.

good, thanks!

GS> 2. I've re-named the "fdr" method to "bh" but kept "fdr"
GS> as a synonym for backward compatibility.
ok

GS> 3. Upper case values for method "BH" or "YH" are also
GS> accepted.

I don't see why we'd want this.  The S language is
case-sensitive and we don't want to lead people to believe
that case wouldn't matter.

GS> 4. p.adjust() now preserves attributes like names for
GS> named vectors (as does cumsum and friends for example).

good point; definitely desirable!!

GS> 5. p.adjust() now works columnwise on numeric
GS> data.frames (as does cumsum and friends).

well, "cumsum and friends" are either generic or group-generic
(for the "Math" group) -- there's a Math.data.frame group
method.
This is quite different for p.adjust, which is not generic, and
I'm not (yet?) convinced it should become so.

People can easily use sapply(d.frame, p.adjust, method) if needed;

In any case it's not in the spirit of R's OO programming to
special-case "data.frame" inside a function such as p.adjust
 
GS> 6. method="hommel" now works correctly even for n=2

ok, thank you (but as said, in R 2.0.1 the behavior was much
more problematic)

GS> 7. The 'n' argument is removed. Setting this argument
GS> for any methods other than "none" or "bonferroni" make
GS> the p-values indeterminate, and the argument seems to be
GS> seldom used. (It isn't used in the R default
GS> distribution.) I think trying to combine this argument
GS> with NAs would get you into lots of hot water. For
GS> example, what does p.adjust(c(NA,NA,0.05),n=2) mean?
GS> Which 2 values should be adjusted?

I agree that I don't see a good reason to allow specifying 'n'
as argument unless e.g. for "bonferroni".
What do other think ?

GS> 8. NAs are treated in na.exclude style. This is the
GS> correct approach for most applications. The only other
GS> consistent thing you could do would be to treat the NAs
GS> as if they all had value=1. But then you would have to
GS> explain clearly that the values being returned are not
GS> actually the correct adjusted p-values, which are
GS> unknown, but are the most conservative possible values
GS> assuming the worst-case for the missing values. This
GS> would become arbitrarily unreasonable as the number of
GS> NAs increases.

I now agree that your proposed default behavior is more sensible
than my proposition.
I'm not sure yet whether it wouldn't be worth allowing for other NA
treatments, like "treat as if 1" {which my code proposition
was basically doing}, or a more sophisticated procedure like
"integrating" over P ~ U[0,1] marginals for each missing
value, approximating the integral possibly by "Monte Carlo" or
even quasi-random numbers.
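
For concreteness, here is a sketch of the Benjamini-Hochberg step-up
adjustment with the na.exclude-style handling described above (None
marks a missing p-value; this follows the published procedure, not R's
code):

```python
def bh_adjust(pvals):
    # exclude the missing p-values, adjust the observed ones as if the
    # NAs weren't there, then restore the NAs in their original positions
    idx = [i for i, p in enumerate(pvals) if p is not None]
    obs = [pvals[i] for i in idx]
    n = len(obs)
    order = sorted(range(n), key=lambda i: obs[i], reverse=True)
    adj = [None] * n
    running_min = 1.0
    for k, i in enumerate(order):
        rank = n - k                              # rank among the observed
        running_min = min(running_min, obs[i] * n / rank)
        adj[i] = running_min                      # enforce monotonicity
    out = list(pvals)
    for j, i in enumerate(idx):
        out[i] = adj[j]
    return out

print(bh_adjust([0.01, None, 0.04, 0.03]))
```

Note that n here is the number of *observed* p-values; an external 'n'
argument would change every rank and so cannot be combined with this
method in any obvious way.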

__
R-devel@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] p.adjust(s), was "Re: [BioC] limma and p-values"

2005-01-17 Thread Martin Maechler
> "GS" == Gordon Smyth <[EMAIL PROTECTED]>
> on Sun, 16 Jan 2005 19:44:26 +1100 writes:

GS> The new committed version of p.adjust() contains some
GS> problems:
>> p.adjust(c(0.05,0.5),method="hommel")
GS> [1] 0.05 0.50

GS> No adjustment!

yes, but that's still better than what the current version of 
R 2.0.1 does, namely to give NA NA + two warnings ..

GS> I can't see how the new treatment of NAs can be
GS> justified. One needs to distinguish between NAs which
GS> represent missing p-values and NAs which represent
GS> unknown p-values. In virtually all applications giving
GS> rise to NAs, the NAs represent missing p-values which
GS> could not be computed because of missing data. In such
GS> cases, the observed p-values should definitely be
GS> adjusted as if the NAs weren't there, because NAs
GS> represent p-values which genuinely don't exist.

hmm, "definitely" being a bit strong.  One could argue that
one should use multiple imputation of the underlying missing
data, or .. other scenarios.

I'll reply to your other, later, more detailed message
separately and take the liberty to drop the other points here...

Martin



Re: [Rd] p.adjust(s), was "Re: [BioC] limma and p-values"

2005-01-08 Thread Martin Maechler
I've thought more and made experiments with R code versions
and just now committed a new version of  p.adjust()  to R-devel
--> https://svn.r-project.org/R/trunk/src/library/stats/R/p.adjust.R
which does sensible NA handling by default and 
*additionally* has an "na.rm" argument (set to FALSE by
default).  The extended 'Examples' section on the help page
https://svn.r-project.org/R/trunk/src/library/stats/man/p.adjust.Rd
shows how the new NA handling is typically much more sensible
than using "na.rm = TRUE".

Martin 


>>>>> "MM" == Martin Maechler <[EMAIL PROTECTED]>
>>>>> on Sat, 8 Jan 2005 17:19:23 +0100 writes:

>>>>> "GS" == Gordon K Smyth <[EMAIL PROTECTED]>
>>>>> on Sat, 8 Jan 2005 01:11:30 +1100 (EST) writes:

MM> <.>

GS> p.adjust() unfortunately gives incorrect results when
GS> 'p' includes NAs.  The results from topTable are
GS> correct.  topTable() takes care to remove NAs before
GS> passing the values to p.adjust().

MM> There's at least one bug in p.adjust(): The "hommel"
MM> method currently does not work at all with NAs (and I
MM> have an uncommitted fix ready for this bug).  OTOH, the
MM> current version of p.adjust() ``works'' with NA's, apart
MM> from Hommel's method, but by using "n = length(p)" in
MM> the correction formulae, i.e. *including* the NAs for
MM> determining sample size `n' {my fix to "hommel" would do
MM> this as well}.

MM> My question is what p.adjust() should do when there are
MM> NA's more generally, or more specifically which `n' to
MM> use in the correction formula. Your proposal amounts to
MM> ``drop NA's and forget about them till the very end''
MM> (where they are wanted in the result), i.e., your sample
MM> size `n' would be sum(!is.na(p)) instead of length(p).

MM> To me it doesn't seem obvious that this setting "n =
MM> #{non-NA observations}" is desirable for all P-value
MM> adjustment methods. One argument for keeping ``n = #{all
MM> observations}'' at least for some correction methods is
MM> the following "continuity" one:

MM> If only a few ``irrelevant'' (let's say > 0.5) P-values
MM> are replaced by NA, the adjusted relevant small P-values
MM> shouldn't change much, ideally not at all.  I'm really
MM> no scholar on this topic, but e.g. for "holm" I think I
MM> would want to keep ``full n'' because of the above
MM> continuity argument.  BTW, for "fdr", I don't see a
MM> straightforward way to achieve the desired continuity.
MM> Of course, p.adjust() could adopt the possibility of
MM> choosing how NA's should be treated e.g. by another
MM> argument ``use.na = TRUE/FALSE'' and hence allow both
MM> versions.

MM> Feedback very welcome, particularly from ``P-value
MM> experts'' ;-)

MM> Martin Maechler, ETH Zurich



[Rd] p.adjust(s), was "Re: [BioC] limma and p-values"

2005-01-08 Thread Martin Maechler
>>>>> "GS" == Gordon K Smyth <[EMAIL PROTECTED]>
>>>>> on Sat, 8 Jan 2005 01:11:30 +1100 (EST) writes:

<.>

GS> p.adjust() unfortunately gives incorrect results when
GS> 'p' includes NAs.  The results from topTable are
GS> correct.  topTable() takes care to remove NAs before
GS> passing the values to p.adjust().

There's at least one bug in p.adjust():  The "hommel" method
currently does not work at all with NAs (and I have an
uncommitted fix ready for this bug).
OTOH, the current version of p.adjust() ``works'' with NA's,
apart from Hommel's method, but by using "n = length(p)" in the
correction formulae, i.e. *including* the NAs for determining
sample size `n'  {my fix to "hommel" would do this as well}.

My question is what p.adjust() should do when there are NA's
more generally, or more specifically which `n' to use in the
correction formula. Your proposal amounts to
  ``drop NA's and forget about them till the very end''
  (where they are wanted in the result),
i.e., your sample size `n' would be sum(!is.na(p)) instead of
length(p).

To me it doesn't seem obvious that this setting 
"n = #{non-NA observations}" is desirable for all 
P-value adjustment methods. One argument for keeping
``n = #{all observations}'' at least for some correction
methods is the following  "continuity" one:

If only a few ``irrelevant'' (let's say > 0.5) P-values are
replaced by NA, the adjusted relevant small P-values shouldn't
change much, ideally not at all.  I'm really no scholar on this
topic, but e.g. for "holm" I think I would want to keep ``full
n'' because of the above continuity argument.
BTW, for "fdr", I don't see a straightforward way to achieve the
desired continuity.
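The effect of the two choices of 'n' can be seen directly for "holm" via the 'n' argument of today's p.adjust() (here 3 observed p-values out of a nominal 5, i.e. as if 2 NAs were counted):

```r
p_obs <- c(0.01, 0.02, 0.8)
p.adjust(p_obs, method = "holm")         # n = 3: 0.03 0.04 0.80
p.adjust(p_obs, method = "holm", n = 5)  # n = 5: 0.05 0.08 1.00
```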
Of course, p.adjust() could adopt the possibility of choosing how
NA's should be treated e.g. by another argument ``use.na = TRUE/FALSE''
and hence allow both versions.

Feedback very welcome, particularly from ``P-value experts'' ;-)

Martin Maechler, ETH Zurich



[Rd] sorry to have broken R-devel snapshot

2004-12-28 Thread Martin Maechler
I'm sorry to report that I had accidentally broken last night's
R-devel snapshot "R-devel_2004-12-28...". 
If for some reason, you are interested in fixing that manually,
add one "\" at the end of line 649 in file src/main/array.c.

It may have bad consequences for automatic daily builds (with
R-devel only), possibly including the CRAN and Bioconductor
package check results.

Martin Maechler, ETH Zurich



Re: [Rd] R's IO speed

2004-12-26 Thread Martin Maechler
> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]>
> on Sun, 26 Dec 2004 10:03:30 + (GMT) writes:

BDR> R-devel now has some improved versions of read.table
BDR> and write.table.  For a million-row data frame
BDR> containing one number, one factor with few levels and
BDR> one logical column, a 56Mb object.

BDR> generating it takes 4.5 secs.

BDR> calling summary() on it takes 2.2 secs.

BDR> writing it takes 8 secs and an additional 10Mb.

BDR> saving it in .rda format takes 4 secs.

BDR> reading it naively takes 28 secs and an additional
BDR> 240Mb

BDR> reading it carefully (using nrows, colClasses and
BDR> comment.char) takes 16 secs and an additional 150Mb
BDR> (56Mb of which is for the object read in).  (The
BDR> overhead of read.table over scan was about 2 secs,
BDR> mainly in the conversion back to a factor.)

BDR> loading from .rda format takes 3.4 secs.

BDR> [R 2.0.1 read in 23 secs using an additional 210Mb, and
BDR> wrote in 50 secs using an additional 450Mb.]
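The "careful" call referred to above looks roughly like this; a self-contained sketch using a small generated file (the column types mirror the number / factor / logical layout of the example):

```r
tf <- tempfile(fileext = ".txt")
df0 <- data.frame(x = rnorm(1000),
                  f = factor(sample(letters[1:3], 1000, replace = TRUE)),
                  b = sample(c(TRUE, FALSE), 1000, replace = TRUE))
write.table(df0, tf, row.names = FALSE)

## 'nrows', 'colClasses' and 'comment.char = ""' let read.table skip
## the expensive type guessing and comment scanning
df1 <- read.table(tf, header = TRUE, nrows = 1000,
                  colClasses = c("numeric", "factor", "logical"),
                  comment.char = "")
unlink(tf)
```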


Excellent!
Thanks a lot Brian (for this and much more)!

I wish you continued merry holidays!
Martin



Re: [Rd] Re: [R] Is k equivalent to k:k ?

2004-12-13 Thread Martin Maechler
>>>>> "RichOK" == Richard A O'Keefe <[EMAIL PROTECTED]>
>>>>> on Mon, 13 Dec 2004 10:56:48 +1300 (NZDT) writes:

RichOK> I asked:
>> In this discussion of seq(), can anyone explain to me
>> _why_ seq(to=n) and seq(length=3) have different types?

RichOK> Martin Maechler <[EMAIL PROTECTED]>
RichOK> replied: well, the explanation isn't hard: look at
RichOK> seq.default :-)

RichOK> That's the "efficient cause", I was after the "final
RichOK> cause".  That is, I wasn't asking "what is it about
RichOK> the system which MAKES this happen" but "why does
RichOK> anyone WANT this to happen"?

sure, I did understand you quite well -- I was trying to joke
and used the " :-) " to signal the joking ..

MM> now if that really makes your *life* simpler,
MM> what does that tell us about your life ;-) :-)

{ even more " :-) "  !! }

RichOK> It tells you I am revising someone else's e-book
RichOK> about S to describe R.  The cleaner R is, the easier
RichOK> that part of my life gets.

of course, and actually I do agree for my life too, 
since as you may believe, parts of my life *are* influenced by R.

Apologies for my unsuccessful attempts at joking..


RichOK> seq: from, to, by, length[.out], along[.with]

MM> I'm about to fix this (documentation, not code).

RichOK> Please don't.  There's a lot of text out there:
RichOK> tutorials, textbooks, S on-inline documentation, &c
RichOK> which states over and over again that the arguments
RichOK> are 'along' and 'with'.  

you meant
 'along' and 'length'

yes. And everyone can continue to use the abbreviated form as
I'm sure nobody will introduce a 'seq' method that uses
*multiple* argument names starting with "along" or "length"
(such that the partial argument name matching could become a problem).

RichOK> Change the documentation, and people will start
RichOK> writing length.out, and will that port to S-Plus?
RichOK> (Serious question: I don't know.)

yes, as Peter has confirmed already.

Seriously, I think we wouldn't even have started using the ugly
".with" or ".out" suffixes had it not been for S-PLUS
compatibility {and Peter has also given the explanation why there
*had* been a good reason for these suffixes in the past}.

Martin



Re: [Rd] Re: [R] Is k equivalent to k:k ?

2004-12-10 Thread Martin Maechler
>>>>> "Duncan" == Duncan Murdoch <[EMAIL PROTECTED]>
>>>>> on Fri, 10 Dec 2004 08:38:34 -0500 writes:

    Duncan> On Fri, 10 Dec 2004 09:32:14 +0100, Martin Maechler
Duncan> <[EMAIL PROTECTED]> wrote :

RichOK> If you want to pass seq(length=n) to a .C or
RichOK> .Fortran call, it's not helpful that you can't tell
RichOK> what the type is until you know n!  It would be nice
RichOK> if seq(length=n) always returned the same type.  I
RichOK> use seq(length=n) often instead of 1:n because I'd
RichOK> like my code to work when n == 0; it would make life
RichOK> simpler if seq(length=n) and 1:n were the same type.
>> 
>> now if that really makes your *life* simpler, what does that
>> tell us about your life  ;-) :-)
>> 
>> But yes, you are right.  All should return integer I think.

Duncan> Yes, it should be consistent, and integer makes sense here.

the R-devel version now does;  and so does  seq(along = <.>)

Also ?seq {or ?seq.default} now has the value section as

> Value:

>  The result is of 'mode' '"integer"' if 'from' is (numerically
>  equal to an) integer and, e.g., only 'to' is specified, or also if
>  only 'length' or only 'along.with' is specified.

which is correct {and I hope does not imply that it gives *all* cases of
an integer result}.
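With that fix in place, the following all agree (as they still do in today's R):

```r
typeof(seq(length.out = 0))  # "integer"
typeof(seq(length.out = 1))  # "integer"
typeof(1:5)                  # "integer"
```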

Martin



[Rd] Re: [R] Is k equivalent to k:k ?

2004-12-10 Thread Martin Maechler
I'm diverting to R-devel,  where this is really more
appropriate.

> "RichOK" == Richard A O'Keefe <[EMAIL PROTECTED]>
> on Fri, 10 Dec 2004 14:37:16 +1300 (NZDT) writes:

RichOK> In this discussion of seq(), can anyone explain to
RichOK> me _why_ seq(to=n) and seq(length=3) have different
RichOK> types?  

well, the explanation isn't hard:  look at  seq.default  :-)

RichOK> In fact, it's worse than that (R2.0.1):

>> storage.mode(seq(length=0))
RichOK> [1] "integer"
>> storage.mode(seq(length=1))
RichOK> [1] "double"

  { str(.) is shorter than  storage.mode(.) }

RichOK> If you want to pass seq(length=n) to a .C or
RichOK> .Fortran call, it's not helpful that you can't tell
RichOK> what the type is until you know n!  It would be nice
RichOK> if seq(length=n) always returned the same type.  I
RichOK> use seq(length=n) often instead of 1:n because I'd
RichOK> like my code to work when n == 0; it would make life
RichOK> simpler if seq(length=n) and 1:n were the same type.

now if that really makes your *life* simpler, what does that
tell us about your life  ;-) :-)

But yes, you are right.  All should return integer I think.

BTW --- since this is now on R-devel where we discuss R development:
  
 In the future, we really might want to have a new type,
 some "long integer" or "index" which would be used both in R
 and C's R-API for indexing into large objects where 32-bit
 integers overflow.
 I assume we will keep the R "integer" == C "int" == 32-bit int
 forever, but need something with more bits sooner rather than later.
 But in any case, by then, some things might have to change in
 R's (and the C R-API's) storage type for indexing.


RichOK> Can anyone explain to me why the arguments of seq.default are
RichOK> "from", "to", "by", "length.out", "along.with"
RichOK> ^
RichOK> when the help page for seq documents them as
RichOK> "from", "to", "by", "length", and "along"?


Well I can explain why this wasn't caught by R's builtin 
QA (quality assurance) checks:

The base/man/seq.Rd page uses both \synopsis{} and \usage{},
which allows putting things on the help page that are not checked
to coincide with the code...
I'm about to fix this (documentation, not code).

Martin



[Rd] Problems when printing *large* R objects

2004-12-06 Thread Martin Maechler
> "Simon" == Simon Urbanek <[EMAIL PROTECTED]>
> on Sun, 5 Dec 2004 19:39:07 -0500 writes:

Simon> On Dec 4, 2004, at 9:50 PM, [EMAIL PROTECTED]
Simon> wrote:
>> Source code leading to crash:
>> 
>> library(cluster)
>> data(xclara)
>> plot(hclust(dist(xclara)))
>> 
>> This leads to a long wait where the application is frozen
>> (spinning status bar going the entire time), a quartz
>> window displays without any content, and then the
>> following application crash occurs:

Simon> Please post this to the maintainers of the cluster
Simon> library (if at all),

Well, this is a *package*, not a library {please, please!}

And really, that has nothing to do with the 'cluster' package
(whose maintainer I am), as David only uses its data set.
hclust() and dist() are in the standard 'stats' package.

Btw, this can be accomplished more cleanly, i.e., without
attaching "cluster", by 

  data(xclara, package = "cluster")


Simon> this has nothing to do
Simon> with the GUI (behaves the same in X11).  The above
Simon> doesn't make much sense anyway - you definitely want
Simon> to use cutree before you plot that dendrogram ...

Indeed!  

A bit more explicitly for David:
xclara has 3000 observations,
i.e. 3000*2999/2 ~= 4.5 million distances {i.e., about 36
MBytes to keep in memory and about 48 million characters to
display with the default options(digits=7)}.
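For reference, those sizes are easy to recompute:

```r
n <- 3000
n_dist <- n * (n - 1) / 2   # 4498500 pairwise distances
bytes  <- 8 * n_dist        # stored as doubles: 35988000 bytes, ~36 MB
```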
I don't think you can really make much of printing these many
numbers onto your console as you try with

David> dist(xclara) -> xclara.dist

David> Works okay, though when attempting to display those results it 
freezes  
David> up the entire system, probably as  the result of memory  
David> threshing/starvation if the top results are any indicator:

David> 1661 R   8.5%  9:36.12   392   567   368M+ 3.88M   350M-  828M

"freezes up the entire system" when trying to print something
too large actually has something to do with the user interface.
AFAIK, it doesn't work 'nicely' in any R console,
but at least in ESS on Linux, it's just that one Emacs
displaying the "wrist watch" (and I can easily tell Emacs "not
to wait" by pressing "Ctrl-g").  Also, just trying it now {on a
machine with large amounts of RAM}: after pressing return, it at
least starts printing (displaying to the *R* buffer) after a bit
more than 1 minute .. and that does ``seem'' to never finish.
I can signal a break (via the [Signals] menu or C-c C-c in
Emacs), and still have to wait about 2-3 minutes until the output
stops; but it does stop, and I can work on .. {well, in theory; my
Emacs seems to have become v..e..r..y  s...l...ow}.  We only
recently had a variation on this theme in the ESS-help mailing
list, and several people were reporting they couldn't really
stop R from printing and had to kill the R process...

So after all, there's not quite a trivial problem "hidden" in
David's report :  What should happen if the user accidentally
wants to print a huge object to console... how to make sure R
can be told to stop.

And as I see it now, there's even something like an R "bug" (or
"design infelicity") here:

I've now done it again {on a compute server Dual-Opteron with 4
GB RAM}:  After stopping, via the ESS [Signals] [Break (C-c C-c)] menu,
   Emacs stops immediately, but R doesn't return quickly,
and rather, watching "top" {the good ol' unix process monitor} I
see R using 99.9% CPU and its memory footprint ("VIRT" and 
"SHR") increasing and increasing..., upto '1081m', a bit more
than 1 GB, when R finally returns (displays the prompt) after
only a few minutes --- but then, as said, this is on a remote
64bit machine with 4000 MB RAM.

BTW, when I then remove the 'dist' (and hclust) objects in R
and type gc() (or maybe do some other things in R), the R
process has about halved its apparent memory usage, to
500-something MB.

more stats: 
 during printing:  798 m
 after "break"  :  798, for ~5 seconds, then starting to
   grow; slowly (in my top, in steps of ~ 10m)
   upto 1076m
 then the R prompt is displayed and top shows "1081m".

It stays there , until I do  
   > gc()
where it goes down to VIRT 841m (RES 823m)
and after removing the large distance object, and gc() again,
it lowers to 820m (RES 790m) and stays there.

Probably this thread should be moved to R-devel -- and hence I
crosspost for once.

Martin



Re: [Rd] write.table inconsistency (PR#7403)

2004-12-04 Thread Martin Maechler
>>>>> "Duncan" == Duncan Murdoch <[EMAIL PROTECTED]>
>>>>> on Sat, 04 Dec 2004 09:17:26 -0500 writes:

    Duncan> On Sat, 4 Dec 2004 13:51:55 +0100, Martin Maechler
Duncan> <[EMAIL PROTECTED]> wrote:

>>>>>>> "Duncan" == Duncan Murdoch <[EMAIL PROTECTED]> on
>>>>>>> Sat, 4 Dec 2004 01:55:26 +0100 (CET) writes:
>>
Duncan> There's an as.matrix() call in write.table that
Duncan> means the formatting of numeric columns changes
Duncan> depending on whether there are any non-numeric
Duncan> columns in the table or not.
>>  yes, I think I had seen this (a while ago in the source
>> code) and then wondered if one shouldn't have used
>> data.matrix() instead of as.matrix() - something I
>> actually do advocate more generally, as "good programming
>> style".  It also does solve the problem in the example
>> here -- HOWEVER, the lines *before* as.matrix() have
>> 
>> ## as.matrix might turn integer or numeric columns into a complex matrix
>> cmplx <- sapply(x, is.complex)
>> if(any(cmplx) && !all(cmplx))
>>     x[cmplx] <- lapply(x[cmplx], as.character)
>> x <- as.matrix(x)
>> 
>> which makes you see that write.table(.) should also work
>> when the data frame has complex variables {or some other
>> kinds of non-numeric as you've said above} -- something
>> which data.matrix() can't handle  As soon as you have
>> a complex or a character variable (together with others)
>> in your data.frame, as.matrix() will have to return
>> "character" and apply format() to the numeric variables,
>> as well...
>> 
>> So, to make this consistent in your sense,
>> i.e. formatting of a column shouldn't depend on the
>> presence of other columns, we can't use as.matrix() nor
>> data.matrix() but have to basically replicate an altered
>> version of as.matrix inside write.table.
>> 
>> I propose to do this, but expose the altered version as
>> something like as.charMatrix(.)
>> 
>> and replace the 4 lines {of code in write.table()} above
>> by the single line as.charMatrix(x)

Duncan> That sounds good.  Which version of the formatting
Duncan> would you choose, leading spaces or not?  My
Duncan> preference would be to leave off the leading spaces,

mine too, very strong preference, actually:

The behavior should be such that each column is formatted  
  ___ as if it was the only column of that data frame ___
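A minimal sketch of what such an as.charMatrix() could look like (the name is the one proposed above; this is an illustration, not the version that went into write.table): each column is format()ed on its own, without leading spaces:

```r
as.charMatrix <- function(x) {
  ## format each column independently of the others;
  ## trim = TRUE drops the leading spaces discussed above
  vapply(x, function(col) format(col, trim = TRUE),
         character(nrow(x)))
}

as.charMatrix(data.frame(num = c(1, 10), chr = c("a", "bb")))
```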

Duncan> in the belief that write.table is usually used for
Duncan> data storage rather than data display, but it is
Duncan> sometimes used for data display (e.g. in
Duncan> utils::upgrade.packageStatus, which would not be
Duncan> affected by your choice).



Re: [Rd] regex to match word boundaries

2004-12-01 Thread Martin Maechler
> "Gabor" == Gabor Grothendieck <[EMAIL PROTECTED]>
> on Wed,  1 Dec 2004 21:05:59 -0500 (EST) writes:

Gabor> Can someone verify whether or not this is a bug.

Gabor> When I substitute all occurrence of "\\B" with "X" R
Gabor> seems to correctly place an X at all non-word
Gabor> boundaries (whether or not I specify perl) but "\\b"
Gabor> does not seem to act on all complement positions:

>> gsub("\\b", "X", "abc def") # nothing done
Gabor> [1] "abc def"
>> gsub("\\B", "X", "abc def") # as expected, I think
Gabor> [1] "aXbXc dXeXf"
>> gsub("\\b", "X", "abc def", perl = TRUE) # not as
>> expected
Gabor> [1] "abc Xdef"
>> gsub("\\B", "X", "abc def", perl = TRUE) # as expected
Gabor> [1] "aXbXc dXeXf"
>> R.version.string # Windows 2000
Gabor> [1] "R version 2.0.1, 2004-11-27"

I agree this looks "unfortunate".

Just to confirm: 
1) I get the same on a Linux version
2) the real perl does behave differently and as
   you (and I) would have expected:

 $ echo 'abc def'| perl -pe 's/\b/X/g'
 XabcX XdefX
 $ echo 'abc def'| perl -pe 's/\B/X/g'
 aXbXc dXeXf


Also, from what I see, "\b" should behave the same independently
of perl = TRUE or FALSE.
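(In a build where \b behaves as in Perl - later R versions do - the check looks like this; the expected \B output is the one shown above:)

```r
gsub("\\B", "X", "abc def", perl = TRUE)  # "aXbXc dXeXf"
gsub("\\b", "X", "abc def", perl = TRUE)  # should match the Perl output above
```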

--
Martin



[Rd] a better "source(echo=TRUE)" {was "....how to pause...."}

2004-11-30 Thread Martin Maechler
> "Duncan" == Duncan Murdoch <[EMAIL PROTECTED]>
> on Sun, 28 Nov 2004 10:25:24 -0500 writes:

Duncan> <>
Duncan> <>

Duncan> We already have code to source() from the clipboard, and it could
Duncan> address the problems above, but:

Duncan> - Source with echo=T doesn't echo, it deparses, so some comments are
Duncan> lost, formatting is changed, etc.

yes, and we would have liked to have an alternative "source()"
for a *very* long time...
Examples where I "hate" the non-echo (i.e. the loss of all
comments and own-intended formatting) is when you use it for
demos, etc, notably in R's own  demo() and example() functions.

But doing this might be trickier than it first seems:
Of course you can readLines() the source file and writeLines()
them to whatever your console is. The slightly difficult part
is to "see" which chunks to ``send to R'', i.e. to parse() and eval().
The basic problem is to see when expressions are complete.

Maybe we should / could think about enhancing parse() {or a new
function with extended behavior} such that it would not only
return the parse()d expressions, but also indices (byte or even
line counters) to the source text, indicating where each of the
expression started and ended.

That way I could see a way to proceed.
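This is roughly what source references later provided: with keep.source = TRUE, parse() attaches a "srcref" attribute recording where each top-level expression starts and ends. A sketch of the idea:

```r
exprs <- parse(text = "x <- 1  # a comment\ny <- x + 1",
               keep.source = TRUE)

## one srcref per top-level expression; elements 1 and 3 of each
## srcref are its first and last source line
refs <- attr(exprs, "srcref")
length(refs)                     # 2 expressions
as.integer(refs[[1]])[c(1, 3)]   # lines spanned by 'x <- 1'
```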

Martin

Duncan> <>
Duncan> <>



[Rd] pausing between plots - waiting for graphics input

2004-11-30 Thread Martin Maechler
{I have changed the subject to match this interesting side thread}

> "TL" == Thomas Lumley <[EMAIL PROTECTED]>
> on Mon, 29 Nov 2004 09:15:27 -0800 (PST) writes:

TL> On Sun, 28 Nov 2004, Duncan Murdoch wrote:
>> 
>> Another that deals only with the original graphics problem is to have
>> par(ask=T) wait for input to the graphics window, rather than to the
>> console.  This has the advantage that the graphics window probably has
>> the focus, so a simple Enter there could satisfy it.
>> 

TL> I like this one.  I have often found it irritating that
TL> I have to switch the focus back to the console (which
TL> means uncovering the console window) to get the next
TL> graph.

I agree. 
Note that this is not windows-specific really.  Rather, this
should be applicable to all devices which support minimal mouse
interaction, i.e. at least those that support locator(),
ideally just all those listed in  dev.interactive

However, I'm not at all sure that this should be done with  
par(ask = TRUE)  which works on all devices, not just
interactive ones.
Rather, we should probably add a new par() {and gpar() for grid!}
option for the new feature,
maybe something like [g]par(wait_mouseclick = TRUE)

Martin



Re: [Rd] \link{} to help pages in Debian

2004-11-29 Thread Martin Maechler
> "Iago" == Iago Mosqueira <[EMAIL PROTECTED]>
> on Mon, 29 Nov 2004 08:41:03 + writes:

Iago> Hello,
Iago> In my Debian 3.0 systems, packages are installed in two different
Iago> places, namely /usr/lib/R/library and /usr/local/lib/R/site-library,
Iago> depending on whether they come from Debian packages or CRAN ones.
Iago> Help pages for my own packages, installed in the second location,
Iago> cannot find help pages from, for example, the base package via
Iago> \link{}. I also tried specifying the package with \link[pkg]{page}.

Iago> Is the only solution to force the system to use a single
Iago> library folder?

not at all!
We have been working with several libraries "forever",
and I don't think I have ever seen your problem.

For instance, I never install extra packages into the "standard"
library (the one where "base" is in); have all CRAN packages in
one library, bioconductor in another library, etc,etc.

Iago> Can I force the help system to look in both places?

Actually, you forgot to specify which interface to the help system
you are using.  But I assume you mean the help.start()
{web-browser "HTML"} one (which I very rarely use, since ESS and
"C-c C-v" is faster; to follow links in ESS help buffers, after
selection, often "h" is sufficient -- ah, that reminds me
of an ESS improvement I've wanted to implement...)

For me, help.start() works fine including links between pages
from packages in different libraries.

Martin



[Rd] boxplot() defaults {was "boxplot in extreme cases"}

2004-11-08 Thread Martin Maechler

AndyL> Try:

AndyL> x <- list(x1=rep(c(0,1,2),c(10,20,40)), x2=rep(c(0,1,2),c(10,40,20)))
AndyL> boxplot(x, pars=list(medpch=20, medcex=3))

AndyL> (Cf ?bxp, pointed to from ?boxplot.)

Good! Thank you, Andy.

However,
this is not the first time it had crossed my mind that R's
default settings of drawing boxplot()s are not quite ok -- and
that's why I've diverted to R-devel.

Keeping Tufte's considerations in mind (and not really wanting
to follow S-PLUS), shouldn't we consider slightly changing R's
boxplot()ing such that

   boxplot(list(x1=rep(c(0,1,2),c(10,20,40)), x2=rep(c(0,1,2),c(10,40,20))))

will *not* give two identically looking boxplots?
Also, the median should be emphasized more by default anyway.
{The lattice function  bwplot() does it by only drawing a large
 black ball as in Andy's example (and not drawing a line at all)}

One possibility I'd see is to use a default 'medlwd = 3'
either in boxplot() or in bxp(.) and hence, what you currently get by

   boxplot(list(x1=rep(c(0,1,2),c(10,20,40)), x2=rep(c(0,1,2),c(10,40,20))),
   medlwd=3)

would become the default plotting in boxplot().
Of course a smaller value "medlwd=2" would work too, but I'd
prefer a bit more (3).

Martin


> From: Erich Neuwirth
> 
> I noticed the following:
> the 2 datasets
> rep(c(0,1,2),c(10,20,40)) and
> rep(c(0,1,2),c(10,40,20))
> produce identical boxplots despite the fact that the medians are 
> different. The reason is that the median in one case 
> coincides with the
> first quartile, and in the second case with the third quartile.
> Is there a recommended way of displaying the median visibly in these 
> cases? Setting notch=TRUE displays the median, but does look strange.



Re: [Rd] idea (PR#7345)

2004-11-05 Thread Martin Maechler

DanB> I really don't understand the negative and condescending
DanB> culture that seems to pervade R-dev.

It's pervading in replies to *Bug reports* about non-bugs!!
I thought you had read in the meantime what R bug reports
should be, and that the things you have been posting as bug
reports were posted **WRONGLY**.

PLEASE:  
  1) All these suggestions were perfectly fit to be posted to R-devel
  2) All of them were completely NOT fit to be sent as bug reports

Martin



Re: [Rd] Bug report (PR#7341)

2004-11-05 Thread Martin Maechler
>>>>> "dan" == dan  <[EMAIL PROTECTED]>
>>>>> on Thu,  4 Nov 2004 19:08:08 +0100 (CET) writes:

dan> Full_Name: Dan B Version: na OS: na Submission from:
dan> (NULL) (80.6.127.185)


dan> I can't log into the bug tracker (I can't find where to
dan> register / login).

[that's not what you should do. 
 Have you read on this in the FAQ or  help(bug.report) ?
 Please, please, do.
]

dan> In this way I can't add the following context diff
dan> (hopefully in the right order) for my changes to the
dan> matrix.Rd...

dan> Hmm... I guess this should be a separate report
dan> anyway...

No, this is really not a bug report __AT ALL__

You had all this long discussion about how the documentation 
can/could/should/{is_hard_to} be improved  and end up sending
a *bug report* ?
Really!

Whereas I value your contribution for improving the matrix help
page -- and I do think both changes are worthwhile ---
there is no bug, and hence a bug report is *WRONG*!

Sending this to R-devel [instead! - not automagically via the
bug report] would have been perfectly fine and helpful...

dan> The first diff explains how the dimnames list should
dan> work, and the second diff gives an example of using the
dan> dimnames list. (no equivalent example exists, and where
dan> better than the matrix man page to show this off).

agreed.

I'll put in a version of your proposed improvement,
but please do try more to understand what's appropriate for bug
reports.

Regards,
Martin Maechler, ETH Zurich



[Rd] Getting the bare R version number {was "2.0.1 buglets"}

2004-11-05 Thread Martin Maechler
> "PaulG" == Paul Gilbert <[EMAIL PROTECTED]>
> on Thu, 04 Nov 2004 20:26:04 -0500 writes:


>> If you want the R version, that is 'R --version'.
>> 
>> 
PaulG> I've been using this, but to make it into a file name
PaulG> I need to do some stripping out of the extra
PaulG> stuff. (And tail and head on Solaris do not use -n
PaulG> whereas Linux wants this, so it becomes difficult to
PaulG> be portable.) Is there a way to get just "R-2.0.1"
PaulG> from R or R CMD something?

yes, by applying Brian's advice and good ol' "sed" :

  R --version | sed -n '1s/ /-/; 1s/ .*//p'
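From within R itself, the same string can be assembled without any shell tools (assuming the usual "R-major.minor" form is wanted):

```r
## build e.g. "R-2.0.1" from the running R's own version info
ver <- paste("R-", R.version$major, ".", R.version$minor, sep = "")
ver
```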

Martin



Re: [Rd] 2.0.1 buglets

2004-11-05 Thread Martin Maechler
> "PD" == Peter Dalgaard <[EMAIL PROTECTED]>
> on 04 Nov 2004 23:17:45 +0100 writes:

PD> Prof Brian Ripley <[EMAIL PROTECTED]> writes:
>> On Thu, 4 Nov 2004, Paul Gilbert wrote:
>> 
>> > With R-2.0.1 (patched) on Linux rhlinux 2.4.21-4.ELsmp
>> > 
>> > when I configure get > ...  > checking whether C
>> runtime needs -D__NO_MATH_INLINES... no > checking for
>> xmkmf... /usr/bin/X11/xmkmf > Usage: which [options] [--]
>> programname [...]  > Options: --version, -[vV] Print
>> version and exit successfully.  > --help, Print this help
>> and exit successfully.  > --skip-dot Skip directories in
>> PATH that start with a dot.  > ...
>> > 
>> > but everything seems to configure and make ok. Should
>> this message be > expect or is this a bug?
>> 
>> It is unexpected.  Is it new in 2.0.1 beta?  You have
>> told us your kernel, not your distro.  This looks like a
>> bug, but not in R.

PD> I've seen it whizz by occasionally but never got around
PD> to investigate.

me too. IIRC, also in some of my current Linux setups.
I think it's showing unfortunate behavior of configure ..
(ie. a "buglet" in the configure tools used to produce 'configure',
 and not in our 'configure.ac' source).

PD>  As said, it doesn't actually affect the result of configure.

my experience as well.
Martin



Re: OT: Debian/Ubuntu on amd64 (Re: [Rd] 64 Bit)

2004-10-22 Thread Martin Maechler
> "Dirk" == Dirk Eddelbuettel <[EMAIL PROTECTED]>
> on Fri, 22 Oct 2004 09:45:05 -0500 writes:

Dirk> On Fri, Oct 22, 2004 at 12:24:56PM +0100, Prof Brian
Dirk> Ripley wrote:
>> If you want a prebuilt version you are out of luck except
>> for Debian Linux on Alpha or ia64, from a quick glance.

Dirk> amd64 as well. It is not "fully official" but almost,
Dirk> see http://www.debian.org/ports/amd64/

yes, indeed.
We have one compute server (dual opteron) that runs a nice
64-bit Debian {"sid" aka "unstable" though} and for which I've used
'aptitude' (the "new" kid on the block replacement for 'apt-get')
to install r-base-recommended (and more) -- all prebuilt
[Of course I still mainly work with hand-compiled versions of R].

Dirk> <..>
Dirk> <..>

(that note about "Ubuntu" was very interesting to read;  thanks Dirk!)

Martin



Re: [Rd] error in plot.dendrogram (PR#7300)

2004-10-21 Thread Martin Maechler
> "Eryk" == Eryk Wolski <[EMAIL PROTECTED]>
> on Thu, 21 Oct 2004 13:41:29 +0200 (CEST) writes:

Eryk> Hi,

Eryk> hres <- hclust(smatr,method="single")
Eryk> hresd<-as.dendrogram(hres)
Eryk> as.dendrogram(hres)
Eryk> `dendrogram' with 2 branches and 380 members total, at height 2514.513 
Eryk> plot(hresd,leaflab="none") #<-error here.

definitely no error here... maybe your graphics window is too
small or otherwise unable to show all the leaf labels?

Eryk> #the plotted dendrogram is incomplete. The x axis is not drawn.

ha! and why should this be a bug?
Have you RTFHP and looked at its example??
There's never an x-axis in such a plot!

[You really don't want an x-axis overlaid over all the labels]

Eryk> #The interested reader can download the

Eryk> save(hresd,file="hres.rda")

Eryk> #from the following location
Eryk> www.molgen.mpg.de/~wolski/hres.rda

If you send a bug report (and please rather don't..),
it should be reproducible; i.e., I've just wasted my time on

dFile <- "/u/maechler/R/MM/Pkg-ex/stats/wolski-hres.rda"
if(!file.exists(dFile))
    download.file(url = "http://www.molgen.mpg.de/~wolski/hres.rda", destfile = dFile)
load(dFile)
hresd
plot(hresd)



If you look at this plot I hope you'll see that "single" has
been an extremely unhelpful clustering method for this data / dissimilarities,
and you'd rather try other methods than wish for an
x-axis.

If you really want one (just to see that it doesn't make sense),
you can always add
axis(1, fg = "red")

Martin



Re: [Rd] rw2000dev: problems with library(foreign)

2004-09-24 Thread Martin Maechler
> "Kjetil" == Kjetil Brinchmann Halvorsen <[EMAIL PROTECTED]>
> on Fri, 24 Sep 2004 10:10:39 -0400 writes:

Kjetil> I get the following
>> library(foreign)
Kjetil> Error in namespaceExport(ns, exports) : undefined
Kjetil> exports: write.foreign Error in library(foreign) :
Kjetil> package/namespace load failed for 'foreign'

Kjetil> with rw2000dev as of 2004-09-17

Does
  > system.file(package="foreign")

give the same initial path as
  > system.file(package="base")

?

If yes, I cannot help further;
if no,  this explains the problem:  you're picking up a wrong
version of the  foreign package.
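As a hedged sketch, that comparison can be scripted directly (package names as in the report):

```r
## Compare the library trees the two packages come from; differing
## paths would point to a stray, outdated copy of 'foreign'
## shadowing the one shipped with this R version
p.foreign <- dirname(system.file(package = "foreign"))
p.base    <- dirname(system.file(package = "base"))
identical(p.foreign, p.base)  # FALSE would explain the load failure
```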

Regards,
Martin



Re: [Rd] Cannot build cluster_1.9.6 under R 2.0.0 beta Sep 21

2004-09-24 Thread Martin Maechler
> "Dirk" == Dirk Eddelbuettel <[EMAIL PROTECTED]>
> on Thu, 23 Sep 2004 21:31:50 -0500 writes:

Dirk> And another follow-up -- this may well be related to
Dirk> cluster as mgcv-1.1.2 builds fine.

Well, thanks, Dirk, for all these.
As maintainer of cluster, I should be interested..

But then, I just see I did successfully "R CMD check cluster" on 
Sep 21 (your snapshot's date) on several Linux platforms,
and have just now tried to install from the local Sep.24 snapshot
i.e. ftp://ftp.stat.math.ethz.ch/Software/R/R-devel_2004-09-24.tar.gz
(which needs  'tools/rsync-recommended' after unpacking) without
a problem.

I'll try the /src/base-prerelease/R-2.0.0-beta-20040924.tgz 
one, subsequently.

Are you sure it's not a problem just with your copy of
something?

Martin



Re: [Rd] algorithm reference for sample()

2004-09-24 Thread Martin Maechler
Hi Vadim,

>>>>> "Vadim" == Vadim Ogranovich <[EMAIL PROTECTED]>
>>>>> on Thu, 23 Sep 2004 17:48:45 -0700 writes:

Vadim> Hi, Don't know if it belongs to r-devel or r-help,
Vadim> but since I am planning to alter some of R's internal code

i.e., you will propose a patch to the R sources eventually ?

Vadim>  I am sending it here.

good choice.  Also, you are talking about
internal (non-API) C code of R, which I would deem
inappropriate for R-help.

Vadim> The existing implementation of the sample() function,
Vadim> when the optional 'prob' argument is given, is quite
Vadim> inefficient. The complexity is O(sampleSize *
Vadim> universeSize), see ProbSampleReplace() and
Vadim> ProbSampleNoReplace() in random.c. This makes the
Vadim> function impractical for the vector sizes I use. 

I'm interested: what problem are you solving where sample() is
the bottleneck (rather than what you *do* with the sample)?

Vadim> I want to re-code these functions and I "think" I can
Vadim> come up with a more efficient algorithm.

I agree. It's a kind of table lookup, that definitely can be
made faster e.g. by bisection ideas.
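For concreteness, a hedged sketch of the bisection idea for the with-replacement case (the function name is illustrative; findInterval() performs the binary search). Compared with the O(sampleSize * universeSize) linear scan in ProbSampleReplace(), this is roughly O(m + n log m) for n draws from m categories:

```r
## Weighted sampling with replacement via the cumulative
## distribution plus binary search (inverse-CDF method)
sample.prob <- function(n, prob) {
  cp <- cumsum(prob) / sum(prob)   # cumulative probabilities in (0, 1]
  findInterval(runif(n), cp) + 1L  # bisect each uniform draw
}
set.seed(1)
table(sample.prob(1000, c(0.1, 0.2, 0.7)))
```

The without-replacement case needs extra bookkeeping (updating the cumulative weights after each draw), which is presumably where the real algorithmic work lies.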

Vadim> However before I go and reinvent the wheel I wonder if there
Vadim> is a published description of an efficient sampling
Vadim> algorithm with user-specified probabilities?

I've got some ideas, but maybe would first want to get a reply to the
current ones.

Vadim> Thanks, Vadim

Vadim>  [[alternative HTML version deleted]]
^^^^^^^^

(you *did* read the posting guide or just the general
 instructions on http://www.R-project.org/mail.html  ?
)

Martin Maechler



Re: [Rd] attaching to position 1

2004-09-23 Thread Martin Maechler
>>>>> "PatBurns" == Patrick Burns <[EMAIL PROTECTED]>
>>>>> on Wed, 22 Sep 2004 18:30:10 +0100 writes:

PatBurns> If an attempt is made to attach to position 1, it
PatBurns> appears to work (not even a warning) but in fact
PatBurns> it doesn't work as many would expect.  "search"
PatBurns> thinks that it gets placed in position 2, but
PatBurns> nothing is actually there (according to "ls").

PatBurns> This is guaranteed to be confusing (and annoying)
PatBurns> to people who are used to attaching to position 1
PatBurns> in S-PLUS.

yes; thanks for bringing this up!

PatBurns> I'm not clear on all of the implications of
PatBurns> changing this, but my first inclination would be
PatBurns> to make it an error to attach to position 1.  The
PatBurns> help file says that you can't do it.

and has done so for a long time AFAIR.

PatBurns> At the very least there should be a warning .  My
PatBurns> guess is that it is rare for someone to attach to
PatBurns> position 1 and not attempt to modify what is being
PatBurns> attached.

Hence (together with the arguments further above),
I think an error would be more appropriate
[if there's only a warning and the user's code continues on 
 the wrong assumption, more problems lie ahead].

OTOH, in the current "beta" phase I can think of a case where an
error would be too "hard": 
The worst I can see is an R script that has attach(*, pos=1)
which doesn't attach at all {as you say, it *seems* to attach to
position 2 but doesn't really provide the object}, but for some
reason still continues to produce reasonable things.

Hence, for 2.0.0 in "deep freeze", I'd propose to give a warning only.
However, we shouldn't attach the database to search()[2] only "seemingly",
as this could be a problem if a user's script does a detach(..) later.
I.e., we should attach() to pos=2 *properly* (instead of
"seemingly").

At the latest for 2.1.0, we should rather make the warning an error.

In any case, this looks like a very simple fix (to the C
source);

Martin Maechler





>> attach('foo.RData')
>> search()
PatBurns> [1] ".GlobalEnv"        "file:foo.RData"    "package:methods"
PatBurns> [4] "package:stats"     "package:graphics"  "package:grDevices"
PatBurns> [7] "package:utils"     "package:datasets"  "Autoloads"
PatBurns> [10] "package:base"
>> ls(2)
PatBurns> [1] "jj"
>> jj
PatBurns> [1] 1 2 3 4 5 6 7 8 9
>> detach()
>> search()
PatBurns> [1] ".GlobalEnv"        "package:methods"   "package:stats"
PatBurns> [4] "package:graphics"  "package:grDevices" "package:utils"
PatBurns> [7] "package:datasets"  "Autoloads"         "package:base"
>> attach('foo.RData', pos=1)
>> search()
PatBurns> [1] ".GlobalEnv"        "file:foo.RData"    "package:methods"
PatBurns> [4] "package:stats"     "package:graphics"  "package:grDevices"
PatBurns> [7] "package:utils"     "package:datasets"  "Autoloads"
PatBurns> [10] "package:base"
>> ls(2)
PatBurns> character(0)

PatBurns> _  
PatBurns> platform i386-pc-mingw32
PatBurns> arch i386   
PatBurns> os   mingw32
PatBurns> system   i386, mingw32  
PatBurns> status   Under development (unstable)
PatBurns> major2  
PatBurns> minor0.0
PatBurns> year 2004   
PatBurns> month09 
PatBurns> day  17 
PatBurns> language R



Re: [Rd] "Namespace dependencies not required" message

2004-09-21 Thread Martin Maechler
> "GB" == Göran Broström <[EMAIL PROTECTED]>
> on Mon, 20 Sep 2004 22:28:08 +0200 writes:

GB> On Mon, Sep 20, 2004 at 04:05:52PM -0400, Warnes,
GB> Gregory R wrote:
>>  I'm still working to add namespace support to some of my
>> packages. I've removed the 'Depends' line from the
>> DESCRIPTION file, and created an appropriate NAMESPACE
>> files.
>> 
>> Strangely, whenever I use 'importFrom(package, function)'
>> R CMD check generates "Namespace dependencies not
>> required" warnings .  Without the import statements, the
>> warning does not occur, but the code tests fail with the
>> expected object not found errors.
>> 
>> This occurs with both R 1.9.1 and R 2.0.0.
>> 
>> So, are these errors just bogus and should I just ignore
>> them, or is there something I've done wrong with
>> the NAMESPACE or elsewhere?

GB> I had the same problem. I think you must keep the
GB> 'Depends' field in DESCRIPTION file.

yes, definitely.

And since you are (rightly, thank you!) working with 2.0.0-beta,
please consider
   http://developer.r-project.org/200update.txt

which mentions more things on 'Depends:', 'Suggests:' etc.
Also, don't forget to use 'Writing R Extensions' of 2.0.0.

Martin



RE: [Rd] R-2.0.0 Install problem for pkg bundle w inter-dependent namespaces

2004-09-20 Thread Martin Maechler
>>>>> "Greg" == Warnes, Gregory R <[EMAIL PROTECTED]>
>>>>> on Mon, 20 Sep 2004 15:10:32 -0400 writes:

    >> -Original Message- From: Martin Maechler
>> [mailto:[EMAIL PROTECTED]
Greg> [...]  So, what is the proper way to handle this?  Is
Greg> there some way to manually specify the package install
Greg> order?
>>  Well, isn't the order in the 'Contains:' field of the
>> bundle DESCRIPTION file used?  If not, please consider
>> sending patches for src/scripts/INSTALL.in
>> 

Greg> OK, that's the simple thing that I had been
Greg> overlooking. Changing the the Contains line to provide
Greg> the packages in the order that they should be
Greg> installed fixed the problem.

Greg> May I suggest that the significance of the ordering in
Greg> the Contains: field be added to the (extremely brief)
Greg> description of bundles in "Writing R Extensions"?

Greg> Perhaps change the text to:

Greg>   The 'Contains' field lists the packages, which
Greg> should be contained in separate subdirectories with
Greg> the names given.  During buiding and installation,
Greg> packages will be installed in the order specified.  Be
Greg> sure to order this list so that dependencies are
Greg> appropriately met.

Greg>   The packages contained in a bundle are standard
Greg> packages in all respects except that the 'DESCRIPTION'
Greg> file is replaced by a 'DESCRIPTION.in' file which just
Greg> contains fields additional to the 'DESCRIPTION' file
Greg> of the bundle, for example ...

Good idea, thank you!
I just committed this (with a typo corrected).

Martin



Re: [Rd] Namespace problem

2004-09-20 Thread Martin Maechler
>>>>> "GB" == Göran Broström <[EMAIL PROTECTED]>
>>>>> on Mon, 20 Sep 2004 11:00:57 +0200 writes:

GB> On Mon, Sep 20, 2004 at 10:43:44AM +0200, Martin Maechler wrote:
>> >>>>> "GB" == Göran Broström <[EMAIL PROTECTED]>
>> >>>>> on Sun, 19 Sep 2004 18:51:49 +0200 writes:

GB> [...]

GB> I've checked that section, but I am not adding methods to generics, 
>> 
>> sure? 
>> Aren't you trying to export  mlreg.fit
>> which looks like a 'fit' S3 method for the 'mlreg' generic?

GB> But it isn't. I just have found '.' to be a convenient separator in
GB> variable names, since '_' (my C favourite) wasn't available. So what you
GB> are suggesting

no!! I'm not.

GB> that I have to change all the variable names with dots in
GB> them. Or add 'S3method(...' for each of them. I guess that the former is
GB> preferable.  

no, really neither should be required.

We do encourage not using "." for new function names because of
the reason above, but it's definitely not a requirement.
In the case where  'foo'  is an S3 generic function name,
we however recommend quite strongly not to use 
   'foo.bar'
as function name since it looks "too much" like an S3 method.
Is this the case for you?

GB> But how is this problem connected to using C/Fortran code?

only via "namespace magic".

E.g., for packages with namespaces and R 2.0.0, 
 it will become recommended to *NOT* use the 'PACKAGE = "foobar"'
 argument to .C() or .Fortran() calls, because then the package
 version can be taken into account,
since NEWS for 2.0.0 has

>> C-LEVEL FACILITIES
>> 
>> oThe PACKAGE argument for .C/.Call/.Fortran/.External can be
>>  omitted if the call is within code within a package with a
>>  namespace.  This ensures that the native routine being called
>>  is found in the DLL of the correct version of the package if
>>  multiple versions of a package are loaded in the R session.
>>  Using a namespace and omitting the PACKAGE argument is
>>  currently the only way to ensure that the correct version is
>>  used.



Re: [Rd] R-2.0.0 Install problem for pkg bundle w inter-dependent namespaces

2004-09-20 Thread Martin Maechler
>>>>> "Greg" == Warnes, Gregory R <[EMAIL PROTECTED]>
>>>>> on Fri, 17 Sep 2004 14:18:29 -0400 writes:

Greg> I have a revised version of the gregmisc package,
Greg> which I've converted into a package bundle each of
Greg> which has a namespace: gplots, gmodels, gdata,
Greg> gtoools.  Of course, there are interdependencies among
Greg> these namespaces:

Greg> gsun374: /tmp [11]> cd gregmisc/
Greg> gsun374: gregmisc [12]> grep import */NAMESPACE
Greg>   gdata/NAMESPACE:importFrom(gtools, odd, invalid, mixedsort)
Greg> gmodels/NAMESPACE:importFrom(MASS, ginv)
Greg>  gplots/NAMESPACE:importFrom(gtools, invalid)
Greg>  gplots/NAMESPACE:importFrom(gtools, odd)
Greg>  gplots/NAMESPACE:importFrom(gdata, nobs)

since nobody else has answered yet (and a considerable portion
of R-core is traveling this week) :

If I understand correctly, your basic package 'gtools' and the
dependency you need is

 gplots --> gdata --> gtools
      \______________/

Have you made sure to use the proper  'Depends: ' entries in
the DESCRIPTION(.in) files of your bundle packages ?

This works fine if the packages are *not* in a bundle, right?

Greg> Under R-1.9.1, this package bundle passes R CMD check
Greg> and installs happily.  However, under yesterday's
Greg> R-2.0.0-alpha, the package fails to install (& hence
Greg> pass CMD CHECK) with the error

Greg> ** preparing package for lazy loading
Greg> Error in loadNamespace(i[[1]], c(lib.loc, .libPaths()), keep.source)
Greg> : 
Greg> There is no package called 'gdata'
Greg> Execution halted
Greg> ERROR: lazy loading failed for package 'gplots'

Greg> because the gdata package is the last in the bundle to
Greg> be installed, so it is not yet present.

Greg> So, what is the proper way to handle this?  Is there
Greg> some way to manually specify the package install order?

Well, isn't the order in the 'Contains:' field of the bundle
DESCRIPTION file used?  
If not, please consider sending patches for  
src/scripts/INSTALL.in

There are not too many bundles AFAIK, and conceptually
(in spite of the recommended VR one) the improved package
management tools that we (and the bioconductor project) have
been adding to R for a while now
really aim for "R package objects" and clean version /
dependency handling of individual packages in many different contexts.

If bundle installation etc could rely entirely on the package
tools, bundles would "work automagically".  But probably, for
this a bundle would have to be treated as a "package repository"
which it isn't currently AFAIK.

Regards,
Martin Maechler



Re: [Rd] Namespace problem

2004-09-20 Thread Martin Maechler
>>>>> "GB" == Göran Broström <[EMAIL PROTECTED]>
>>>>> on Sun, 19 Sep 2004 18:51:49 +0200 writes:

GB> Now I try to add some C and Fortan code to my package, so the NAMESPACE
GB> file is

GB> useDynLib(eha)
GB> importFrom(survival, Surv)
GB> export(mlreg.fit, risksets)

GB> but I get 

GB> .
GB> * checking R files for library.dynam ... OK
GB> * checking S3 generic/method consistency ... WARNING
GB> Error in .try_quietly({ : Error in library(package, lib.loc = lib.loc, 
character.only = TRUE, verbose = FALSE) :
GB> package/namespace load failed for 'eha'
GB> Execution halted
GB> See section 'Generic functions and methods' of the 'Writing R Extensions'
GB> manual.
GB> .

GB> I've checked that section, but I am not adding methods to generics, 

sure? 
Aren't you trying to export  mlreg.fit
which looks like a 'fit' S3 method for the 'mlreg' generic?

In that case you need to add
S3method(mlreg, fit)
to the NAMESPACE file.

GB> I'm not writing new generic functions.

GB> If I remove useDynLib(eha), I get no errors or warnings, except that the
GB> example in mlreg.fit.Rd doesn't work (of course).

GB> So what's wrong?
 
(see above)?

Regards,
Martin

Martin Maechler <[EMAIL PROTECTED]> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16Leonhardstr. 27
ETH (Federal Inst. Technology)  8092 Zurich SWITZERLAND
phone: x-41-1-632-3408  fax: ...-1228   <><



Re: [Rd] cor() fails with big dataframe

2004-09-16 Thread Martin Maechler
> "Mayeul" == Mayeul KAUFFMANN <[EMAIL PROTECTED]>
> on Thu, 16 Sep 2004 01:23:09 +0200 writes:

Mayeul> Hello,
Mayeul> I have a big dataframe with *NO* na's (9 columns, 293380 rows).

Mayeul> # doing
Mayeul> memory.limit(size = 10)
Mayeul> cor(x)
Mayeul> #gives
Mayeul> Error in cor(x) : missing observations in cov/cor
Mayeul> In addition: Warning message:
Mayeul> NAs introduced by coercion

"by coercion" means there were other things *coerced* to NAs!

One of the biggest problems with R users (and other S users for
that matter) is that when they get an error, they throw their hands
up and ask for help - assuming the error message to be
non-intelligible.  Whereas it *is* intelligible (slightly ? ;-)
more often than not ...


Mayeul> #I found the obvious workaround:
Mayeul> COR <- matrix(rep(0, 81),9,9)
Mayeul> for (i in 1:9) for (j in 1:9) {if (i>j) COR[i,j] <- cor (x[,i],x[,j])}
Mayeul> #which works fine, with no warning

Mayeul> #looks like a "cor()" bug.

quite improbable.

The following works flawlessly for me,
and the only thing that takes a bit of time is the construction of
x, not cor():

  > n <- 300000
  > set.seed(1)
  > x <- as.data.frame(matrix(rnorm(n*9), n,9))
  > cx <- cor(x)
  > str(cx)
   num [1:9, 1:9]  1.0 -0.00039  0.00113  0.00134 -0.00228 ...
   - attr(*, "dimnames")=List of 2
..$ : chr [1:9] "V1" "V2" "V3" "V4" ...
..$ : chr [1:9] "V1" "V2" "V3" "V4" ...


Mayeul> #I checked absence of NA's by
Mayeul> x <- x[complete.cases(x),]
Mayeul> summary(x)
Mayeul> apply(x, 2, function(x) sum(is.na(x)))

Mayeul> #I use R 1.9.1

What does
sapply(x, function(u)all(is.finite(u)))
return ?
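In the same spirit, a hedged sketch that flags the offending column(s) directly; note that complete.cases() catches NA/NaN but not Inf, factors, or character columns:

```r
## Which columns of the data frame x are not entirely finite numeric?
## Any name printed here is a candidate source of the
## "NAs introduced by coercion" warning.
bad <- !sapply(x, function(u) is.numeric(u) && all(is.finite(u)))
names(x)[bad]
```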



Re: [Rd] is it a typo?

2004-09-14 Thread Martin Maechler
> "AndyL" == Liaw, Andy <[EMAIL PROTECTED]>
> on Tue, 14 Sep 2004 10:28:31 -0400 writes:

AndyL> In ?options:
AndyL> 'warning.expression': an R code expression to be called if a
AndyL> warning is generated, replacing the standard message.  If
AndyL> non-null is called irrespective of the value of option
AndyL> 'warn'.

AndyL> Is there a missing `it' between `non-null' and `is'?

yes, thank you -- now fixed.

Martin



Re: [Rd] reorder [stats] and reorder.factor [lattice]

2004-09-13 Thread Martin Maechler
> "DeepS" == Deepayan Sarkar <[EMAIL PROTECTED]>
> on Mon, 13 Sep 2004 14:54:52 -0500 writes:

DeepS> Before it's too late for R 2.0.0, do we have a final decision yet on 
DeepS> having a reorder method for "factor" in stats?

Since the thread is quite a bit old (and I was on vacation back then),
could you summarize what you think about it?

When skimming through the thread I got the impression that, yes,
it was worthwhile to "centralize" such a method in 'stats' rather
than have slightly incompatible versions in different
other packages.
This is of tangential interest to me as I have been slightly
involved with reorder.dendrogram()

Regards,
Martin



Re: [Rd] author field in Rd files

2004-08-27 Thread Martin Maechler
Since nobody else has reacted yet:

>>>>> "Timothy" == Timothy H Keitt <[EMAIL PROTECTED]>
>>>>> on Wed, 25 Aug 2004 16:53:39 -0500 writes:

Timothy> I noticed in the extension manual that the
Timothy> \author{} entry should refer to the author of the
Timothy> Rd file and not the code documented. I had always
Timothy> interpreted it as the author of the code, not the
Timothy> documentation. I wonder if others also find this
Timothy> ambiguous.

I tend to agree with you.  Very often the author means both the
author of the R object and the help page.
In the few other cases, for me, I was the help page author
(rather than the other way around) and I think I usually have
done what you suggest:  Showed the author of the code and
sometimes also mentioned myself (as docu-author), but typically
only if I had also improved on the code.

Timothy> Its generally not an issue, except when there is a
Timothy> third party writing documentation. It looks like
Timothy> they wrote all the code. Would it make sense to
Timothy> have two entries, one for the documentation author
Timothy> and one for the code author if different?

I think in such a case \author{..} should contain both the code
and documentation authors.
In a package with many help pages, a possibility is also to
 specify  \author{..} and \references{} in only a few help
pages and for the others, inside the   \seealso{...}   section
have a sentence pointing to the main help page(s), such as
\seealso{ 
  ..
  For references etc, \code{\link{}}.
}


Regards,
Martin Maechler



Re: [Rd] No is.formula()

2004-08-26 Thread Martin Maechler
> "tony" == A J Rossini <[EMAIL PROTECTED]>
> on Wed, 25 Aug 2004 14:33:23 -0700 writes:

tony> "Warnes, Gregory R"
tony> <[EMAIL PROTECTED]> writes:
>> There appears to be no "is.formula()" function in
>> R-1.9.1.  May I suggest that
>> 
>> is.formula <- function(x) inherits(x, "formula")
>> 
>> be added to base, since formula is a fundamental R type?

tony> why not just

tony> is(x,"formula")
tony> ?

because the latter needs the methods package and base functions
must work independently of "methods".

The question is what  "fundamental R type" would be exactly.
But I tend to agree with Greg, since formulae are constructed
via the .Primitive '~' operator.
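A minimal sketch of the proposed helper and the two cases it distinguishes:

```r
## The suggested one-liner, avoiding any dependency on "methods"
is.formula <- function(x) inherits(x, "formula")
is.formula(y ~ x)    # TRUE:  '~' creates an object of class "formula"
is.formula("y ~ x")  # FALSE: a character string, not a formula
```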

Apropos, I believe we should move the  is.primitive function
from "methods" to "base".

Martin



Re: [Rd] Possible Latex problem in R-devel?

2004-08-23 Thread Martin Maechler
> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]>
> on Mon, 23 Aug 2004 21:14:44 +0100 (BST) writes:

BDR> On Mon, 23 Aug 2004, Jeff Gentry wrote:
>> > What version of perl?
>> 
>> Ack, didn't realize it was this ancient.  Version 5.005_03, which is what
>> comes with FreeBSD 4.9 apparently.  I did install the /usr/ports version
>> of perl is 5.8.2, although there seems to be other problems here (which
>> are most likely related to my system, will track that down before bringing
>> that issue up - it appears to be a mismatch on my libraries between the
>> two versions).

BDR> Yes, that version of perl has a lot of bugs, but in theory we support it.
BDR> (It seems worse than either 5.004 or 5.005_04.)

>> > print $latexout &latex_link_trans0($blocks{"name"});
>> > will probably solve it for you.
>> 
>> Yup, this works, thanks.

BDR> I've changed the sources, to be defensive.  I don't like reading Perl
BDR> like that, but it does work more portably.

I'm glad for the change.
Our redhat enterprise version of perl (5.8.0) also couldn't deal
with the other syntax.

Martin



Re: [Rd] legend's arguments "angle", "density" & "lwd" (PR#7023)

2004-08-23 Thread Martin Maechler
>>>>> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
>>>>> on Fri, 20 Aug 2004 19:44:40 +0200 writes:

UweL> Martin Maechler wrote:
>>>>>>> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
>>>>>>> on Fri, 20 Aug 2004 17:01:13 +0200 writes:
>> 
UweL> Paul Murrell wrote [on 2002-03-14 with Subject: "filled bars with 
UweL> patterns" in reply to Arne Mueller]
>> 
>> >> Hi
>> >> 
>> >> 
>> >> 
>> >>> I'd also like to have the filled boxes in the legend to be striped. The
>> >>> legend function has a 'density' attribute, but unfortunately this does't
>> >>> seem to do anything
>> >>> 
>> >>> following the above example
>> >>> 
>> >>> legend(3.4, 5, c('Axx','Bxx','Cxx','Dxx'), fill = c('red', 'blue',
>> >>> 'green', 'orange'))
>> >>> 
>> >>> is the same as
>> >>> 
>> >>> legend(3.4, 5, c('Axx','Bxx','Cxx','Dxx'), density=10,
>> >>>fill = c('red', 'blue', 'green', 'orange'),
>> >>>density=c(10,-1,20, 200))


>> >> This appears to be a bug.  Can you file a bug report for this please?

UweL> [SNIP; I cannot find any related bug report in the repository]

UweL> I'm just reviewing bug reports and other stuff re. legend() and found 
UweL> this old message in one of my Next-Week-To-Do-folders.
>> 
UweL> Well, the point mentioned above is not really a bug,
UweL> because one has to specify BOTH arguments, angle AND
UweL> density in legend(). Is there any point not to make
UweL> angle = 45 the default, as it already is for polygon()
UweL> and rect()?

MM> This seems like a good idea,
MM> but we'll wait for your many other patches to legend.R and
MM> legend.Rd   :-)

UweL> Just three rather than many issues I'm trying to address, the third one 
UweL> is just closing a bug report. ;-)
UweL> Here the two suggested patches in merged form.

UweL> Uwe

<... snipping Uwe's patches  .>

This has now led to more:
I've just added to NEWS (and the C and R sources of course)

o   plot.xy(), the workhorse function of points, lines and plot.default
now has 'lwd' as explicit argument instead of implicitly in "...",
and now recycles lwd where it makes sense, i.e. for line based plot
symbols.

such that Uwe's proposed new argument to legend(), pt.lwd is
also recycled and now can default to 'lwd', the line width of
the line segments in legend().
Hence, Leslie's original feature request (PR#7023) of June 25
is now also fulfilled using only 'lwd' and not both 'lwd' and 'pt.lwd'.
I.e., the following now works

x <- 1:10 ; y <- rnorm(10,10,5) ; y2 <- rnorm(10,8,4)
plot(x, y, bty="l", lwd=3, type="o", col=2, ylim = range(y,y2), cex=1.5)
points(x, y2, lwd=3, lty=8, col=4, type="o", pch=2, cex=1.5)
legend(10, max(y, y2), legend=c("Method 1","Method 2"),  
   col=c(2, 4), lty=c(1, 8), pch=c(1, 2), 
   xjust=1, yjust=1, pt.cex=1.5, lwd=3)

[note that I've used 'ylim = range(y,y2)' which is slightly better than
 'ylim = c(min(y,y2),max(y,y2))']

With thanks to Uwe!
Martin



Re: [Rd] legend's arguments "angle" and "density"

2004-08-20 Thread Martin Maechler
> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
> on Fri, 20 Aug 2004 17:01:13 +0200 writes:

UweL> Paul Murrell wrote [on 2002-03-14 with Subject: "filled bars with 
UweL> patterns" in reply to Arne Mueller]

>> Hi
>> 
>> 
>> 
>>> I'd also like to have the filled boxes in the legend to be striped. The
>>> legend function has a 'density' attribute, but unfortunately this does't
>>> seem to do anything
>>> 
>>> following the above example
>>> 
>>> legend(3.4, 5, c('Axx','Bxx','Cxx','Dxx'), fill = c('red', 'blue',
>>> 'green', 'orange'))
>>> 
>>> is the same as
>>> 
>>> legend(3.4, 5, c('Axx','Bxx','Cxx','Dxx'), density=10, fill = c('red',
>>> 'blue', 'green', 'orange'),
>>> density=c(10,-1,20, 200))
>> 
>> 
>> 
>> This appears to be a bug.  Can you file a bug report for this please?

UweL> [SNIP; I cannot find any related bug report in the repository]


UweL> I'm just reviewing bug reports and other stuff re. legend() and found 
UweL> this old message in one of my Next-Week-To-Do-folders.

UweL> Well, the point mentioned above is not really a bug, because one has to 
UweL> specify BOTH arguments, angle AND density in legend(). Is there any 
UweL> point not to make angle = 45 the default, as it already is for polygon() 
UweL> and rect()?

This seems like a good idea,
but we'll wait for your many other patches to legend.R and
legend.Rd   :-)

Martin



Re: [Rd] Unbalanced parentheses printed by warnings() crash text editor

2004-08-20 Thread Martin Maechler
>>>>> "PD" == Peter Dalgaard <[EMAIL PROTECTED]>
>>>>> on 20 Aug 2004 12:01:39 +0200 writes:

PD> Duncan Murdoch <[EMAIL PROTECTED]> writes:
>> >I could have sent this to the ESS or Xemacs devel list, but ESS & Xemacs'
>> >attempt to find balanced parentheses across many lines seems sensible,
>> >and is needed with very long functions.
>> 
>> Yes, it's sensible to try, but it is a bug that they don't fail
>> gracefully.

PD> (Actually, it is not sensible; ESS should try harder to figure out
PD> what is actually R code. Much as I love ESS, it is a persistent fly in
PD> the ointment when the opening blurb comes up with "for" and "in" in
PD> magenta.)

I'm chiming in, since I have been addressed explicitly here (for
whatever reason):

Yes, yes, and yes to Duncan's and Peter's notes:

- This should have gone to the ESS-help mailing list
- it's not a bug in R but a bug in ESS/XEmacs (actually a bug in XEmacs combined
  with a missing feature in ESS).

Martin Maechler

For the sake of ESS-help, here's the original message as well:

>>>>> "Mayeul" == Mayeul KAUFFMANN <[EMAIL PROTECTED]>
>>>>> on Thu, 19 Aug 2004 23:32:51 +0200 writes:

Mayeul> ... Hope this is the right place for it
Mayeul> (I discuss the question of the right place below).

Mayeul> Most of the time, warnings are more than 1000

[?? you probably mean something like '100', not '1000', right?]

Mayeul> characters long and thus are truncated.  Most of the
Mayeul> time, this generates printouts with unbalanced parentheses.

Mayeul> Intelligent text editors which do parentheses
Mayeul> highlighting get very confused with this.  After too
Mayeul> many warnings, they give errors, and may even crash.

crashing *must* be a bug of the editor setup (ESS - XEmacs -
Windows), not of R.

Mayeul> Specifically, I use ESS and XEmacs for Windows Users
Mayeul> of R (by John Fox) which is advised to do at
Mayeul> http://ess.r-project.org/ with a buffer for text
Mayeul> editing and an inferior ESS (R) buffer.  (I
Mayeul> downloaded the latest Xemacs and ESS a month ago).

Mayeul> After too many warnings (with unbalanced
Mayeul> parentheses), XEmacs switches to an ESS-error buffer
Mayeul> which says "error Nesting too deep for parser".  In
Mayeul> some cases, when back in the R buffer, typing any letter
Mayeul> switches back to the ESS-error buffer.  In other
Mayeul> cases, it simply takes ages (until you kill XEmacs)
Mayeul> or it crashes.  In most cases, the R process is lost.

Mayeul> I could have sent this to the ESS or Xemacs devel
Mayeul> list, but ESS & Xemacs' attempt to find balanced
Mayeul> parentheses across many lines seems sensible, and
Mayeul> is needed with very long functions.

Mayeul> A workaround would be to change the function that print warnings.

Mayeul> Instead of, for instance,
Mayeul> "error message xx in: function.yy(z,zzz,   ..."

Mayeul> It may print
Mayeul> "error message xx in: function.yy(z,zzz,   ...)"

Mayeul> The function should truncate the error message, find
Mayeul> how many parentheses and brackets are open in the
Mayeul> remaining part, subtract the number of closing
Mayeul> parentheses and brackets, and add that many
Mayeul> parentheses at the end.  (XEmacs' parentheses
Mayeul> highlighter regards "(" and "[" as equivalent.)
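
Mayeul's proposed workaround can be sketched in a few lines of R (a
hypothetical helper, not R's actual warning-printing code): truncate the
message, count the unclosed "(" and "[" in the kept part, and append that
many closing parentheses.

```r
## Hypothetical sketch of the proposed workaround (not R internals):
balanceTruncated <- function(msg, width = 100) {
  s  <- substr(msg, 1, width)
  ch <- strsplit(s, "")[[1]]
  nOpen  <- sum(ch %in% c("(", "["))   # "(" and "[" treated alike,
  nClose <- sum(ch %in% c(")", "]"))   # as XEmacs' highlighter does
  paste0(s, paste(rep(")", max(0, nOpen - nClose)), collapse = ""))
}
balanceTruncated("error message xx in: function.yy(z, zzz, g(w")
## appends the two missing ")" at the end
```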


Mayeul> Mayeul KAUFFMANN
Mayeul> Univ. Pierre Mendes France
Mayeul> Grenoble - France

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Springer: New Series Announcement "UseR!"

2004-08-20 Thread Martin Maechler
   [on behalf of John Kimmel and Kurt Hornik:]

   [PDF version for nice printing attached at the end]

NEW SERIES ANNOUNCEMENT and 
REQUEST FOR BOOK PROPOSALS 

Springer announces a series of books called 
 UseR! 

edited by Robert Gentleman, Kurt Hornik, and Giovanni Parmigiani. 

This series of inexpensive and focused books on R will consist of
shorter books aimed at practitioners. Books can discuss the use
of R in a particular subject area (e.g., epidemiology,
econometrics, psychometrics) or as it relates to statistical
topics (e.g., missing data, longitudinal data). In most cases,
books are to be written as combinations of LaTeX and R so that
all the code for figures and tables can be put on a
website. Authors should assume a background as supplied by
«Dalgaard's Introductory Statistics with R» so that each book does
not repeat basic material. Springer will supply a LaTeX style
file, all books will be reviewed and copyedited, and faster
production schedules will be used. 

To propose a book, please contact 

John Kimmel 
Executive Editor, 
Statistics Springer 
24494 Alta Vista Dr. 
Dana Point, CA 92629

[EMAIL PROTECTED] 
Telephone: 949-487-1216 Fax: 949-240-4321



UseR-Kimmel.pdf
Description: Adobe PDF document
__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] interaction.plot

2004-08-11 Thread Martin Maechler
> "ChrBu" == Christoph Buser <[EMAIL PROTECTED]>
> on Fri, 6 Aug 2004 10:24:40 +0200 writes:

ChrBu> Dear R core team, I have a proposal to improve the
ChrBu> function interaction.plot. It should be allowed to
ChrBu> use type = "b". 

thank you for the suggestion.
I've implemented the above for R-devel several days ago.

ChrBu> This can be done by changing the function's header from

ChrBu> function( , type = c("l", "p"), )

ChrBu> to

ChrBu> function( , type = c("l", "p", "b"), )

ChrBu> Then it works. 

well, as I mentioned to you privately, it also needs a change
in the legend() call subsequently.

ChrBu> This type = "b" is useful, if the second level of the
ChrBu> x.factor is missing for some level of the
ChrBu> trace.factor. With type = "l" you lose the first level of
ChrBu> the x.factor, too (because you can't draw the line to
ChrBu> the missing second level). With type = "p" you see
ChrBu> this first level, but you have no lines at all (just
ChrBu> chaos with points). With type = "b", you get all
ChrBu> existing levels plus the lines between two contiguous
ChrBu> levels (if they both exist).
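
Christoph's scenario can be sketched as follows (hypothetical data, with an
NA standing in for the missing cell; type = "b" requires the change
described above):

```r
## Hypothetical data: the cell (x = 2, g = "b") is missing, so with
## type = "l" no line can be drawn through it for group "b";
## type = "b" at least marks every existing cell with a point.
d <- expand.grid(x = factor(1:3), g = factor(c("a", "b")))
d$y <- c(1, 2, 3, 2, NA, 4)
with(d, interaction.plot(x, g, y, type = "b"))
```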



ChrBu> There is a second point. Using interaction.plot with
ChrBu> the additional argument main creates a warning:

ChrBu> parameter "main" couldn't be set in high-level plot() function

ChrBu> The reason is that "..." is used twice inside of
ChrBu> interaction.plot, in fact in

ChrBu> matplot( ,...)

ChrBu> and in

ChrBu> axis( ,...)

ChrBu> axis can do nothing with this argument main and
ChrBu> creates this warning. You could replace ,... in the
ChrBu> axis function by inserting all reasonable arguments
ChrBu> of axis in the functions header (interaction.plot)
ChrBu> and give over those arguments to axis. Then you
ChrBu> shouldn't get this warning anymore.

yes, indeed.

Note however that this warning also happens with other such plotting
functions, and I don't consider it a real blemish.
Your proposed solution is not so obvious or easy, since 
axis() really has its own ``intrinsic'' ... argument and
conceptually does accept many more possible "graphical
parameters" in addition to its specific ones.

Hence I believe it would need quite a large extent of extra code
in order to
1) keep the current potential functionality
2) always properly separate arguments to be passed to matplot()
   from those to be passed to axis().
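
A sketch of why point 2) is hard: one could partition '...' by argument
name (hypothetical code, not what interaction.plot actually does), but
axis() also accepts arbitrary graphical parameters, so no fixed name list
is ever complete:

```r
## Hypothetical partitioning of '...' between axis() and matplot():
dots <- list(main = "Title", hadj = 0.5, lwd = 2)
axisOnly <- c("tick", "line", "pos", "outer", "hadj", "padj")
forAxis    <- dots[names(dots) %in% axisOnly]
forMatplot <- dots[!names(dots) %in% axisOnly]
## ...but graphical parameters such as 'lwd', 'col', or 'cex.axis' are
## legal for BOTH calls, so any such split drops or duplicates arguments.
```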

-- and as ``we all know'' we should really use lattice package
   functions rather than interaction.plot() 
   {but then I'm still not the role model here ... ;-( }

Martin

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Subsetting time series

2004-08-10 Thread Martin Maechler
>>>>> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]>
>>>>> on Tue, 10 Aug 2004 09:11:39 +0100 (BST) writes:

BDR> On Tue, 10 Aug 2004, Martin Maechler wrote:
>> >>>>> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]>
>> >>>>> on Tue, 10 Aug 2004 05:47:28 +0100 (BST) writes:
>> 
BDR> On Tue, 10 Aug 2004, Ross Ihaka wrote:
>> >> Rob Hyndman wrote:
>> >> > When part of a time series is extracted, the time series component is 
>> >> > lost. e.g.,
>> >> > x <- ts(1:10)
>> >> > x[1:4]
>> >> > 
>> >> > It would be nice if there was a subsetting function [.ts to avoid this 
>> >> > problem. However, it is beyond my R-coding ability to produce such a 
>> >> > thing.  Is someone willing to do it?
>> 
BDR> There is a [.ts, in src/library/stats/R/ts.R, and it is documented 
BDR> (?"[.ts").
>> 
>> >> Have you had a look at "window"?  The problem with "["
>> >> its that it can produce non-contiguous sets of values.
>> 
BDR> Yes.
>> 
>> indeed.  window() is what we have been advocating for a long
>> time now ... (but see below).
>> 
BDR> If you look in the sources for [.ts you will see,
BDR> commented, the code that was once there to handle cases
BDR> where the index was evenly spaced.  But it was removed
BDR> long ago in favour of window().  I tried to consult the
BDR> logs, but it seems that in the shift from CVS to SVN
BDR> recently I can't get at them.  I think the rationale
BDR> was that x[ind] should always produce an object of the
BDR> same class.
>> 
>> well, that can't have been the only rationale since now
>> x[ind] is *not* of the same class - when the "ts" property is
>> lost in any case.

BDR> `always of the same class' : for all (non-trivial)  values of ind.

aah!  please excuse my misinterpreting it as meaning "same class
as the original x".

Martin

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Subsetting time series

2004-08-10 Thread Martin Maechler
> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]>
> on Tue, 10 Aug 2004 05:47:28 +0100 (BST) writes:

BDR> On Tue, 10 Aug 2004, Ross Ihaka wrote:
>> Rob Hyndman wrote:
>> > When part of a time series is extracted, the time series component is 
>> > lost. e.g.,
>> > x <- ts(1:10)
>> > x[1:4]
>> > 
>> > It would be nice if there was a subsetting function [.ts to avoid this 
>> > problem. However, it is beyond my R-coding ability to produce such a 
>> > thing.  Is someone willing to do it?

BDR> There is a [.ts, in src/library/stats/R/ts.R, and it is documented 
BDR> (?"[.ts").

>> Have you had a look at "window"?  The problem with "["
>> its that it can produce non-contiguous sets of values.

BDR> Yes.

indeed.  window() is what we have been advocating for a long
time now ... (but see below).

BDR>   If you look in the sources for [.ts you will see,
BDR> commented, the code that was once there to handle cases
BDR> where the index was evenly spaced.  But it was removed
BDR> long ago in favour of window().  I tried to consult the
BDR> logs, but it seems that in the shift from CVS to SVN
BDR> recently I can't get at them.  I think the rationale
BDR> was that x[ind] should always produce an object of the
BDR> same class.

well, that can't have been the only rationale since now
x[ind] is *not* of the same class - when the "ts" property is
lost in any case.

I don't much like the current behavior of "[.ts" either.
It should rather return a "ts" object in the equidistant case
and at least give a warning in the non-equidistant case.
OTOH, intuitively, when 'ind' has length 1, x[ind] should just
give a number... [grumble..]
But maybe it's only a very small performance hit if that
result continues to carry the "ts" class attribute along.

If we think of the data.frame analogue, we might consider
defining "[[.ts" for extracting numbers and "[.ts" to always
return a time series or an error.
But that is probably too incompatible with current behavior.
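
For comparison, a small sketch of the extraction styles under discussion
(output shapes as in R 1.9.x):

```r
x <- ts(1:10, start = 2000)
x[1:4]                               # plain vector: "ts" attributes dropped
window(x, start = 2000, end = 2003)  # keeps the time-series structure
x[[3]]                               # already yields a single number, as the
                                     # proposed "[[.ts" convention would
```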

Martin

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] URL change for Statistik ETH and web interface to mailing lists.

2004-08-09 Thread Martin Maechler
   [exceptional cross-posting on purpose]

Dear users/readers of our mailing lists,

The following ONLY affects the web interface to our mailing lists. 
This is both for subscription changes and the mailing list archives.

For a long time, our "Statistics ETHZ" home page has been
available both as
http://www.stat.math.ethz.ch/   [old default]
and http://stat.ethz.ch/[new default]

For several reasons however, the longer URL above has been the
``default'', or the one to which stat.ethz.ch was automagically converted.
This now has finally been switched two hours ago, such that
http://stat.ethz.ch/  is now the official URL in all respects and
http://www.stat.math.ethz.ch/ is just an alias to the new URL
(and it seems to behave strangely just now for me, but that
 should be very temporary).

For existing mailing lists and their web interface, the change
has to happen ``inside mailman'' (by calling the correct
python script and changing the list's URL explicitly)
once for each list.

I wanted to first announce this widely and do the change in
about 24 hours or so.
You may have to accept the SSL certificate again {and it's not
from a so-called "trusted agency", since that would cost us lots
of money we'd rather spend differently}.

Regards,

Martin Maechler <[EMAIL PROTECTED]> http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH Zurich, Switzerland

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Re: [R] Problem in method's Makefile?

2004-08-06 Thread Martin Maechler
> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]>
> on Thu, 5 Aug 2004 20:50:29 +0100 (BST) writes:

BDR> ..  
BDR> ..  

BDR> However, I am working right now on streamlining this 
BDR> now we don't allow lazy-loading to be optional.

"don't allow" :
That is for the "core packages" only, right?

Martin

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] 3.5 on Tiger

2004-08-02 Thread Martin Maechler
>>>>> "Jan" == Jan de Leeuw <[EMAIL PROTECTED]>
>>>>> on Sun, 1 Aug 2004 10:25:24 -0700 writes:

Jan> setenv CC gcc-3.5
Jan> setenv F77 gfortran
Jan> setenv CXX g++-3.5

Jan> What I had to do.

(for what version of R exactly)

Jan> 1. Build a local zlib-1.2.1

Jan> 2. Compile ppr.f in stats with g77-3.4 and not with gfortran-3.5

(why -- a bug in gfortran-3.5 ? )

Jan> 3. Change line 1581 in plot.c to

Jan> double *aa = REAL(cex);
Jan> if (R_FINITE(aa[i % ncex])

line 1582 {R-devel} is

if (R_FINITE(thiscex = REAL(cex)[i % ncex])

would this work?

Jan> i.e. get the assignment out of the macro

Jan> 4. Disable the DO_STR_DIM macro in plot.c (and thus the
Jan> functions do_strheight and do_strwidth). This could be fixed by
Jan> expanding the macros and pasting into the source.

??? why that??

I wrote those quite carefully IIRC.


Regards,
Martin Maechler

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Problems with Lapack's eigen() on 64-bit

2004-07-28 Thread Martin Maechler
>>>>> "BDR" == Prof Brian Ripley <[EMAIL PROTECTED]>
>>>>> on Tue, 27 Jul 2004 13:17:48 +0100 (BST) writes:

BDR> Our `public' Opteron (dual 248, FC2 + Goto BLAS) is
BDR> right, as is Solaris 64-bit. FC2 is using gcc version
BDR> 3.3.3 20040412 (Red Hat Linux 3.3.3-7), quite a lot
BDR> later.

BDR> I would try updating your compiler (to 3.4.1?) and
BDR> perhaps try a different BLAS.

I've now hand-compiled gcc 3.4.1 and recompiled R-patched and
R-devel both configured "--without-blas" and the eigen()
problem is gone.

For the record, note that before the problem appeared both for
"--with-blas=goto" and "--without-blas".

I've also added a regression test to R-devel which makes sure
that "make check" only passes when the mentioned eigen()
computation is correct.

Martin

BDR> We do have RHEL for Opteron, but AFAIK it is not on a
BDR> system at present.  (There are others running SuSE
BDR> 9.0/.1, I believe.)

BDR> On Tue, 27 Jul 2004, Martin Maechler wrote:

>> I'm only now realizing that we have severe problems with
>> R on our AMD 'Opteron' and 'Athlon64' clients running
>> Redhat Enterprise with all 64-bit libraries (AFAICS).
>> 
>> The Lapack problem happens for R-patched and R-devel both
>> on the Opteron and the Athlon64.
>> 
>> Here are platform details:
>> 
>> o "gcc -v" and "g77 -v" both end with the line gcc
>> version 3.2.3 20030502 (Red Hat Linux 3.2.3-34)
>> 
>> o I've used ./configure --without-blas
>> 
>> 1) Opteron ("deb7", a dual-processor compute server):
>> 
>> - uname -a : Linux deb7 2.4.21-9.0.3.ELsmp #1 SMP Tue Apr
>> 20 19:44:29 EDT 2004 x86_64 x86_64 x86_64 GNU/Linux -
>> /proc/cpuinfo contains (among much more) vendor_id :
>> AuthenticAMD cpu family : 15 model : 5 model name : AMD
>> Opteron(tm) Processor 248
>> 
>> 2) Athlon64 (a simple "new generation" client - to become
>> my desktop soon):
>> 
>> - uname -a : Linux setup-12 2.4.21-15.0.2.EL #1 Wed Jun
>> 16 22:41:44 EDT 2004 x86_64 x86_64 x86_64 GNU/Linux
>> 
>> - /proc/cpuinfo contains vendor_id : AuthenticAMD cpu
>> family : 15 model : 14 model name : AMD Athlon(tm) 64
>> Processor 2800+
>> 
>> 
>> 
>> Now the Lapack problem, easily seen from the base eigen()
>> example:
>> 
>> > eigen(cbind(1, 3:1, 1:3)) $values [1] 5.7015621
>> 1.000 -0.7015621
>> 
>> $vectors [,1] [,2] [,3] [1,] 0.4877939 -0.7181217
>> -0.9576161 [2,] 0.6172751 -0.3893848 0.2036804 [3,]
>> 0.6172751 0.5767849 0.2036804
>> 
>> which is simply plainly wrong and eigen(cbind(1, 3:1,
>> 1:3), EISPACK=TRUE) gives the correct eigen values c(5,
>> 1, 0) and corresponding eigenvectors.
>> 
>> IIRC, we've already dealt with a Lapack problem, and that
>> workaround (built into R-devel's Makefiles) has been to
>> use -ffloat-store for the compilation of
>> src/modules/lapack/dlamc.f
>> 
>> --
>> 
>> Thank you for hints / suggestions.
>> 
>> Others with 64-bit platforms might also try to see what
>> eigen(cbind(1, 3:1, 1:3)) gives there.

BDR> -- Brian D. Ripley, [EMAIL PROTECTED] Professor of
BDR> Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
BDR> University of Oxford, Tel: +44 1865 272861 (self) 1
BDR> South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG,
BDR> UK Fax: +44 1865 272595

BDR> __
BDR> [EMAIL PROTECTED] mailing list
BDR> https://www.stat.math.ethz.ch/mailman/listinfo/r-devel

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


[Rd] Problems with Lapack's eigen() on 64-bit

2004-07-27 Thread Martin Maechler
I'm only now realizing that we have severe problems with R on our
AMD 'Opteron' and 'Athlon64' clients running Redhat Enterprise
with all 64-bit libraries (AFAICS).

The Lapack problem happens for R-patched and R-devel both on
the Opteron and the Athlon64.

Here are platform details:

o  "gcc -v" and "g77 -v" both end with the line
   gcc version 3.2.3 20030502 (Red Hat Linux 3.2.3-34)

o  I've used ./configure --without-blas

1) Opteron ("deb7", a dual-processor compute server):

  - uname -a :
Linux deb7 2.4.21-9.0.3.ELsmp #1 SMP Tue Apr 20 19:44:29 EDT 2004 x86_64 x86_64 
x86_64 GNU/Linux
  - /proc/cpuinfo  contains (among much more)
vendor_id   : AuthenticAMD
cpu family  : 15
model   : 5
model name  : AMD Opteron(tm) Processor 248

2) Athlon64 (a simple "new generation" client - to become my desktop soon):
  
  - uname -a :
Linux setup-12 2.4.21-15.0.2.EL #1 Wed Jun 16 22:41:44 EDT
2004 x86_64 x86_64 x86_64 GNU/Linux 

  - /proc/cpuinfo  contains
vendor_id   : AuthenticAMD
cpu family  : 15
model   : 14
model name  : AMD Athlon(tm) 64 Processor 2800+



Now the Lapack problem, easily seen from the base eigen()
example:

> eigen(cbind(1, 3:1, 1:3))
$values
[1]  5.7015621  1.000 -0.7015621

$vectors
  [,1]   [,2]   [,3]
[1,] 0.4877939 -0.7181217 -0.9576161
[2,] 0.6172751 -0.3893848  0.2036804
[3,] 0.6172751  0.5767849  0.2036804

which is simply plainly wrong and  
eigen(cbind(1, 3:1, 1:3), EISPACK=TRUE)
gives the correct eigen values c(5, 1, 0)  and corresponding
eigenvectors.

IIRC, we've already dealt with a Lapack problem, and that
workaround (built into R-devel's Makefiles) has been to use
-ffloat-store for the compilation of src/modules/lapack/dlamc.f

--

Thank you for hints / suggestions.

Others with 64-bit platforms might also try to see what
   eigen(cbind(1, 3:1, 1:3)) 
gives there.
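
A "make check"-style assertion along these lines (a sketch only; the actual
regression test added to R-devel may differ):

```r
## Sketch of a regression check: eigen() on this matrix must give
## eigenvalues 5, 1, 0 (up to numerical tolerance and ordering).
e <- eigen(cbind(1, 3:1, 1:3))
stopifnot(all.equal(sort(Re(e$values)), c(0, 1, 5)))
```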

Martin Maechler

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RE: [Rcom-l] rcom 0.97 available for download

2004-07-21 Thread Martin Maechler
> "Dirk" == Dirk Eddelbuettel <[EMAIL PROTECTED]>
> on Wed, 21 Jul 2004 09:49:24 -0500 writes:

Dirk> On Wed, Jul 21, 2004 at 04:40:46PM +0200, Uwe Ligges wrote:
>> Dirk Eddelbuettel wrote:
>> 
>> >On Wed, Jul 21, 2004 at 09:39:29AM -0400, Liaw, Andy wrote:
>> >
>> >>Would it make sense to have platform specific packages in a separate area 
>> >>on
>> >>CRAN?  I don't know of anything other than Windows that understand COM.
>> 
>> The question is: What exactly is platform-specific?

Dirk> Platform-specific means what it says -- works only on
Dirk> a given platform (or maybe a few).  Or are trying to
Dirk> pull a Clinton here: "it depends what your meaning of
Dirk> platform is" ? :)

>> >Yup, and 'core' CRAN contains at least one Windows-only
>> >package: rbugs [ as I found when working on a script to
>> >automagically build Debian packages from CPAN packages,
>> >the script is a modified version of Albrecht's script ]
>> 
>> The author told me that rbugs is intended to work with WinBUGS under 
>> wine on a linux system  (whereas I'm pretty sure R2WinBUGS is capable - 

Dirk> Think about it, what does 'run under wine' mean?  Do
Dirk> you get it: it ain't no native package when it needs
Dirk> an emulator.

Dirk> Saying it runs under Linux using wine is like claiming
Dirk> your car just turned into a boat. While it will float
Dirk> once driven into the river, I presume it won't float
Dirk> for very long ...

>> Where's the point not to have just this one source repository related to 
>> platform dependency?

Dirk> Precisely. Let's have one source repo but _let us
Dirk> label any and all binary restrictions_ more clearly so
Dirk> that I for one can skip over stuff that may build for
Dirk> you [ Windoze ] but won't for me [ Linux, preferable
Dirk> on all all ten to twelve hardware platforms supported
Dirk> by Debian for packages that get uploaded ].

Dirk> Does that make sense? Would it improve over what we currently have?

Yes (2x).

I'd prefer to keep those packages in one place and rather mark
them than to move them to specific directories.

E.g., Dirk's proposal will also work when a package runs only
on IBM AIX and on Windows.

Martin

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


[Rd] all.equal(, ) not considering names [was "vector documentation error?"]

2004-07-21 Thread Martin Maechler
> "Spencer" == Spencer Graves <[EMAIL PROTECTED]>
> on Wed, 21 Jul 2004 05:47:01 -0700 writes:

Spencer> The help file for "vector" in R 1.9.1 for Windows includes the 
Spencer> following: 

Spencer> x <- c(a = 1, b = 2)
Spencer> is.vector(x)
Spencer> as.vector(x)
Spencer> all.equal(x, as.vector(x)) ## FALSE

Spencer> I ran this just now and got TRUE. 

yes, I get this as well {R-patched on Linux}.

I'm sure that it never returned FALSE, since all.equal()
never does.  However, it *did* give a non-TRUE result
in R versions up to 1.6.2:

  > x <- c(a=1,b=2); all.equal(x, as.vector(x))
  [1] "names for target but not for current" "TRUE" 

and it does give something similar in our S-PLUS 6.1 version.

Our documentation nowhere specifies exactly what should happen
in this case, but I do doubt that the current behavior
is correct.
What do other (long time S language) users think?
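
For reference, a transcript-style sketch of the behaviors discussed
(results as in R 1.9.x; older R versions, and S-PLUS, reported the name
mismatch instead):

```r
x <- c(a = 1, b = 2)
all.equal(x, as.vector(x))  # TRUE in 1.9.x -- the behavior questioned here
identical(x, as.vector(x))  # FALSE: identical() does consider names
```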


Spencer> Should I bother to report such things?  

yes, please, in any case!

Spencer> If yes, to whom? 

As long as you have a "?" going with it, it's not something you
should report as a bug.
In that case, you decide between R-help or R-devel
and the posting guide has a paragraph on this.
I think you decided very well for the current topic.

<>


Spencer> p.s.  Please excuse if I'm sending this to the
Spencer> wrong address.  I went to www.r-project.org ->
Spencer> Mailing Lists and double clicked on an apparent hot
Spencer> link to "r-bugs" and got nothing <>

Well, R-bugs is *not* a mailing list.  You'll find its address
in other places such as the R-FAQ or
help(bug.report).

Spencer> Therefore, I decided to send this to r-devel.

The "therefore" wasn't quite correct IMO, but your conclusion
anyway ;-)

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as.matrix.data.frame() warning for POSIXt columns

2004-07-21 Thread Martin Maechler
Thank you David,

the report and the patch look perfectly valid to me
and I will commit a patch shortly {Brian is still traveling
currently}.

Martin

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


[Rd] R daily snapshots - available again

2004-07-20 Thread Martin Maechler
The daily snapshot "diff" (which produces "R-patched" from "R-release"),
 R-release.diff.gz

as hyperlinked on CRAN's main page 

and the ones on http://stat.ethz.ch/CRAN/sources.html

* Gzipped and bzipped tar files are available by anonymous
  FTP from ftp://ftp.stat.math.ethz.ch/Software/R

are now updated again (and current) as of this evening.

---

What's not there yet is the correct date in the
 /date-stamp
file.  You may do this manually or wait another day or two
 before this will be automated as well.

Martin Maechler

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rsync -> cvs down?

2004-07-19 Thread Martin Maechler
>>>>> "Marc" == Marc Schwartz <[EMAIL PROTECTED]>
>>>>> on Mon, 19 Jul 2004 13:22:43 -0500 writes:

Marc> Uwe,
Marc> That did it.  Using https: I am now able to do a checkout.

Marc> It seems to be slow at the moment, but the files are coming through.
  -

that should have improved now.

The default apache2 configuration (for RH Enterprise) had
'KeepAlive Off'  which I now have replaced with 'On'.

For this to take effect, I had to restart the server --
this pretty brutally terminates all running svn requests, which
in this case led to the need for "svnadmin recover"ing the
archive... Well, well, we're getting there eventually.
If anybody has good experiences to share about
Apache performance tweaking, please let me hear.

But note that the idea (of the server setup) really was to serve 
R-core (and maybe ESS-core and maybe some really small few-person
collaboration projects).  If too many people (such as "hundreds
of R-devel readers") are going at the server it will become
pretty unusable, and I will have to make it accessible only
"non-anonymously"  {and find another one doing rsync...} eventually.

Martin Maechler

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rsync -> cvs down?

2004-07-19 Thread Martin Maechler
> "tony" == A J Rossini <[EMAIL PROTECTED]>
> on Mon, 19 Jul 2004 11:29:28 -0700 writes:

tony> Marc Schwartz <[EMAIL PROTECTED]> writes:
>> Uwe,
>> 
>> That did it.  Using https: I am now able to do a checkout.
>> 
>> It seems to be slow at the moment, but the files are coming through.

tony> Seems to run comparably to anoncvs.  Also seems to hiccup and barf,
tony> like anoncvs (infamous server stalls).

Note one difference between Subversion and CVS:

Subversion, being a 21st-century child, rather
optimizes bandwidth at the expense of disk space:

it keeps a pristine copy of each file alongside your modifications,
i.e., you need more than double the disk space,
but you can "diff" files while offline!

Martin

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rsync -> cvs down?

2004-07-19 Thread Martin Maechler
>>>>> "Marc" == Marc Schwartz <[EMAIL PROTECTED]>
>>>>> on Mon, 19 Jul 2004 12:57:12 -0500 writes:

Marc> On Mon, 2004-07-19 at 12:38, Douglas Bates wrote:
>> Marc Schwartz wrote:
>> 
>> > I am not able to access cvs via rsync today. Is the service down?
>> 
>> Yes.  We should have sent email about it to r-devel but it has been a 
>> hectic several days.
>> 
>> The bad news is that the newly installed cvs.r-project.org machine, 
>> which is also rsync.r-project.org, was compromised and we had to take it 
>> off the net.
>> 
>> The good news is that, thanks to heroic efforts by Martin Maechler and 
>> Deepayan Sarkar, the CVS repository has been transformed to Subversion 
>> and is available at http://svn.r-project.org/R/ (and at 
>> https://svn.r-project.org/R/ but SSL is probably only needed by those 
>> doing commits).  If you have a Subversion client (see 
>> http://subversion.tigris.org - those using Windows may also want to look 
>> at http://tortoiseSVN.tigris.org/) you can check out and update the 
>> current r-devel from http://svn.r-project.org/R/trunk/ and the current 
>> R-patched from http://svn.r-project.org/R/branches/R-1-9-patches/
>> 

Marc> Doug,

Marc> Thanks and thanks to Martin and Deepayan!

Marc> subversion is part of FC2 as is the svn client.

Thanks, good to know.  It's also part of Debian "testing" and
newer;  it's *not* part of RH Enterprise though.

Installing it from source, http://subversion.tigris.org/
is not hard.  The important thing for the R-project though is to
use  "configure --with-ssl "
because only then you get SSL support, i.e. only then you can use https://...
which is (currently) absolutely required as I just said in
another message on this thread.


Marc> Presuming that I am using the proper command:

Marc> svn co http://svn.r-project.org/R/branches/R-1-9-patches

Marc> Is the svn server down or is the command incorrect?

Use 'https' instead of 'http'.
This is a requirement for svn.r-project.org (on purpose).

Martin

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rsync -> cvs down?

2004-07-19 Thread Martin Maechler
> "UweL" == Uwe Ligges <[EMAIL PROTECTED]>
> on Mon, 19 Jul 2004 20:01:07 +0200 writes:

UweL> A.J. Rossini wrote:
>> The svn server appears to be down.

UweL> Actually, I'm just checking out a developer release from

UweL> https://svn.r-project.org/R/trunk/

UweL> Note that https is required,
UweL>   ^
UweL> the unsecured http protocol seems not to be working...

On purpose: It's "firewalled out".

I'm sorry: I've never mentioned this explicitly in my e-mails to
Doug and R-core :
Since we (R-core and potentially other people working on
 projects off svn.r-project.org) *will* need
authentication, I just wanted to make sure that no plain text authentication can
happen (and be sniffed and then misused for yet another cracker attack).

Please also note that the SSL certificate for https://svn.r-project.org/

Certificate information:
 - Hostname: svn.r-project.org
 - Valid: from Jul 16 08:10:01 2004 GMT until Jul 14 08:10:01 2014 GMT
 - Issuer: Department of Mathematics, ETH Zurich, Zurich, Switzerland, CH
 - Fingerprint: c9:5d:eb:f9:f2:56:d1:04:ba:44:61:f8:64:6b:d9:33:3f:93:6e:ad

may seem to be fishy to you, but do accept it.
AFAIK, only in certain parts of the world (inside the US only?)
can you get free "trusted certificates".
I've been told (by our departmental webmaster) that for us, a
trusted certificate would cost around
1000.- swiss francs PER YEAR.  In case anyone wants to
investigate:  He mentioned  http://www.verisign.com/products/site/secure/ 

But then, you can accept the certificate permanently and won't
be asked about it anymore.

__
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] filled.contour() ignores mfrow

2004-07-19 Thread Martin Maechler
> "BaRow" == Barry Rowlingson <[EMAIL PROTECTED]>
> on Mon, 19 Jul 2004 12:21:28 +0100 writes:

BaRow> [EMAIL PROTECTED] wrote:
>> Please stop sending nonsensical bug reports! Those have to be handled 
>> manually in the bug repository!

BaRow> Really? They seem to be being handled automatically and frighteningly 
BaRow> well by the Uwe-bot at the moment. Congratulations, you've passed the 
BaRow> Turning Test.
  ^
 [you mean "Turing" -  do you have a not-so-sophisticated 
   auto-speller 'bot handling your e-mail ?]

Well, the problem is that the Uwe-bot only can work on the
R-devel side of it, not --- as the 'bot mentioned correctly --
on the 'R bug repository' side.  There, one real person, from
R-core, must authenticate and move the bug report to the trashcan.

Martin



[Rd] RE: [R] Strange (non-deterministic) problem with strsplit

2004-07-17 Thread Martin Maechler
>>>>> "HenrikB" == Henrik Bengtsson <[EMAIL PROTECTED]>
>>>>> on Sat, 17 Jul 2004 01:59:17 +0200 writes:

HenrikB> [Moving this thread to R-devel instead] I suspect
HenrikB> your "random" results are due to a bug in
HenrikB> gsub(). On my R v1.9.0 (Rterm and Rgui) R crashes
HenrikB> when I do

HenrikB> % R --vanilla
>> gsub(" ", "", "abb + c | a*b", perl=TRUE)

HenrikB> Trying

>> gsub(" ", "", "b c + d | a * b", perl=TRUE)

HenrikB> and I'll get NULL. With

>> gsub("\\s", "", "bc + d | a * b", perl=TRUE)

HenrikB> it works as expected. So there is something buggy
HenrikB> for sure.

HenrikB> This might have been fixed in R v1.9.1 or its
HenrikB> patched version.

probably not.  Here are results from 1.9.1-patched

> gsub(" ",   "", "b c + d | a * b", perl=TRUE)
NULL
> gsub("\\s", "", "b c + d | a * b", perl=TRUE)
NULL
> gsub("\\s", "", "bc + d | a * b", perl=TRUE)
[1] "bc+d|a*b"
> gsub(" ",   "", "bc + d | a * b", perl=TRUE)
[1] "bc+d|a*b"
> 
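For reference, a minimal check of the same calls -- a sketch, assuming a
later R version in which the perl=TRUE bug has been fixed -- would be:

```r
## The whitespace-stripping calls from the transcript above; on a
## fixed R version both variants return the string without spaces.
x  <- "b c + d | a * b"
r1 <- gsub(" ",   "", x, perl = TRUE)
r2 <- gsub("\\s", "", x, perl = TRUE)
stopifnot(identical(r1, "bc+d|a*b"),
          identical(r2, "bc+d|a*b"))
```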

Martin Maechler, ETH Zurich



Re: [Rd] Replying to bug reports

2004-07-17 Thread Martin Maechler
> "Roger" == Roger D Peng <[EMAIL PROTECTED]>
> on Fri, 16 Jul 2004 16:13:17 -0400 writes:

Roger> I have a naive question here, but only because I've
Roger> managed to screw this up twice in the last week.

Roger> What is the correct way to reply to a bug report?

Roger> Should r-bugs be in the To: or Cc: fields?  I
Roger> originally thought that hitting "Reply" and stripping
Roger> of the r-devel email address was sufficient but
Roger> apparently not.

and be careful to keep the string '(PR#)' as part of the
subject *)  That should really suffice.

Can you remind me of an example where it didn't?

Martin

*)
 Brian found it useful to move that string to the
 beginning of the subject line in those cases where that line
 has been relatively long -- since other people's mailers may
 break the line in two - which for the 'R-bugs' mail handler
 makes the 2nd part "not Subject" and leads to the creation of
 another PR#.



Re: [Rd] Re: tail() column numbers

2004-07-12 Thread Martin Maechler
>>>>> "Duncan" == Duncan Murdoch <[EMAIL PROTECTED]>
>>>>> on Mon, 12 Jul 2004 08:05:58 -0400 writes:

Duncan> On Sun, 11 Jul 2004 12:06:44 +0100, Patrick Burns
Duncan> <[EMAIL PROTECTED]> wrote :

>> I disagree with Martin's assertion that "tail" is not
>> useful for programming.  

I really didn't assert that; to the contrary, I said you were
*right* (where I used 'write' -- probably my worst typing lapsus ever),
but never mind

>> useful for programming.  It has a few features relative
>> to the do-it-yourself approach:

Duncan> Me too actually.  I think tail() has two uses,
Duncan> interactive and programmatic.  I think it would be
Duncan> better for interactive use if the row names were
Duncan> added, and only slightly worse for programmatic use
Duncan> if an option were given to turn them off.

yes, so a programmer would use

   tail(obj, barebones=TRUE)
or tail(obj, addnames=FALSE)
or such an option -- where we'd want the interactive use not to
have to specify the option.

Note that this would still be non-backward-compatible
behavior -- which I think, however, is acceptable in this special
case.


Duncan> In interactive use, I find it unhelpful to be told
Duncan> that something is in row 3 when it's really in row
Duncan> 47.

indeed.

Duncan> Duncan Murdoch

>>  *) It compactly makes the intention clear.  *) It
>> automatically handles cases where there may be either a
>> vector or a matrix.  *) It handles cases where there is
>> less data than is being sought (which may or may not be a
>> good thing).
>> 
>> "tail" of functions is what is definitely intended for
>> interactive use.
>> 
>> Pat
>> 
>> Martin Maechler wrote:
>> 
>>>>>>>> "PatBurns" == Patrick Burns
>>>>>>>> <[EMAIL PROTECTED]> on Tue, 27 Jan 2004
>>>>>>>> 14:20:30 + writes:
>>>>>>>> 
>>>>>>>> 
>>>  [more than half a year ago]
>>> 
PatBurns> Duncan Murdoch wrote:
>>>  .
>>> 
DM> One other one I'll look at:
DM> 
DM> If a matrix doesn't have row names, I might add names
DM> like '[nn,]' to it, so I get results like
>>>
R> x <- matrix(1:100,ncol=2)
R> tail(x)
Rout>       [,1] [,2]
Rout> [45,]   45   95
Rout> [46,]   46   96
Rout> [47,]   47   97
Rout> [48,]   48   98
Rout> [49,]   49   99
Rout> [50,]   50  100
DM> instead of the current
>>>
R> tail(x)
Rout>      [,1] [,2]
Rout> [1,]   45   95
Rout> [2,]   46   96
Rout> [3,]   47   97
Rout> [4,]   48   98
Rout> [5,]   49   99
Rout> [6,]   50  100
>>>
DM> I just want to be careful that this doesn't mess up
DM> something else.
DM> 
DM> Duncan Murdoch
>>>
PatBurns> I think this could be being too "helpful".  Using
PatBurns> tail on a matrix may often be done in a program so
PatBurns> I think leaving things as they come is the best
PatBurns> policy.
>>>  I tend to disagree, and would like to have us think
>>> about it again:
>>> 
>>> 1) Duncan's proposal was to only add row names *when*
>>> there are none.  2) Pat is right that tail() for
>>> matrices may be used not only interactively and
>>> help(tail)'s "Value:" section encourages this to some
>>> extent.
>>> 
>>> However, how can adding column names to such a
>>> matrix-tail be harmful?
>>> 
>>> Well, only in the case where the tail is quite large,
>>> the added dimnames add unneeded memory and other
>>> overhead when dealing with that matrix.
>>> 
>>> But I think, programmers/users caring about efficient
>>> code wouldn't use tail() in their function code,
>>> would they?
>>> 
>>> In conclusion, I'd still argue for following Duncan's
>>> proposal, maybe adding a \note{.} to head.Rd stating
>>> that these functions were meant for interactive use, and
>>> for "programming", we'd rather recommend the direct
>>> (n-k+1):n indexing.
>>> 
>>> 
>>> 
>>> 
>>




Re: [Rd] Re: tail() column numbers

2004-07-09 Thread Martin Maechler
>>>>> "MM" == Martin Maechler <[EMAIL PROTECTED]>
>>>>> on Fri, 9 Jul 2004 10:35:42 +0200 writes:

  ...

MM> 1) Duncan's proposal was to only add row names *when* there are none.
MM> 2) Pat is write that tail() for matrices maybe used not only
  ^
 'right'  
of course!
wondering what happened in my brain just that second ... 

Martin



[Rd] Re: tail() column numbers

2004-07-09 Thread Martin Maechler
> "PatBurns" == Patrick Burns <[EMAIL PROTECTED]>
> on Tue, 27 Jan 2004 14:20:30 + writes:

[more than half a year ago]

PatBurns> Duncan Murdoch wrote:

   .

DM> One other one I'll look at:
DM> 
DM> If a matrix doesn't have row names, I might add names
DM> like '[nn,]' to it, so I get results like

  R> x <- matrix(1:100,ncol=2)
  R> tail(x)
   Rout>   [,1] [,2]
   Rout> [45,]   45   95
   Rout> [46,]   46   96
   Rout> [47,]   47   97
   Rout> [48,]   48   98
   Rout> [49,]   49   99
   Rout> [50,]   50  100
   Rout> 
DM> instead of the current

  R> tail(x)
   Rout>  [,1] [,2]
   Rout> [1,]   45   95
   Rout> [2,]   46   96
   Rout> [3,]   47   97
   Rout> [4,]   48   98
   Rout> [5,]   49   99
   Rout> [6,]   50  100

DM> I just want to be careful that this doesn't mess up
DM> something else.
DM> 
DM> Duncan Murdoch


PatBurns> I think this could be being too "helpful".  Using
PatBurns> tail on a matrix may often be done in a program so
PatBurns> I think leaving things as they come is the best
PatBurns> policy.

I tend to disagree, and would like to have us think about it
again:

1)  Duncan's proposal was to only add row names *when* there are none.
2)  Pat is write that tail() for matrices maybe used not only interactively
and help(tail)'s "Value:" section encourages this to some extent.

However, how can adding column names to such a matrix-tail be harmful?

Well, only in the case where the tail is quite large, the
added dimnames add unneeded memory and other overhead when
dealing with that matrix.

 But I think, programmers/users caring about efficient code
 wouldn't use tail() in their function code, would they?

In conclusion, I'd still argue for following Duncan's proposal,
maybe adding a \note{.} to head.Rd stating that these functions
were meant for interactive use, and for "programming", we'd
rather recommend the direct  (n-k+1):n indexing.
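The "(n-k+1):n indexing" recommended above for programmatic use can be
sketched as a small helper (the name `last_rows` is hypothetical, purely
for illustration); it leaves row labels exactly as in the original matrix:

```r
## Sketch: direct indexing of the last k rows, the do-it-yourself
## alternative to tail() for use inside functions.
last_rows <- function(x, k = 6L) {
  n <- nrow(x)
  x[max(1L, n - k + 1L):n, , drop = FALSE]   # max() handles k > n
}

x <- matrix(1:100, ncol = 2)
last_rows(x)        # rows 45..50, no added dimnames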



[Rd] Re: plot.new() warning from coplot()'s par(*, new=FALSE)

2004-06-26 Thread Martin Maechler
{diverted from the R-SIG-gui list}

>>>>> "PD" == Peter Dalgaard <[EMAIL PROTECTED]>
>>>>> on 26 Jun 2004 11:51:02 +0200 writes:

PD> James Wettenhall <[EMAIL PROTECTED]> writes:
>> Hi,
>> 
>> Does anyone know a good way to get rid of warnings like:
>> Warning message: calling par(new=) with no plot
>> 
>> when using an R plot function which calls plot.new()
>> (e.g. coplot) from within tkrplot?
>> 
   .

PD> Hmm, the same wart appears if you just plot to a freshly
PD> opened X11 device (X11(); coplot()), nothing
PD> specific to tkrplot. I think I've seen this reported
PD> before, but I have forgotten what the recommended action
PD> was.

If I look at coplot, I see that its very first graphics call is

  par(mfrow =..., new = FALSE)

and this ('new = FALSE') of course gives the warning when no
graphic device is active.
coplot()'s code  is just not quite right here IMO.

I can get rid of the warning and otherwise keep coplot() behaving
as it does now by replacing

  opar <- par(mfrow = c(total.rows, total.columns),
  oma = oma, mar = mar, xaxs = "r", yaxs = "r", new = FALSE)

by
  if(dev.cur() > 1 && par("new")) # turn off a par(new=TRUE) setting
  par(new = FALSE)
  opar <- par(mfrow = c(total.rows, total.columns),
  oma = oma, mar = mar, xaxs = "r", yaxs = "r")

- - -

and I'd commit this (to R-patched).

OTOH, I wonder if we couldn't just omit the   
  if(...) par(new = FALSE)
clause {for R-devel at least}.
If a user really calls  par(new = TRUE) before calling coplot()
(s)he should be allowed to produce such a monstrosity --- unless
it's an ingenuity such as drawing a background image on which
to draw coplot() ...
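The guarded reset proposed above can be isolated as a tiny helper -- a
sketch only, and `reset_new_safely` is a hypothetical name, not part of
the actual patch:

```r
## Sketch of the proposed guard: only reset par(new = FALSE) when a
## device is already open, so par() never warns about a missing plot.
reset_new_safely <- function() {
  if (dev.cur() > 1 && par("new"))   # dev.cur() == 1 is the null device
    par(new = FALSE)
}

reset_new_safely()   # with no device open, this is a silent no-op
```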

Martin Maechler



RE: [Rd] news file included in source but not binary package

2004-06-25 Thread Martin Maechler
> "AndyL" == Liaw, Andy <[EMAIL PROTECTED]>
> on Thu, 24 Jun 2004 19:33:00 -0400 writes:

>> From: Friedrich Leisch
>> 
 
>> But to get to the heart of the email (something similar
>> was proposed by Greg Warnes a few weeks ago): We should
>> definitely provide a simple mechanism to see the latest
>> changes in a package.
>> 
>> Question: I am aware of files called NEWS and ChangeLog
>> (or CHANGELOG, etc.) holding the relevant information
>> ... are there any others we want/need to respect?

AndyL> Yes, I do recall the thread that Greg started.  This
AndyL> is sort of trying to get it going again...  

yes, thanks, Andy!

AndyL> Could we just settle on a standard name and be done
AndyL> with it?  Since base R uses NEWS, why not just use
AndyL> that for all packages,

We emacs fans would definitely additionally want 'ChangeLog' (as
per Fritz' proposal).
This has been a standard name for decades with a very convenient
emacs interface ['C-x 4 a'] to create and update the file.

AndyL> and use NEWS.platform as Duncan suggested
AndyL> (in the top-level directory, rather than
AndyL> platform-specific directories)?

I agree.  

BTW, both NEWS and ChangeLog have specific (but different) syntax,
the main difference being that NEWS entries are anonymous and not
dated, whereas ChangeLog is a "log", i.e. has dated entries.
Kurt (or Fritz?) already mentioned in the earlier
'Greg-initiated' thread that it might be interesting to consider
automatic conversion of these files to (e.g.) html.  
But such possibilities should not hinder us from agreeing *now*
that e.g. files with regexps
--
m{NEWS.*}
m{Change(s|Log)}i
--
should automatically be taken from the source toplevel directory
and put into /doc/.txt
   
(the '.txt' may be a good idea because the ./doc/ directory
 can be easily opened in the browser in some interfaces)
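The two Perl-style regexps above could be translated into R along the
following lines -- a hypothetical sketch (the helper name `news_like`
is invented here), not an actual R CMD tool:

```r
## Sketch: select NEWS-/ChangeLog-like files from a source top level,
## mirroring m{NEWS.*} and the case-insensitive m{Change(s|Log)}i.
news_like <- function(files) {
  files[grepl("^NEWS", files) |
        grepl("^Change(s|Log)$", files, ignore.case = TRUE)]
}

news_like(c("NEWS", "NEWS.win", "ChangeLog", "CHANGES", "README"))
```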

Martin



Re: [Rd] function not in load table

2004-06-23 Thread Martin Maechler
> "Toralf" == Toralf Kirsten <[EMAIL PROTECTED]>
> on Wed, 23 Jun 2004 11:36:23 +0200 writes:

Toralf> Hi Martin, Vadim,
Toralf> 
>> 
>> really C, not C++ ?
>> [or did you compile by a C++ compiler instead of C ?]
>> I ask because for C++ that's really a FAQ

Toralf> It's really a C function.

ok

Toralf> wy.result <- wy.grps(data1=X1, grpdata=groups, nres=1, 
Toralf> alpha1=0.05, alpha2=0.05)
Toralf> Error in .C("wy_grps_R", as.double(X), as.integer(n1), as.integer(n2),  :
Toralf> C function name not in load table
Toralf> Execution halted
>> 
>> this really means that there's no exported C function named 'wy_grps_R'
>> from the dyn.loaded C code.
>> Do
>> nm -g  izbi.so
>> inside izbi/src/ in the shell to check.


Toralf> I checked the exported function as you mentioned above and I can see the 
Toralf> function named 'wy_grps_R' in the list (as you can see below)
Toralf> ...
Toralf> 4d80 T uvarWYdist
Toralf> 4790 T wy_clust_R
Toralf> 45a0 T wy_grps_R   <---
Toralf> 4450 T wy_uvar_R
Toralf> ...
Toralf> The T in the second column means it is available in the code
Toralf> segment, right?

yes.


Toralf> My .First.lib.R is as follows:
Toralf> .First.lib <- function(libname, pkgname) {
Toralf> library.dynam("izbi", package = pkgname, lib.loc = libname)
Toralf> data(COLS, package=pkgname)
Toralf> data(ROWS, package=pkgname)
>> 
Toralf> if (exists(".Dyn.libs")) remove(".Dyn.libs")
>> 
>> not sure if the above is a good idea.
>> What do you want it for?

Toralf> What do you think what is not a good practice?

only the last line (before which I had added an empty line)
removing .Dyn.libs.


Toralf> The COLS and ROWS are R objects which we use in R
Toralf> programs to replace the 1 and 0 used as col and row
Toralf> parameter.

(I don't understand but it's not relevant here anyway)

Toralf> Do you think we should use the command
Toralf> dyn.load("")
Toralf> for each C file instead of
Toralf> library.dynam(...)?

No, no, these are fine.  Do not change them.

Toralf> I also tried to specify the package name in this manner
Toralf> result <- .C("wy_grps_R",
Toralf> as.double(X),
Toralf> as.integer(n1),
Toralf> as.integer(n2),
Toralf> as.integer(p),
Toralf> as.integer(unlist(grpidx)),
Toralf> as.integer(grplen),
Toralf> as.integer(grpnum),
Toralf> as.character(WYFUN),
Toralf> as.double(alpha2),
Toralf> as.character(MINMAXFUN),
Toralf> WYdist=double(nres),
Toralf> as.integer(nres),
Toralf> test.value=double(grpnum),
Toralf> p.value=double(grpnum),
Toralf> PACKAGE="izbi")

Toralf> Unfortunately it didn't solve the problem.

yes, as I said in my e-mail.

I'm almost at the end of my hints here.
One thing -- which you have probably already tried -- is to use
is.loaded()
as in the following example

  > is.loaded(symbol.C("pam"))
  [1] FALSE
  > library(cluster)
  > is.loaded(symbol.C("pam"))
  [1] TRUE
  > 

but I presume you will just find that
is.loaded(symbol.C("wy_grps_R"))
gives FALSE for you even after  
library(izbi)

Is there nobody in Leipzig willing to delve into your package's
source?

Regards,
Martin



Re: [Rd] function not in load table

2004-06-23 Thread Martin Maechler
>>>>> "Toralf" == Toralf Kirsten <[EMAIL PROTECTED]>
>>>>> on Tue, 22 Jun 2004 18:52:51 +0200 writes:


Toralf> I apologize for this frequent/old question. I found
Toralf> some hints but couldn't solve the problem so far.

Toralf> I have C functions (incl. the header files) as well

really C, not C++ ?
[or did you compile by a C++ compiler instead of C ?]
I ask because for C++ that's really a FAQ

Toralf> as the R wrapper functions which I want to use for
Toralf> faster calculations. These functions are included in
Toralf> a R package.  The installation process seems to be
Toralf> ok (no errors). I also can load the package without
Toralf> errors. But when I call the function I got the
Toralf> following error

Toralf> wy.result <- wy.grps(data1=X1, grpdata=groups, nres=1, 
Toralf> alpha1=0.05, alpha2=0.05)
Toralf> Error in .C("wy_grps_R", as.double(X), as.integer(n1), as.integer(n2),  :
Toralf> C function name not in load table
Toralf> Execution halted

this really means that there's no exported C function named 'wy_grps_R'
from the dyn.loaded C code.
Do
nm -g  izbi.so
inside izbi/src/ in the shell to check.


Toralf> The parameter are
Toralf> data1 - result of read.table()
Toralf> grpdata - dito
Toralf> nres - integer
Toralf> alpha1 and alpha2 - factors (float)

Toralf> In the R function wy.grps(...) the C function is called by using the 
Toralf> statement

Toralf> result <- .C("wy_grps_R",
Toralf> as.double(X),
Toralf> as.integer(n1),
Toralf> as.integer(n2),
Toralf> as.integer(p),
Toralf> as.integer(unlist(grpidx)),
Toralf> as.integer(grplen),
Toralf> as.integer(grpnum),
Toralf> as.character(WYFUN),
Toralf> as.double(alpha2),
Toralf> as.character(MINMAXFUN),
Toralf> WYdist=double(nres),
Toralf> as.integer(nres),
Toralf> test.value=double(grpnum),
Toralf> p.value=double(grpnum))

Vadim mentions that you should add  PACKAGE= ""  here
which is true but not related to your problem.

Toralf> My .First.lib.R is as follows:
Toralf> .First.lib <- function(libname, pkgname) {
Toralf> library.dynam("izbi", package = pkgname, lib.loc = libname)
Toralf> data(COLS, package=pkgname)
Toralf> data(ROWS, package=pkgname)

Toralf> if (exists(".Dyn.libs")) remove(".Dyn.libs")

not sure if the above is a good idea.
What do you want it for?

Toralf> if (interactive() && getenv("DISPLAY") != "") x11()
Toralf> }


Toralf> I read something about R_CMethodDef in "Writing R
Toralf> Extensions" but I'm not really sure where I should
Toralf> write it, may in the .First.lib.R or in a separate
Toralf> file?

That's another nice thing -- but it all happens at the C level.
For an example of this see in the R sources
  R-1.9.1/src/library/stats/src/init.c

Martin Maechler



RE: [Rd] regarding saving R graphis images directly to directory

2004-06-21 Thread Martin Maechler
> "Vadim" == Vadim Ogranovich <[EMAIL PROTECTED]>
> on Sun, 20 Jun 2004 15:22:41 -0700 writes:

Vadim> If you just need to ignore the errors use try or tryCatch around the
Vadim> plotting functions.

Vadim> Re: jpeg and friends. R has a notion of current
Vadim> device where it sends all its graphics. If none is
Vadim> open R opens the default device for you, which
Vadim> happens to be X11 on your system. To use a different
Vadim> device just open it, say jpeg, and all graphics will
Vadim> go there until you close it or open yet another
Vadim> device. For the list of available devices see
Vadim> ?Devices

Vadim> It might be useful to have a null device which just
Vadim> silently ignores all graphics (aka /dev/null on
Vadim> UNIX), but I don't know if R has anything like this.


Vadim> P.S. This sort of question looks more appropriate for r-help. Just
Vadim> personal sensing, I am no master of policies.

but you are very right, Vadim.  
Saurin's question would have been appropriate only for R-help.

OTOH, your "P.S." above --- being a proposal for enhancing R ---
does well fit into R-devel.

I agree that it would be nice to have a
  nullDev() or dev.null()
graphics device which would efficiently discard all "plotting to
devices" graphics.  
Note however that it should *not* discard the building of GROBs
(graphical objects) [grid package], i.e. it would construct all
these also for all lattice (or nlme) graphics.  It would just
discard them when grobs are being 'printed' (i.e. plotted).

A -- quite inefficient -- but easy way on Unix-alikes
{i.e. everywhere but Windows},
would be to call, e.g.,   postscript(file = "/dev/null")
I assume there's something equivalent on modern Windows (?)
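As a sketch of the idea: besides the postscript(file = "/dev/null")
trick above, later R versions accept pdf(NULL), which opens a device
writing to no file at all and is portable (hedged: this option did not
exist at the time of the thread):

```r
## Sketch: a "null" graphics sink -- all plotting is computed but the
## output is discarded, since the pdf device has no output file.
pdf(NULL)       # device with no output file (later R versions)
plot(1:10)      # silently discarded
dev.off()       # close the device again
```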

Martin



Re: [Rd] is.integer() (PR#6984)

2004-06-18 Thread Martin Maechler
   {removed R-bugs from CC; as we've seen it's not a bug at all}

>>>>> "Wolfi" == Wolfgang Huber <[EMAIL PROTECTED]>
>>>>> on Thu, 17 Jun 2004 18:54:47 +0200 writes:

Wolfi> Hi Marcio,
Wolfi> it's not a bug, it's a well-documented feature. In the S language, 
Wolfi> numeric literals are floating point by default.

that hasn't been true for quite a while now:
In S-plus since version 5.0, these literals *are* integer.

> mode(9)
[1] "numeric"

and the above doesn't tell you anything, since the mode() of an
integer is "numeric" in any case.
You'd need  storage.mode(9) [S and R] or  typeof(9) [R only] to get the
``low-level mode''.

> is.integer(9)
[1] FALSE
> is.integer(as.integer(9))
[1] TRUE

BTW, I've been using the  9:9  shorthand instead of as.integer(9)
in cases where it enhances readability {e.g. in formulas}.
But note that really you shouldn't care in almost all
situations.
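The distinctions above can be checked directly -- a small sketch, using
only functions mentioned in the thread:

```r
## mode() cannot distinguish an integer from a double, but
## typeof() (R) / storage.mode() (S and R) can.
stopifnot(mode(9)     == "numeric",
          mode(9:9)   == "numeric",   # mode() is the same for both
          typeof(9)   == "double",
          typeof(9:9) == "integer",
          !is.integer(9),
          is.integer(as.integer(9)))
```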

Regards,
Martin Maechler

Wolfi> Best wishes
Wolfi> Wolfgang

Wolfi> [EMAIL PROTECTED] wrote:
>> Hello!
>> 
>> I'm not sure if is it a BUG or not...
>> I'm using R 1.9.0, and I used the command below:
>> 
>> 
>>> is.integer(9)
>> 
>> [1] FALSE
>> 
>> R manual contains this words about the is.integer() function:
>> 
>> "is.integer returns TRUE or FALSE depending on whether its argument is of 
>> integer type or not."
>> 
>> What's the problem? Am I wrong about the BUG report?
>> 
>> Thank you very much.
>> 
>> Márcio de Medeiros Ribeiro
>> Graduando em Ciência da Computação
>> Departamento de Tecnologia da Informação - TCI
>> Universidade Federal de Alagoas - UFAL
>> Maceió - Alagoas - Brasil
>> Projeto CoCADa



[Rd] R-devel and not R-help (was 'using "= matrix (...)" in .C calls')

2004-06-18 Thread Martin Maechler
>>>>> "PD" == Peter Dalgaard <[EMAIL PROTECTED]>
>>>>> on 17 Jun 2004 18:50:25 +0200 writes:

PD> Tony Plate <[EMAIL PROTECTED]> writes:
>> Also, this type of basic question is more appropriate for
>> R-help, not R-devel.

PD> Hmm. No. 

PD> (1) This list is for developers, and they can be beginners too. On
PD> r-help we cannot even assume that people know that C is a
PD> programming language.

yes, yes, yes!
In general, I believe people should shift more topics from
R-help to R-devel, rather than the other way around.  Indeed,
lately we have been seeing too many things on R-help that should
have gone to R-devel instead.

A very good example being the thread started by Vadim Ogranovich
with subject "mkChar can be interrupted" (on Mon, 14 Jun 2004).
That has been utterly unintelligible for probably more than
90% of the readers of R-help, and it may even have been the reason
for several "unsubscribe"s (from R-help) that I, as list maintainer,
saw happening more or less subsequently.

I'd say, informed questions should go to R-devel in many cases
where .C() is involved and certainly in all cases where .Call()
is used.  Even though we (R core) would like to promote the use
of .Call() as much as possible, for most R users, programming in
C (or C++, Java, Fortran) is a big step, and learning to use
SEXP's is a much bigger step most (unfortunately) never take.

BTW: I'd be glad for well-formulated suggestions along the lines
     above to be added to http://www.r-project.org/mail.html
 and/or the posting guide

Regards,
Martin Maechler


