Re: [Rd] Re: [R] Is k equivalent to k:k ?

2004-12-13 Thread Martin Maechler
 RichOK == Richard A O'Keefe [EMAIL PROTECTED]
 on Mon, 13 Dec 2004 10:56:48 +1300 (NZDT) writes:

RichOK I asked:
 In this discussion of seq(), can anyone explain to me
 _why_ seq(to=n) and seq(length=3) have different types?

RichOK Martin Maechler [EMAIL PROTECTED]
RichOK replied: well, the explanation isn't hard: look at
RichOK seq.default :-)

RichOK That's the efficient cause, I was after the final
RichOK cause.  That is, I wasn't asking what is it about
RichOK the system which MAKES this happen but why does
RichOK anyone WANT this to happen?

Sure, I did understand you quite well -- I was trying to joke
and used the  :-)  to signal the joke ..

MM now if that really makes your *life* simpler,
MM what does that tell us about your life ;-) :-)

{ even more  :-)   !! }

RichOK It tells you I am revising someone else's e-book
RichOK about S to describe R.  The cleaner R is, the easier
RichOK that part of my life gets.

of course, and actually I do agree for my life too, 
since as you may believe, parts of my life *are* influenced by R.

Apologies for my unsuccessful attempts at joking..


RichOK seq: from, to, by, length[.out], along[.with]

MM I'm about to fix this (documentation, not code).

RichOK Please don't.  There's a lot of text out there:
RichOK tutorials, textbooks, S on-line documentation, etc.,
RichOK which states over and over again that the arguments
RichOK are 'along' and 'with'.  

you meant
 'along' and 'length'

yes. And everyone can continue to use the abbreviated form as
I'm sure nobody will introduce a 'seq' method that uses
*multiple* argument names starting with along or length
(such that the partial argument name matching could become a problem).
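
The partial matching relied on here can be illustrated with a short sketch. The function seq2 below is hypothetical (it is not part of R) and exists only to show what would go wrong if some method did have two formals starting with 'length'.

```r
# With seq(), the abbreviation 'length' is completed to the formal
# 'length.out' by partial argument matching, so both calls are the same.
a <- seq(length = 3)
b <- seq(length.out = 3)
same_result <- identical(a, b)

# Hypothetical function (NOT in R) with two formals starting with 'length':
# the abbreviation 'length' is now ambiguous and the call is an error.
seq2 <- function(length.out = NULL, length.max = NULL) length.out
ambiguous <- tryCatch(
  { seq2(length = 3); FALSE },   # reached only if the call succeeds
  error = function(e) TRUE       # R refuses to guess between the two
)
```

Writing the full name (length.out, along.with) avoids any dependence on partial matching.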

RichOK Change the documentation, and people will start
RichOK writing length.out, and will that port to S-Plus?
RichOK (Serious question: I don't know.)

yes, as Peter has confirmed already.

Seriously, I think we wouldn't even have started using the ugly
.with or .out suffixes had it not been for S-PLUS
compatibility {and Peter has also given the explanation why there
*had* been a good reason for these suffixes in the past}.

Martin

__
[EMAIL PROTECTED] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Re: [R] Is k equivalent to k:k ?

2004-12-12 Thread Peter Dalgaard
Richard A. O'Keefe [EMAIL PROTECTED] writes:

 seq: from, to, by, length[.out], along[.with]
 
   I'm about to fix this (documentation, not code).
   
 Please don't.  There's a lot of text out there: tutorials, textbooks,
 S on-line documentation, etc., which states over and over again that
 the arguments are 'along' and 'with'.  Change the documentation, and 
 people will start writing length.out, and will that port to S-Plus?
 (Serious question:  I don't know.)

It will.

-- 
   O__   Peter Dalgaard Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics 2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark  Ph: (+45) 35327918
~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907



[Rd] Re: [R] Is k equivalent to k:k ?

2004-12-12 Thread Richard A. O'Keefe
I asked:

 In this discussion of seq(), can anyone explain to
 me _why_ seq(to=n) and seq(length=3) have different
 types?  

Martin Maechler [EMAIL PROTECTED] replied:
well, the explanation isn't hard:  look at  seq.default  :-)

That's the efficient cause, I was after the final cause.
That is, I wasn't asking what is it about the system which MAKES this
happen but why does anyone WANT this to happen?

now if that really makes your *life* simpler, what does that
tell us about your life  ;-) :-)

It tells you I am revising someone else's e-book about S to describe R.
The cleaner R is, the easier that part of my life gets.

 In the future, we really might want to have a new type,
 some long integer or index which would be used both in R
 and C's R-API for indexing into large objects where 32-bit
 integers overflow.

It is needed now for large file support and for Java interfacing.

 I assume, we will keep the R integer == C int == 32-bit int
 forever, but need something with more bits rather sooner than later.
 But in any case, by then, some things might have to change in
 R (and C's R-API) storage type of indexing.

seq: from, to, by, length[.out], along[.with]

I'm about to fix this (documentation, not code).

Please don't.  There's a lot of text out there: tutorials, textbooks,
S on-line documentation, etc., which states over and over again that
the arguments are 'along' and 'with'.  Change the documentation, and 
people will start writing length.out, and will that port to S-Plus?
(Serious question:  I don't know.)



[Rd] Re: [R] Is k equivalent to k:k ?

2004-12-10 Thread Martin Maechler
I'm diverting to R-devel,  where this is really more
appropriate.

 RichOK == Richard A O'Keefe [EMAIL PROTECTED]
 on Fri, 10 Dec 2004 14:37:16 +1300 (NZDT) writes:

RichOK In this discussion of seq(), can anyone explain to
RichOK me _why_ seq(to=n) and seq(length=3) have different
RichOK types?  

well, the explanation isn't hard:  look at  seq.default  :-)

RichOK In fact, it's worse than that (R2.0.1):

RichOK > storage.mode(seq(length=0))
RichOK [1] "integer"
RichOK > storage.mode(seq(length=1))
RichOK [1] "double"

  { str(.) is shorter than  storage.mode(.) }

RichOK If you want to pass seq(length=n) to a .C or
RichOK .Fortran call, it's not helpful that you can't tell
RichOK what the type is until you know n!  It would be nice
RichOK if seq(length=n) always returned the same type.  I
RichOK use seq(length=n) often instead of 1:n because I'd
RichOK like my code to work when n == 0; it would make life
RichOK simpler if seq(length=n) and 1:n were the same type.

now if that really makes your *life* simpler, what does that
tell us about your life  ;-) :-)

But yes, you are right.  All should return integer I think.
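
The n == 0 case that motivates seq(length=n) over 1:n can be sketched as follows. This is only an illustration: seq_len() is a later addition to R that did not exist when this thread was written, and the all-integer result of seq(length.out=) reflects current R versions rather than the R 2.0.1 behaviour quoted above.

```r
n <- 0L

bad  <- 1:n          # counts DOWN from 1 to 0: two elements, not zero
good <- seq_len(n)   # empty integer vector, so a loop body never runs

visits <- 0L
for (i in seq_len(n)) visits <- visits + 1L   # executes zero times

# In current R versions the type no longer depends on the value of n.
types <- vapply(c(0, 1, 3),
                function(k) typeof(seq(length.out = k)),
                character(1))
```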

BTW --- since this is now on R-devel where we discuss R development:
  
 In the future, we really might want to have a new type,
 some long integer or index which would be used both in R
 and C's R-API for indexing into large objects where 32-bit
 integers overflow.
 I assume, we will keep the R integer == C int == 32-bit int
 forever, but need something with more bits rather sooner than later.
 But in any case, by then, some things might have to change in
 R (and C's R-API) storage type of indexing.


RichOK Can anyone explain to me why the arguments of seq.default are
RichOK from, to, by, length.out, along.with
RichOK ^
RichOK when the help page for seq documents them as
RichOK from, to, by, length, and along?


Well I can explain why this wasn't caught by R's builtin 
QA (quality assurance) checks:

The base/man/seq.Rd page uses both \synopsis{} and \usage{},
which allows putting things on the help page that are not checked
for consistency with the code...
I'm about to fix this (documentation, not code).

Martin



Re: [Rd] Re: [R] Is k equivalent to k:k ?

2004-12-10 Thread Duncan Murdoch
On Fri, 10 Dec 2004 09:32:14 +0100, Martin Maechler
[EMAIL PROTECTED] wrote :

RichOK If you want to pass seq(length=n) to a .C or
RichOK .Fortran call, it's not helpful that you can't tell
RichOK what the type is until you know n!  It would be nice
RichOK if seq(length=n) always returned the same type.  I
RichOK use seq(length=n) often instead of 1:n because I'd
RichOK like my code to work when n == 0; it would make life
RichOK simpler if seq(length=n) and 1:n were the same type.

now if that really makes your *life* simpler, what does that
tell us about your life  ;-) :-)

But yes, you are right.  All should return integer I think.

Yes, it should be consistent, and integer makes sense here.

However, as a matter of defensive programming, one should almost
always explicitly set the type (using  as.integer for example) in a .C
or .Fortran call:  those languages care quite a bit about the storage
mode, and give bizarre and hard to debug errors when it is wrong.   If
you did this, you wouldn't care that seq(length=n) returns mode
double.

It might waste a few cpu cycles, but programmer debugging cycles are
much more expensive.  
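
A minimal sketch of that defensive style, assuming a C routine declared as void f(int *n, int *idx, double *x). The helper prepare_args and the routine name "my_routine" are made up for illustration; no compiled code is loaded here.

```r
# Hypothetical helper: coerce every argument to the storage mode the
# assumed C signature expects, regardless of what the caller passed in.
prepare_args <- function(n, x) {
  list(n   = as.integer(n),
       idx = as.integer(seq_len(n)),
       x   = as.double(x))
}

args  <- prepare_args(3, c(1, 2, 3))
modes <- vapply(args, storage.mode, character(1))

# A real call would then look like this (routine name is an assumption):
# .C("my_routine", args$n, args$idx, args$x)
```

With the coercion done at the call site, the type returned by seq() no longer matters to the compiled code.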

Duncan Murdoch



Re: [Rd] Re: [R] Is k equivalent to k:k ?

2004-12-10 Thread Roger D. Peng

Martin Maechler wrote:
I'm diverting to R-devel,  where this is really more
appropriate.

RichOK == Richard A O'Keefe [EMAIL PROTECTED]
   on Fri, 10 Dec 2004 14:37:16 +1300 (NZDT) writes:

RichOK In this discussion of seq(), can anyone explain to
RichOK me _why_ seq(to=n) and seq(length=3) have different
RichOK types?  

well, the explanation isn't hard:  look at  seq.default  :-)
RichOK In fact, it's worse than that (R2.0.1):
RichOK > storage.mode(seq(length=0))
RichOK [1] "integer"
RichOK > storage.mode(seq(length=1))
RichOK [1] "double"
  { str(.) is shorter than  storage.mode(.) }
RichOK If you want to pass seq(length=n) to a .C or
RichOK .Fortran call, it's not helpful that you can't tell
RichOK what the type is until you know n!  It would be nice
RichOK if seq(length=n) always returned the same type.  I
RichOK use seq(length=n) often instead of 1:n because I'd
RichOK like my code to work when n == 0; it would make life
RichOK simpler if seq(length=n) and 1:n were the same type.
now if that really makes your *life* simpler, what does that
tell us about your life  ;-) :-)
But yes, you are right.  All should return integer I think.
BTW --- since this is now on R-devel where we discuss R development:
  
 In the future, we really might want to have a new type,
 some long integer or index which would be used both in R
 and C's R-API for indexing into large objects where 32-bit
 integers overflow.
 I assume, we will keep the R integer == C int == 32-bit int
 forever, but need something with more bits rather sooner than later.
 But in any case, by then, some things might have to change in
 R (and C's R-API) storage type of indexing.

I'm very much in favor of this suggestion.  I too believe that more 
people will begin running into this problem as more 64 bit machines 
come alive with > 4GB of memory.  (I believe) we've run into this 
problem a few times when trying to load large image arrays.


RichOK Can anyone explain to me why the arguments of seq.default are
RichOK from, to, by, length.out, along.with
RichOK ^
RichOK when the help page for seq documents them as
RichOK from, to, by, length, and along?
Well I can explain why this wasn't caught by R's builtin 
QA (quality assurance) checks:

The base/man/seq.Rd page uses both \synopsis{} and \usage{},
which allows putting things on the help page that are not checked
for consistency with the code...
I'm about to fix this (documentation, not code).
Martin
--
Roger D. Peng
http://www.biostat.jhsph.edu/~rpeng/


Re: [Rd] Re: [R] Is k equivalent to k:k ?

2004-12-10 Thread Martin Maechler
 Duncan == Duncan Murdoch [EMAIL PROTECTED]
 on Fri, 10 Dec 2004 08:38:34 -0500 writes:

Duncan On Fri, 10 Dec 2004 09:32:14 +0100, Martin Maechler
Duncan [EMAIL PROTECTED] wrote :

RichOK If you want to pass seq(length=n) to a .C or
RichOK .Fortran call, it's not helpful that you can't tell
RichOK what the type is until you know n!  It would be nice
RichOK if seq(length=n) always returned the same type.  I
RichOK use seq(length=n) often instead of 1:n because I'd
RichOK like my code to work when n == 0; it would make life
RichOK simpler if seq(length=n) and 1:n were the same type.
 
 now if that really makes your *life* simpler, what does that
 tell us about your life  ;-) :-)
 
 But yes, you are right.  All should return integer I think.

Duncan Yes, it should be consistent, and integer makes sense here.

the R-devel version now does;  and so does  seq(along = .)

Also ?seq {or ?seq.default} now has the value section as

 Value:

  The result is of 'mode' 'integer' if 'from' is (numerically
  equal to an) integer and, e.g., only 'to' is specified, or also if
  only 'length' or only 'along.with' is specified.

which is correct {and I hope does not imply that it gives *all* cases of
an integer result}.

Martin



Re: [Rd] Re: [R] Is k equivalent to k:k ?

2004-12-10 Thread Prof Brian Ripley
On Fri, 10 Dec 2004, Peter Dalgaard wrote:
Martin Maechler [EMAIL PROTECTED] writes:

RichOK from, to, by, length.out, along.with
RichOK ^
RichOK when the help page for seq documents them as
RichOK from, to, by, length, and along?
Well I can explain why this wasn't caught by R's builtin
QA (quality assurance) checks:

The base/man/seq.Rd page uses both \synopsis{} and \usage{},
which allows putting things on the help page that are not checked
for consistency with the code...
I'm about to fix this (documentation, not code).

In the case of length, I think there's a historical explanation for
having the formal argument being a slightly lengthened version of what
you'd like to use as the actual argument: length is the obvious
choice of name for the argument, but if you used that in older
versions of S and R then it would mask the length() function and get
you in all sorts of trouble, or at least spit out a number of annoying
warning messages. (On a related note, you may have noticed that some
of the oldtimers still have knee-jerk reactions to people using `c'
and `t' for variable names). So call it something longer and let
partial matching allow users to use the short form.

With namespaces, base::length(v) would clear up the issue quite
nicely, as would the convention of looking for objects of mode
function if it is clear from the context that a function is needed.
However, seq() predates both of these features as far as I remember.

Indeed, seq() is a Blue Book function, but with args `length' and `along'. 
R seems to have followed S-PLUS 3.x in using length.out and along.with: 
they were there on 1998-03-06, the earliest copy I can get hold of from 
SVN.
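
The two remedies mentioned above (lookup by mode function, and namespace qualification) can be seen in a small sketch. The functions f and g are made up for illustration and describe current R, not the older versions of S and R under discussion.

```r
# 'length' is a number inside f, yet the call length(...) still finds the
# function: in call position R skips objects that are not functions.
f <- function(length) {
  length(c(10, 20, 30)) + length
}

# The namespace-qualified form makes the intent explicit.
g <- function(length) {
  base::length(c(10, 20, 30)) + length
}

res_f <- f(2)
res_g <- g(2)
```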

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [Rd] Re: [R] Is k equivalent to k:k ?

2004-12-10 Thread Prof Brian Ripley
On Fri, 10 Dec 2004, Martin Maechler wrote:
I'm diverting to R-devel,  where this is really more
appropriate.
In the future, we really might want to have a new type,
some long integer or index which would be used both in R
and C's R-API for indexing into large objects where 32-bit
integers overflow.
I assume, we will keep the R integer == C int == 32-bit int
forever, but need something with more bits rather sooner than later.
But in any case, by then, some things might have to change in
R (and C's R-API) storage type of indexing.

Indeed.  Assuming that seq() will always produce one type to pass to C 
code is dangerous.  Not so long ago someone asked why an R call had 
as.integer around a length, as ?length says the result is integer.
I replied that

1) This was liable to change and
2) Methods for generic functions were not forced to return the same thing 
as the documentation for the default method.

We are compelled to keep R integer == C int == Fortran integer as 
32-bit by backwards compatibility, as that is what all known 64-bit 
platforms do and hence what external libraries (notably libm/libc and 
Fortran support libraries) use.  This limits the length of R vectors to 
2^31-1, and that will start to bite fairly soon.  We do already have 
people using larger objects (as measured in bytes), and for example the 
return type of object.size got changed from int to double to accommodate 
such.  The C code has a type R_len_t that will eventually be used for
index and length computations, at least in C code.  The widespread use of 
Fortran for e.g. matrix computations limits what we can do.

Note that this was until recently only an issue for 64-bit operating 
systems, as the 32-bit OS 4Gb limit on a block of memory bites first 
except for raw vectors.
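
The 2^31-1 limit and the reason a byte count is better held in a double can be sketched in a few lines. This is an illustration only; the variable names are arbitrary.

```r
int_max  <- .Machine$integer.max        # largest R integer
is_limit <- int_max == 2^31 - 1

# Integer arithmetic past the limit gives NA (R also issues a warning) ...
overflow <- suppressWarnings(int_max + 1L)

# ... whereas a double holds such counts exactly (up to 2^53), which is
# why a size measured in bytes can be returned as a double.
big <- as.double(int_max) + 1
```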

--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595