from:"François Pinard"

Re: [R] Lisp-like primitives in R

2007-09-08 Thread François Pinard

[Peter Dalgaard]
[François Pinard]

I meant that R might have implemented a Scheme engine [...] with 
a surface language [...] which is purposely not Scheme, but could have 
been.  [...] one could dare to dream that the Scheme engine in R be 
completed, and Scheme offered as an alternate extension language.  
[...] there are excellent Scheme compilers.  [...]

Well, depending on what you want, this is either trivial or 
impossible...

I'm more leaning on the impossible side :-).

The internal storage of R is still pretty much equivalent to scheme.

R needs a few supplementary data types, and it motivated the R authors 
into re-implementing their own Scheme engine instead of relying on an 
existing implementation of a Scheme system.

  r2scheme <- function(e) [...]

Nice exercise! :-)

a parser that parses a similar language to R internal format is not 
a very hard exercise (some care needed in places). However, replacing 
the front-end is not going to make anything faster,

Of course.  The idea is nothing more than to please people eager to 
use Scheme instead of S as a surface language, here and there in 
scripts.  I merely thought that if the gap is small enough (so as not 
to require an extraordinary effort), it would be worth the leap.  One 
immediate difficulty to foresee is the name clashes between R and RnRS.
There might also be things missing in R (like continuations, say).

To make anything faster, and this is a totally different idea, one might 
consider replacing the back-end, not the front-end.  Writing good 
optimizing Scheme compilers is quite an undertaking, and if one only 
considers type inference (as a subproblem), this still is an active 
research area.  The Scheme engine in R was written so as to quickly get 
a working S (lexical scoping and some library issues notwithstanding).
My ramble was about switching this quick base of R to some solid Scheme 
implementation, rather than separately re-addressing compiling issues for R.

and the evaluation engine in R does a couple of tricks which are not 
done in Scheme, notably lazy evaluation,

Promises?  Aren't they already part of Scheme?  The main difference 
I saw is their systematic use in R argument passing.  All aspects of 
mere argument passing would require a lot of thought.  As you wrote, 
variable scope is another difficulty.  Offering a compatible C API, and 
library interface in general, might be a frightening but necessary 
challenge.  It's all more of a dream than a thought, actually... :-)
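
To make the point about promises concrete, here is a minimal sketch for a current R, not code from the thread; the names note, f, g and p are made up for the illustration.  R wraps each argument in a promise and evaluates it only when the function body first needs it, while delayedAssign() creates an explicit promise, roughly in the spirit of Scheme's delay/force.

```r
# Record the order in which things happen.
trace_log <- character(0)
note <- function(msg) trace_log <<- c(trace_log, msg)

# The argument is evaluated only when x is first used inside f.
f <- function(x) {
  note("entered f")
  x * 2
}
result <- f({ note("argument evaluated"); 21 })

# An argument that is never used is never evaluated, so stop() does not fire.
g <- function(x, y) x
unused <- g(1, stop("never evaluated"))

# An explicit promise: the expression runs when p is first read.
delayedAssign("p", { note("promise forced"); 10 })
value <- p + 1

trace_log
```

The log shows "entered f" before "argument evaluated", which is the systematic use of promises in argument passing mentioned above.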

Look up the writings of Luke Tierney on the matter to learn more.

Thanks for this interesting reference.

-- 
François Pinard   http://pinard.progiciels-bpi.ca

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Lisp-like primitives in R

2007-09-07 Thread François Pinard

[Duncan Murdoch]

You could also look at Ross Ihaka's paper that is online here:

http://cran.r-project.org/doc/html/interface98-paper/paper.html

Interesting read.  Thanks for this reference!

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Lisp-like primitives in R

2007-09-07 Thread François Pinard

[Roland Rau]
[François Pinard]

I wonder what happened, for R to hide the underlying Scheme so fully, 
at least at the level of the surface language (although there are 
hints).  

To further foster portability, we chose to write R in ANSI C

Yes, of course.  Scheme is also (often) implemented in C.  I meant that 
R might have implemented a Scheme engine (or part of a Scheme engine, 
extended with appropriate data types) with a surface language (nearly 
the S language) which is purposely not Scheme, but could have been.

If the gap is not extreme, one could dare to dream that the Scheme 
engine in R be completed, and Scheme offered as an alternate extension 
language.  If you allow me to continue daydreaming -- they told me 
they will let me free as long as I do not get dangerous! :-) -- part 
of the interest lies in the fact that there are excellent Scheme compilers.  
If we could only find or devise some kind of marriage between a mature 
Scheme and R, so as to speed up the non-vectorisable parts of R scripts...

If we are lucky and one of the original authors reads this thread they 
might explain the situation further and better [...].

In r-devel, maybe!  We would be lucky if the authors really had time to 
read r-help. :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Lisp-like primitives in R

2007-09-06 Thread François Pinard

[Chris Elsaesser]

 I mainly program in Common Lisp and use R for statistical analysis.  
 While in R I miss the power and ease of use of Lisp, especially its 
 many primitives such as find, member, cond, and (perhaps a bridge too 
 far) loop.  Has anyone created a package that includes R analogs to 
 a subset of Lisp functions?

[Greg Snow]

Not all of us are familiar with lisp [...]  If you tell us what find, 
member, cond, and loop do, or what functionality you are looking for, 
then we will have a better chance of telling you how to do the same in 
R.

Hi, my fRiends :-).

As far as I understand, R is built over what originally was a Scheme 
engine.  Scheme may be seen as a flavour of LISP (though I know people who 
would strongly object to seeing Scheme and Lisp in the same statement 
:-).  But it makes it rather likely that most functions you want already 
exist in R, even if under different names or syntax.
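
As a rough illustration of that last point, here are some base R counterparts to the Lisp primitives named in the question.  This is a sketch for a current R, not a claim about what was available when the thread was written; the function classify and the sample vector are made up.

```r
x <- c(3, 8, 15, 4)

# find-if / position-if: first element (or its index) satisfying a predicate.
first_big <- Find(function(v) v > 5, x)
first_pos <- Position(function(v) v > 5, x)

# member: %in% gives TRUE/FALSE, match() gives the index.
has_eight <- 8 %in% x
where_15  <- match(15, x)

# cond: a chain of if / else if / else.
classify <- function(v) {
  if (v < 5) "small" else if (v < 10) "medium" else "large"
}

# loop: sapply (or a plain for loop) applies the body to each element.
sizes <- sapply(x, classify)
sizes
```

Unlike Lisp's member, %in% returns a logical rather than the tail of the list, so code relying on that tail needs match() and explicit indexing instead.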

I wonder what happened, for R to hide the underlying Scheme so fully, at 
least at the level of the surface language (although there are hints).  
Wouldn't it have been natural to have the underlying Scheme exposed as 
an extension language for R, so one might write Scheme functions just as 
well as C or FORTRAN functions?  Is the engine so far from a real Scheme 
implementation, that such an idea was never reasonable?

About the idea of Lisp-inspired library functions...  Many Lisp 
flavours, Common Lisp likely included, have a comprehensive 
(tremendous?) set of primitives and library functions.  By comparison, 
Scheme is quite moderate, and does not go much beyond the essentials, 
something which much pleases me :-).  There also are many important 
differences between Common Lisp and Scheme (like for example, global 
dynamic scoping versus textual scoping).  If R was ever to offer 
Lisp-like interfaces, RnRS (Scheme standards) might be considered, both 
for being simpler, and more in the spirit of what R already is.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] R 2.5.1 - Rscript through tee

2007-08-31 Thread François Pinard

[Dirk Eddelbuettel]
[François Pinard]

#!/usr/bin/Rscript
options(echo=TRUE)
a <- 1
Sys.sleep(3)
a <- 2

 If I execute ./pp.R at the shell prompt, the output shows the 
 timely progress of the script as expected.  If I use ./pp.R | tee 
 OUT instead, the output seems buffered and I see it all at once at 
 the end.  [...] So, is there a way to tell R (or Rscript) that 
 standard output should be unbuffered, even if it is not directly 
 connected to a terminal?

 Use explicit print statements, e.g.  print(a <- 1)

Yes, I noticed that print statements get written.  But I wanted the 
mere echo trace of the execution of the script to be synchronous (as 
some statements take many seconds to compute, which I symbolically 
replaced by Sys.sleep above).

 Littler actually won't show anything unless you explicitly call 
 cat() or print(), but then it does [...]

It shares the limitation of Rscript, then.

 Littler is an 'all-in' binary and starts and runs demonstrably faster 
 than Rscript.

I'm not familiar with Littler.  Speedwise, Rscript is OK for me so far, 
as most time is spent within R computations, not much in language 
compilation or script interpretation.

 [...] the rather petty refusal of Rscript's main author to at least 
 give a reference to littler in Rscript's documentation, let alone 
 credit as 'we were there first', [...]

I've long been in academic circles (and elsewhere too), so I'm familiar 
with the need to recognize authorship and people's work.  However, 
perusing R mailing list archives, and following actual list contents, 
I'm sometimes surprised, and even a bit annoyed, by the recurrent hunger 
for credit I observe.  Of course, maintainers and contributors much 
deserve our thanks and, without going into arguments about what is due 
to whom, I think contributors receive praise on average, if only through 
all the interest shown by the community.  However, it gets a bit 
muddy when maintainers or contributors show bad temper when not 
receiving the systematic credit they would like to read.

Cicero's friends were telling him how upset they felt that there was 
still no statue of Cicero in the public square.  Cicero replied that he 
much preferred to hear people saying "Why no Cicero statue yet?" than 
to hear people saying "Why the Cicero statue?".  A wise attitude! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Synchronzing workspaces

2007-08-30 Thread François Pinard

[Paul August]

 I used to work on several computers and to use a flash drive to 
 synchronize the workspace on each machine before starting to work on 
 it.  I found that .RData always caused some trouble: often it is 
 corrupted even though there is no error in the copying process.  Does 
 anybody have a similar experience?

Not me.  I use flash drives a lot to move .RData files around, without 
the slightest trouble.  However, in my case, the involved machines are
similar in their architecture and system, so I was not fearing trouble.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Excel (off-topic, sort of)

2007-08-29 Thread François Pinard

[Alberto Monteiro]

 Maybe I'll write a letter to Santa Claus [there are people
 who write to congressman; they must have more faith than me].

:-) :-)

 I wish a language where I can write

  a = b + 10

 and then when I write

  a = 20

 the language automatically assigns b = 10.

METAFONT does this (and consequently, Metapost as well).  I still 
remember my surprise when I found out that Donald Knuth resorts to such 
sophisticated machinery for the sole purpose of designing font 
characters.  Knuth surely did many wonderful things :-).

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Max vs summary inconsistency

2007-08-27 Thread François Pinard

[Adam D. I. Kramer]

I'm having the following questionable behavior:

> summary(m)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      1   13000   26280   25890   38550   50910 
> max(m)
[1] 50912

...it seems to me like max() and summary(m)[6] ought to return the same
number.  Am I doing something wrong?

Some may say that you did not scrutinize the documentation enough, as 
summary artificially limits the number of significant digits.

However, this question recurs regularly on these mailing lists, so 
maybe something should at last be done about it, beyond documenting 
how it works.  Overall, too many users have been misled for one to 
bluntly assert that they are all wrong.

For example, summary could resort to scientific notation whenever 
non-significant zero digits would otherwise have been printed.  This 
would make it clearer that the printing precision was artificially limited.
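
The discrepancy in the question is consistent with rounding to four significant digits, which can be checked directly.  This is a small sketch with made-up data sharing the same maximum; how summary formats its result differs between R versions, so only the signif() arithmetic is asserted here.

```r
# Made-up data whose maximum matches the one in the question.
m <- c(1, 13000, 26280, 38550, 50912)

exact   <- max(m)            # the true maximum
rounded <- signif(exact, 4)  # what four significant digits leave of it

exact
rounded

# Asking summary for more digits avoids the surprise.
summary(m, digits = 7)
```

Here exact is 50912 and rounded is 50910, which is the pair of values reported in the question.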

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] subset using noncontiguous variables by name (not index)

2007-08-26 Thread François Pinard

[Muenchen, Robert A (Bob)]

I'm using the subset function to select a list of variables, some of
which are contiguous in the data frame, and others of which are not. It
works fine when I use the form:

subset(mydata,select=c(x1,x3:x5,x7))

In reality, my list is far more complex. So I would like to store it in
a variable to substitute in for c(x1,x3:x5,x7) but cannot get it to
work. That use of the c function seems to violate R rules, so I'm not
sure how it works at all. A small simulation of the problem is below.  

mydata <- data.frame(
  x1=c(1,2,3,4,5),
  x2=c(1,2,3,4,5),
  x3=c(1,2,3,4,5),
  x4=c(1,2,3,4,5),
  x5=c(1,2,3,4,5),
  x6=c(1,2,3,4,5),
  x7=c(1,2,3,4,5)
)
mydata

# This does what I want.
summary(subset(mydata, select=c(x1, x3:x5, x7)))

Maybe:

  variables <- expression(c(x1, x3:x5, x7))

and later:

  summary(subset(mydata, select=eval(variables)))

However, I do not know how one computes the expression piecemeal, that 
is, better than by building a string and parsing the result.
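
One way to build the selection piecemeal without parsing strings is to work with column positions looked up by name.  This is only a sketch of an alternative, not the subset() idiom above; the helper name_range is hypothetical and written here for the illustration.

```r
mydata <- data.frame(x1 = 1:5, x2 = 1:5, x3 = 1:5, x4 = 1:5,
                     x5 = 1:5, x6 = 1:5, x7 = 1:5)

# Hypothetical helper: positions of all columns from `from` to `to`, by name.
name_range <- function(df, from, to) {
  nms <- names(df)
  match(from, nms):match(to, nms)
}

# Assemble the selection piece by piece and keep it in a variable.
idx <- c(match("x1", names(mydata)),
         name_range(mydata, "x3", "x5"),
         match("x7", names(mydata)))

selected <- mydata[, idx]
summary(selected)
```

Since idx is an ordinary integer vector, it can be extended with c() as the list of variables grows.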

-- 
François Pinard   http://pinard.progiciels-bpi.ca


[R] R 2.5.1 - Rscript through tee

2007-08-26 Thread François Pinard

Hi, people.

I ran into a little problem for which someone might have a solution.  Let's 
say I have an executable file (named pp.R) with these contents:

   #!/usr/bin/Rscript
   options(echo=TRUE)
   a <- 1
   Sys.sleep(3)
   a <- 2

If I execute ./pp.R at the shell prompt, the output shows the timely 
progress of the script as expected.  If I use ./pp.R | tee OUT 
instead, the output seems buffered and I see it all at once at the end.

The problem does not come from the tee program, since if I use this 
command:

   (echo a; sleep 5; echo b) | tee OUT

the output is timely, not batched.

So, is there a way to tell R (or Rscript) that standard output should be 
unbuffered, even if it is not directly connected to a terminal?

In case useful, here is local R information:

Version:
 platform = x86_64-unknown-linux-gnu
 arch = x86_64
 os = linux-gnu
 system = x86_64, linux-gnu
 status = 
 major = 2
 minor = 5.1
 year = 2007
 month = 06
 day = 27
 svn rev = 42083
 language = R
 version.string = R version 2.5.1 (2007-06-27)

Locale:
LC_CTYPE=fr_CA.UTF-8;LC_NUMERIC=C;LC_TIME=fr_CA.UTF-8;LC_COLLATE=fr_CA.UTF-8;LC_MONETARY=fr_CA.UTF-8;LC_MESSAGES=fr_CA.UTF-8;LC_PAPER=fr_CA.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=fr_CA.UTF-8;LC_IDENTIFICATION=C

Search Path:
 .GlobalEnv, package:stats, package:utils, package:datasets, fp.etc, 
package:graphics, package:grDevices, package:methods, Autoloads, package:base

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Turning a logical vector into its indices without losing its length

2007-08-24 Thread François Pinard

[Leeds, Mark (IED)]

I have the code below which gives me what I want for temp based on
invec but I was wondering if there was a shorter way (i.e., a
one-liner) without having to initialize temp to zeros.  This is
purely for learning purposes. Thanks.

invec <- c(TRUE,FALSE,FALSE,TRUE,FALSE,FALSE,TRUE,FALSE)

temp <- numeric(length(invec))
temp[invec] <- which(invec)
temp

[1] 1 0 0 4 0 0 7 0

A mere:

   invec * seq_along(invec)

would do it.  To be honest, I dislike the multiplication trickery, and 
so prefer Gabor's solution, even if a bit longer:

   ifelse(invec, seq_along(invec), 0)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Does anyone.... worth a warning?!? No warning at all

2007-08-20 Thread François Pinard

[Ted Harding]

 [...] a very important point.  [...] There are a lot of idiosyncrasies 
 in R, which in time we get used to; but learning about them is 
 something of a sociological exercise, just as one learns that when 
 one's friend A says "X Y Z" it may not mean the same as when one's 
 friend B says it.  [...] Another example is in the use of %*% for 
 matrix multiplication when one or both of the factors is a vector.  
 [...] Just a few thoughts.  As I say we all get used to this stuff in
 the end, but it can be bewildering (and a trap) for beginners.

Using R is a bit akin to smoking.  Beginnings are difficult, one may get 
headaches, and even gag on the first experiences.  But in the long run, 
it becomes pleasurable, and even addictive.  Yet, deep down, for those 
willing to be honest, there is something not fully healthy in it.

While I appreciate many of the virtues of R, as a language, it has a few 
flaws.  Besides, as a library, and despite many commendable symmetries 
and beauties, it sometimes suffers from irregularities in its various 
specifications and offerings -- likely for historical reasons -- maybe 
lack of coordination while aging, or maybe needs of S compatibility.

These irregularities are sometimes documented clearly, yet in many 
cases, exegesis is required.  Moreover, around documentation, there is 
a question of attitude.  While some R maintainers are refreshingly 
open-minded, others are strongly reluctant to reconsider anything which 
has been written, as if the mere fact of documenting a detail was fixing 
it in the universe and eternity; they would then argue to death against 
the slightest change.  In a word, because they are almost impossible to 
repair in practice, R idiosyncrasies are likely to stay.

Accepting them (idiosyncrasies, irregularities) is part of the game.  
Correcting them a tiny bit at a time (like, for example, the mean 
behaviour at the origin of this thread) might overall take forever and 
shake myriads of electrons within tons of discussions.  I'm not sure it
is a worthwhile undertaking.  For my part, I prefer learning to be productive 
with R as it stands, even knowing it could have been a bit better.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] how to collapse a list of 1 column matrix to a matrix?

2007-08-19 Thread François Pinard

[EMAIL PROTECTED]

I have a situation where I have a list whose elements are column 
matrices. Say,

$'1'
[,1]
1
2
3

$'2'
[,1]
4
5
6

Is there fast way to collapse the list into a matrix like a cbind 
operation in this case? Meaning, the result should be a matrix that 
looks like: 

     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6

I can loop through all elements and do cbind manually. But I think 
there must be a simpler way that I don't know. Thank you.

The do.call function is the R equivalent of the "apply" found in many 
other languages.  I guess that, in R, the name apply was already taken :-)
For example:

> a = list(x=matrix(1:3, 3, 1), y=matrix(4:6, 3, 1))
> a
$x
     [,1]
[1,]    1
[2,]    2
[3,]    3

$y
     [,1]
[1,]    4
[2,]    5
[3,]    6

> do.call(cbind, a)
     [,1] [,2]
[1,]    1    4
[2,]    2    5
[3,]    3    6


-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] how to collapse a list of 1 column matrix to a matrix?

2007-08-19 Thread François Pinard

[EMAIL PROTECTED]
 One more question. After I collapse everything into one matrix, 
 I would like to find the index of column that holds minimum value for 
 each row. I remember that there is a function like maxCols but I can't 
 seem to find the same thing for minimum value.  Any suggestion please?

Here is a possible avenue:

> z <- matrix(sample(1:25), 5)
> z
     [,1] [,2] [,3] [,4] [,5]
[1,]   16    2    9   24    7
[2,]   21   19   22   23   18
[3,]   12    3    5   13   15
[4,]   20    4   25   11   10
[5,]   17    1    8   14    6
> apply(z, 2, which.min)
[1] 3 5 3 4 5

I would presume (yet I did not recently check) that do.call, 
which.min, and a flurry of other useful functions, are introduced in 
various R tutorials.  If you plan to use R seriously, it might be worth 
scrutinizing a few of those.
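
Note that a margin of 2 in apply gives, for each column, the row holding its minimum.  The question asked for the column holding the minimum of each row, which would use a margin of 1.  A minimal sketch with the same 5x5 matrix as printed above, typed in explicitly so the result is reproducible; max.col() on the negated matrix is offered as a base R alternative, with ties broken by taking the first column.

```r
# The matrix from the transcript, filled column by column.
z <- matrix(c(16, 21, 12, 20, 17,
               2, 19,  3,  4,  1,
               9, 22,  5, 25,  8,
              24, 23, 13, 11, 14,
               7, 18, 15, 10,  6), nrow = 5)

# Column index of the minimum in each row.
row_min_col <- apply(z, 1, which.min)

# Same thing through max.col on the negated values.
alt <- max.col(-z, ties.method = "first")

row_min_col
```

For this matrix both give 2 5 2 2 2.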

Keep happy!

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Combine matrix

2007-08-16 Thread François Pinard

[Gianni Burgin]
Let's say something like this

a=matrix(1:25, nrow=5)

rownames(a)=letters[1:5]
 colnames(a)=rep("A", 5)

 a
  A  A  A  A  A
a 1  6 11 16 21
b 2  7 12 17 22
c 3  8 13 18 23
d 4  9 14 19 24
e 5 10 15 20 25

 b=matrix(1:40, nrow=8)
 rownames(b)=c(rep("a",4),rep("b",4))
 colnames(b)=rep("B", 5)

 b
  B  B  B  B  B
a 1  9 17 25 33
a 2 10 18 26 34
a 3 11 19 27 35
a 4 12 20 28 36
b 5 13 21 29 37
b 6 14 22 30 38
b 7 15 23 31 39
b 8 16 24 32 40

As a result I would like something like

  A  A  A  A  A  B  B  B  B  B
a 1  6 11 16 21  1  9 17 25 33
a 1  6 11 16 21  2 10 18 26 34
a 1  6 11 16 21  3 11 19 27 35
a 1  6 11 16 21  4 12 20 28 36
b 2  7 12 17 22  5 13 21 29 37
b 2  7 12 17 22  6 14 22 30 38
b 2  7 12 17 22  7 15 23 31 39
b 2  7 12 17 22  8 16 24 32 40

Is that clear? Is there a function that automates this operation?

Like, maybe:

   cbind(a[rownames(b),], b)



-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] function to find coodinates in an array

2007-08-16 Thread François Pinard

[Ana Conesa]

 I am looking for a function/way to get the array coordinates of given
 elements in an array. What I mean is the following:
 - Let X be a 3D array
 - I find the ordering of the elements of X by ord <- order(X)
   (this returns me a vector)
 - I now want to find the x,y,z coordinates of each element of ord

[Moshe Olshansky]

If your array's dimensions were KxMxN and the linear
index is i then
n <- ceiling(i/(K*M))
i1 <- i - (n-1)*(K*M)
m <- ceiling(i1/K)
k <- i1 - (m-1)*K

and your index is (k,m,n)

The reshape package might be helpful, here.  If I understand the problem 
correctly, given this artificial example:

   X <- sample(1:24)
   dim(X) <- c(2, 3, 4)

you would want:

   library(reshape)
   melt(X)[order(X), -4]

which gives the indices in a three-column data frame.
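
For those without the reshape package, the index arithmetic quoted above can be wrapped in a small function and checked against base R's arrayInd().  This is a sketch only; lin2sub is a made-up name, and the seed is set merely to make the example reproducible.

```r
# Convert linear indices i of a K x M x N array into (k, m, n) triples,
# following the ceiling-based arithmetic quoted in the thread.
lin2sub <- function(i, K, M) {
  n  <- ceiling(i / (K * M))
  i1 <- i - (n - 1) * (K * M)
  m  <- ceiling(i1 / K)
  k  <- i1 - (m - 1) * K
  cbind(k = k, m = m, n = n)
}

set.seed(1)
X <- sample(1:24)
dim(X) <- c(2, 3, 4)
ord <- order(X)

coords  <- lin2sub(ord, K = 2, M = 3)  # hand-rolled version
builtin <- arrayInd(ord, dim(X))       # base R equivalent

head(coords)
```

Indexing X by the coordinate matrix, as in X[coords], returns the elements in increasing order, which confirms that the triples line up with order(X).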

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] help with counting how many times each value occur in each column

2007-08-10 Thread François Pinard

[Gabor Grothendieck]

   table(col(mat), mat)

Clever, simple, and elegant! :-)
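
To see why this works: col(mat) has the same shape as mat and holds each cell's column number, so cross-tabulating it against the values counts every value within every column.  A small sketch on a matrix built to resemble the one in the question; the construction below is an assumption made for the illustration, not the poster's actual data.

```r
# A 20 x 10 matrix of -100, with a few 0 and -50 entries sprinkled in.
mat <- matrix(-100, nrow = 20, ncol = 10)
mat[c(1, 19), 4:9] <- 0
mat[10, 4] <- -50
mat[5, 10] <- -50

# One row per column of mat, one column per distinct value.
counts <- table(column = col(mat), value = mat)
counts
```

The result is indexed by character labels, so counts["4", "-100"] gives the number of -100 entries in column 4, which is 17 here.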

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] help with counting how many times each value occur in each column

2007-08-10 Thread François Pinard

[Tom Cohen]

  I have the following dataset and want to know how many times each value 
 occurs in each column.

 > data
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,] -100 -100 -100    0    0    0    0    0    0  -100
 [2,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [3,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [4,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [5,] -100 -100 -100 -100 -100 -100 -100 -100 -100   -50
 [6,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [7,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [8,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
 [9,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[10,] -100 -100 -100  -50 -100 -100 -100 -100 -100  -100
[11,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[12,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[13,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[14,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[15,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[16,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[17,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[18,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
[19,] -100 -100 -100    0    0    0    0    0    0  -100
[20,] -100 -100 -100 -100 -100 -100 -100 -100 -100  -100
  The result matrix should look like
   -100 0 -50
[1]   20  
[2]   20
[3]   20
[4]   17
[5]   18
[6]   18
[7]   18  and so on 
[8] 
[9] 
[10]

Presuming that data is a matrix, one could try a sequence like this:

dataf <- factor(data)
dim(dataf) <- dim(data)
result <- t(apply(dataf, 2, tabulate, nlevels(dataf)))
colnames(result) <- levels(dataf)
result

If you want the columns sorted, you might decide the order of the levels 
on the factor() call, or explicitly reorder columns afterwards.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Bind together two vectors of different length...

2007-07-30 Thread François Pinard

[Andris Jankevics]

I have two vectors:
A <- c(1:10)
B <- seq(1,10,2)

Now I want to make a table from vectors A and B as rows, and if a value of A 
isn't present in B, then I want to put a N/A symbol in it:

Output should look like this:

1 2 3 4 5 6 7 8 9 10 
1 0 3 0 5 0 7 0 9 0

How can I do this in R?

Either of:

  A[!A %in% B] <- NA
  A[!A %in% B] <- 0

depending on what you want your N/A symbol to be.
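
Both assignments overwrite A in place.  If the two-row table from the question is wanted with A left intact, a variant along these lines might do; it is a sketch, and the names second_row and tab are made up.

```r
A <- 1:10
B <- seq(1, 10, 2)

# Keep the values of A that also occur in B, and mark the others as NA.
second_row <- ifelse(A %in% B, A, NA)

# Stack the untouched A on top of the filtered copy.
tab <- rbind(A, B = second_row)
tab
```

Replacing NA by 0 in the ifelse() call reproduces the output shown in the question.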

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] avoiding timconsuming for loop renaming identifiers

2007-07-20 Thread François Pinard

[EMAIL PROTECTED]

I was wondering if I can avoid a time-consuming for loop on my 60 
obs dataset.

school_id   y
8   9.87
8   8.89
8   7.89
8   8.88
20  6.78
20  9.99
20  8.79
31  10.1
31  11

There are, say, 143 different schools in this 60 obs dataset.
I need to have sequential identifiers, 1,2,3,4,5,...,143.

Hello, Toby.  Maybe:

   dta$id <- cumsum(c(1, diff(dta$school_id) != 0))
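
The cumsum trick relies on the rows of each school being contiguous, as they are in the excerpt.  A small sketch on the rows shown in the question, with a match()-based variant that does not depend on row order; the column name id2 is made up for the comparison.

```r
dta <- data.frame(
  school_id = c(8, 8, 8, 8, 20, 20, 20, 31, 31),
  y = c(9.87, 8.89, 7.89, 8.88, 6.78, 9.99, 8.79, 10.1, 11)
)

# A new identifier starts whenever school_id changes from one row to the next.
dta$id <- cumsum(c(1, diff(dta$school_id) != 0))

# Alternative: number the schools by order of first appearance.
dta$id2 <- match(dta$school_id, unique(dta$school_id))

dta
```

On grouped data the two columns agree; if the same school could reappear further down, only the match() variant would give it its original number.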

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] A More efficient method?

2007-07-04 Thread François Pinard

[Keith Alan Chamberlain]

Is there a faster way than below to set a vector based on values
from another vector? I'd like to call a pre-existing function for
this, but one which can also handle an arbitrarily large number of
categories. Any ideas?

Cat=c('a','a','a','b','b','b','a','a','b') # Categorical variable
C1=vector(length=length(Cat))  # New vector for numeric values

# Cycle through each column and set C1 to corresponding value of Cat.
for(i in 1:length(C1)){
   if(Cat[i]=='a') C1[i]=-1 else C1[i]=1
}

C1
[1] -1 -1 -1  1  1  1 -1 -1  1
Cat
[1] a a a b b b a a b

For handling an arbitrarily large number of categories, one may go
through a recoding vector, like this for the example above:

> Cat <- c('a', 'a', 'a', 'b', 'b', 'b', 'a', 'a', 'b')
> C1 <- c(a=-1, b=1)[Cat]
> C1
 a  a  a  b  b  b  a  a  b
-1 -1 -1  1  1  1 -1 -1  1

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Off topic:Spam on R-help increase?

2007-03-10 Thread François Pinard

[Marc Schwartz]

The Human Spam Filter (aka Martin) [...]

The R mailing list has, indeed, been remarkably spam-free, and 
well-managed as far as I can see.  I do hope, however, that Martin 
does not have to do the filtering himself -- it would be just daunting!

In any case, Martin, a lot of thanks from me!

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Conflict in .Rprofile documentation FAQ vs. Help?

2007-01-15 Thread François Pinard

[Brian Ripley]

No one actually said it was a *working* example [...]

Do you mean that, whenever we see something presented as an example 
within or around the R system, we should not take it as dependable 
unless it is explicitly said to be working?

 (and it is enclosed in \dontrun{})

Within the R online help system, many examples are marked so they are
not run.  I naively thought they were not run for friendly reasons, like 
for example, not inordinately impacting the user's environment.  Should 
I read you as saying that those examples are not to be believed?

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] A question about R environment

2007-01-10 Thread François Pinard

[Philippe Grosjean]

Please, don't reinvent the wheel: putting functions in a dedicated 
environment is one of the things done by R packages (together with a 
good documentation of the function, and making them easily installable 
on any R implementation).

[...] this is probably the time for you to read the Writing 
R extensions manual, and to start implementing your own R package!

Hi, Philippe, and gang!

I read this manual long ago and used it to create packages already.  You 
really got the impression I did not read it? :-)

You know, there are small wheels, and huge wheels.  I do not see why 
I would use complex devices for tiny problems, merely because those 
complex devices exist.  R packages undoubtedly have their virtues, of 
course.  But just like many statistical tests, they do not always apply.

Why go at length organising package directories populated with many 
files, resorting to initialisation scripts, using package validators,
creating documentation files and processing them, go through the cycle 
of creating a package and installing it, all that merely for a few small 
quickies that fit very well in the ubiquitous .Rprofile file?  Why worry 
about installation on any R implementation, for little things only meant 
for myself, and too simple to warrant publication anyway?

Keep happy, all.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] scripts with littler

2007-01-08 Thread François Pinard

[John Lawrence Aspden]

I'm trying to write R scripts using littler (under Debian), and was
originally using the shebang line:

#!/usr/bin/env r

However this picks up any .RData file that happens to be lying around, which
I find a little disturbing, because it means that the script may not behave
the same way on successive invocations.

If you drop the /usr/bin/env trick then 

#!/usr/bin/r --vanilla

seems to work, but it also prevents the loading of the libraries in my home
directory, some of which I'd like to use.

#!/usr/bin/r --no-restore

doesn't work at all.

Ideally I'd like #!/usr/bin/env r --no-restore

Has anyone else been round this loop and can offer advice?

I usually do something like:


#!/bin/sh
R --slave --vanilla <<EOF

   R script goes here...

EOF

# vim: ft=r


If you need to search special places for packages, you may tweak 
exported environment variables between the first and second line.




-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] scripts with littler / subroutines

2007-01-08 Thread François Pinard

[John Lawrence Aspden]

Another difficulty I'm having is creating a common function (foo, say) to
share between two scripts.

In your previous message, you were telling us that you want to load from 
your home directory.  You might put the common functions there, maybe?
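
A minimal sketch of that suggestion, assuming the shared functions live in a single file that each script sources at startup.  The file name common.R and the function foo() are only illustrative, and a temporary directory stands in for the home directory so that the sketch runs anywhere.

```r
# Hypothetical location; in practice something like ~/R/common.R.
shared_dir <- tempfile("home")
dir.create(shared_dir)
common <- file.path(shared_dir, "common.R")

# The shared function is written once, in one place.
writeLines("foo <- function(x) x * 2", common)

# Each script begins by sourcing the shared file; here it is loaded
# into a separate environment to keep the workspace clean.
script_env <- new.env()
source(common, local = script_env)
result <- script_env$foo(21)
```

A script that does not mind foo() landing in its workspace could simply call source() on the file without the local argument.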

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] A question about R environment

2007-01-08 Thread François Pinard

[Tong Wang]

I created environment mytoolbox by: mytoolbox <- 
new.env(parent=baseenv()).  Is there any way I can put it in the search 
path?  In a project, I often write some small functions, and load them 
into my workspace directly, so when I list the objects with ls(), it 
looks pretty messy.  So I am wondering if it is possible to create an 
environment, and put these tools into this environment.  For example, 
I have functions fun1(), fun2() ... and create an environment mytoolbox 
which contains all these functions.  And it should be somewhere in the 
search path:  .GlobalEnv mytoolbox package:methods.

Here is a trick, shown as a fairly simplified copy of my ~/.Rprofile.  
It allows for a few simple functions always available, yet without 
having to create a package, and leaving ls() and any later .RData file 
unencumbered.

The idea is to use local() to prevent any unwanted clutter from leaking out 
(my real ~/.Rprofile holds more than shown below and uses temporary 
variables), to initialise a list meant to hold a bunch of functions or 
other R things, and to save that list on the search path.

This example also demonstrates a few useful functions for when I read the 
R mailing list.  I often need to transfer part of emails containing
code excerpts within the window where R executes, while removing 
quotation marks, white lines and other noise.  I merely highlight-select 
part of the message with the mouse, and then, within R, do things like:

   xs()   source the highlighted region
   xd()   read in a data.frame
   xm()   read in a matrix
   xe()   evaluate and print an expression
   xv()   read a list of values as a vector

The list above is in decreasing order of usefulness (for me).  Except for 
xs(), which has no automatic printout, you may either let the others 
print what they got, or assign their value to some variable.  Arguments 
are also possible, for example like this:

   xd(T)  read in a data.frame when the first line holds column names



if (interactive()) {
    local({

        fp.etc <- list()

        # Each reader below opens the cleaned-up selection as a text
        # connection and hands it to the matching R input function.
        fp.etc$xsel.vector <- function (...) {
            connexion <- textConnection(xselection())
            on.exit(close(connexion))
            scan(connexion, ...)
        }
        fp.etc$xsel.dataframe <- function (...) {
            connexion <- textConnection(xselection())
            on.exit(close(connexion))
            read.table(connexion, ...)
        }
        fp.etc$xsel.matrix <- function (...) {
            connexion <- textConnection(xselection())
            on.exit(close(connexion))
            data.matrix(read.table(connexion, ...))
        }
        fp.etc$xsel.eval <- function (...) {
            connexion <- textConnection(xselection())
            on.exit(close(connexion))
            eval(parse(connexion, ...))
        }
        fp.etc$xsel.source <- function (...) {
            connexion <- textConnection(xselection())
            on.exit(close(connexion))
            source(connexion, ...)
        }

        # Fetch the selection, drop empty lines, then strip a common
        # left margin of quotation or prompt characters, or of spaces.
        fp.etc$xselection <- function ()
        {
            lignes <- suppressWarnings(readLines('clipboard'))
            lignes <- lignes[lignes != '']
            stopifnot(length(lignes) != 0)
            marge <- substr(lignes, 1, 1)
            while (all(marge %in% c('>', '+', ':', '|'))
                   || all(marge == ' ')) {
                lignes <- substring(lignes, 2)
                marge <- substr(lignes, 1, 1)
            }
            lignes
        }

        # Short aliases for interactive use.
        fp.etc$xv <- fp.etc$xsel.vector
        fp.etc$xd <- fp.etc$xsel.dataframe
        fp.etc$xm <- fp.etc$xsel.matrix
        fp.etc$xe <- fp.etc$xsel.eval
        fp.etc$xs <- fp.etc$xsel.source

        attach(fp.etc, warn=FALSE)

    })
}

# vim: ft=r
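
Stripped of the clipboard helpers, the core of the trick can be sketched in a few lines.  The names toolbox and hello() are made up for the illustration; only local(), list() and attach() come from the profile above.

```r
local({
    toolbox <- list()
    toolbox$hello <- function() "hello from the toolbox"
    # attach() copies the list into a new entry on the search path.
    attach(toolbox, name = "toolbox", warn.conflicts = FALSE)
})

on_path      <- "toolbox" %in% search()   # the list is on the search path
in_workspace <- exists("hello", envir = globalenv(), inherits = FALSE)
greeting     <- hello()                   # found through the search path
```

Here hello() is callable from the prompt, yet ls() does not list it and it would not be saved in .RData.  Calling detach("toolbox") removes the entry again.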


-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] read.fwf and header

2006-11-01 Thread François Pinard

[Martin Maechler]

In my (and probably R-core's) view, read.fwf() should only have
to be used for ``legacy data files'' (those times when people used *no*
separators in order to save disk space), since nowadays, such
data files should automatically have correct separators. 

In my day-to-day experience, the main virtue for fixed width format 
files is basic, humble legibility, much more than disk space savings.  
The FWF files I see have delimiters between fields, but also embedded 
space within fields, or at end of fields, without extraneous quotes.  
XML markup, CSVs, quoted fields, etc. are devices meant for helping 
machines much more than for helping humans.  They significantly decrease 
legibility.  Humans not only know better, they decipher fixed width 
format easily enough not to need hairier devices in general.

FWF files may be archaic, they are not obsolescent.  They will resist 
the fashion of the day for complexity, and survive in the long run.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Suitability of R for Algorithm simulations

2006-10-22 Thread François Pinard

Hi, people.

A correspondent drew my attention to a reply I sent to r-help a few 
weeks ago, quoted below.  I must have been tired when I sent it.
Please replace Eiffel by Erlang throughout.  Sorry for this error.

  Date: 2006-10-05 00:43:36
  Message-ID: [EMAIL PROTECTED]

  [Ethan B. Fini]

   I would like to be able to instantiate an object for each node in my
   simulated  (stand alone, one computer) distributed environment and
   then proceed by (a)  adding message exchange functionality and (b)
   algorithm  behavior to each node.

  Not so long ago, I quickly glanced at Eiffel after an enthusiastic
  friend told me about it, and while I do not think I will soon use it for
  myself, Eiffel might be the right choice for you, being strong on
  light-weight processes and message passing, from what I've read...

  If I had a simulation problem to tackle nowadays, I'd likely consider
  Python supplemented with greenlets from the pylib library, mainly because
  I'm fond of Python's legibility, and have reasonably good confidence in
  the people who implemented greenlets.

   The simulation results are represented on a GUI [...]

  The GUI aspects of Eiffel are unknown to me, I did not dive deep enough
  to touch them.  For Python, I'd use pygtk, but there are many toolkits
  to choose from.

   Is R suitable for what I am trying to do? I looked around but have not
   been able to determine if R is the appropriate platform.

  R libraries are especially good at statistics and graphics.  The
  language in itself is much oriented towards vectorisation, among other
  things, and this might be convenient for a speedy implementation of some
  simulation problems.  If vectorisation could not be turned into an
  advantage for you with R, it is likely that R might be slow for such
  problems, and also not so well adapted to quasi-parallelism between
  interacting processes having each their own behaviour.

  Of course, seasoned R users might have much more sound opinions than
  mine on this topic! :-)

--
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Suitability of R for Algorithm simulations

2006-10-04 Thread François Pinard

[Ethan B. Fini]

 I would like to be able to instantiate an object for each node in my 
 simulated  (stand alone, one computer) distributed environment and 
 then proceed by (a)  adding message exchange functionality and (b) 
 algorithm  behavior to each node.

Not so long ago, I quickly glanced at Eiffel after an enthusiastic 
friend told me about it, and while I do not think I will soon use it for 
myself, Eiffel might be the right choice for you, being strong on 
light-weight processes and message passing, from what I've read...

If I had a simulation problem to tackle nowadays, I'd likely consider 
Python supplemented with greenlets from the pylib library, mainly because 
I'm fond of Python's legibility, and have reasonably good confidence in 
the people who implemented greenlets.

 The simulation results are represented on a GUI [...]

The GUI aspects of Eiffel are unknown to me, I did not dive deep enough 
to touch them.  For Python, I'd use pygtk, but there are many toolkits 
to choose from.

 Is R suitable for what I am trying to do? I looked around but have not 
 been able to determine if R is the appropriate platform. 

R libraries are especially good at statistics and graphics.  The 
language in itself is much oriented towards vectorisation, among other 
things, and this might be convenient for a speedy implementation of some 
simulation problems.  If vectorisation could not be turned into an 
advantage for you with R, it is likely that R might be slow for such 
problems, and also not so well adapted to quasi-parallelism between 
interacting processes having each their own behaviour.

Of course, seasoned R users might have much more sound opinions than 
mine on this topic! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] FW: Large datasets in R

2006-07-18 Thread François Pinard

[Thomas Lumley]

People have used R in this way, storing data in a database and reading it 
as required. There are also some efforts to provide facilities to support 
this sort of programming (such as the current project funded by Google 
Summer of Code:  http://tolstoy.newcastle.edu.au/R/devel/06/05/5525.html). 

Interesting project indeed!  However, if R needs more swapping 
because arrays do not all fit in physical memory, crudely replacing 
swapping with database accesses is not necessarily going to buy
a drastic speed improvement: the paging gets done in user space instead 
of being done in the kernel.

Long ago, I worked on CDC mainframes, astonishing at the time but 
tiny by today's standards, which ran a program able to invert or do 
simplexes on very big matrices.  I do not remember the name of the 
program, and never studied it but superficially (I was in computer 
support for researchers, but not a researcher myself).  The program was 
documented as being extremely careful at organising accesses to rows and 
columns (or parts thereof) in such a way that real memory was best used.
In other words, at the core of this program was a paging system very 
specialised and cooperative with the problems meant to be solved.

However, the source of this program was just plain huge (let's say from 
memory, about three or four times the size of the optimizing FORTRAN 
compiler, which I already knew better as an impressive algorithmic 
undertaking).  So, right or wrong, the prejudice stuck solidly in me at 
the time, if nothing else, that handling big arrays the right way, 
speed-wise, ought to be very difficult.

One reason there isn't more of this is that relying on Moore's Law has 
worked very well over the years.

On the other hand, the computational needs for scientific problems grow 
fairly quickly to the size of our ability to solve them.  Let me take
weather forecasting for example.  3-D geographical grids are never fine 
enough for the resolution meteorologists would like to get, and the time 
required for each prediction step grows very rapidly, to increase 
precision by not so much.  By merely tuning a few parameters, these 
people may easily pump nearly all the available cycles out of the 
supercomputers given to them, and they do so without hesitation.  
Moore's Law will never succeed at calming their starving hunger! :-).

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] [Rd] R as shell script

2006-07-14 Thread François Pinard

[Juha Vierinen]
Hi,

Hello, Juha.  Your request, quoted below, is likely more appropriate for
R help than for R devel, so I'm redirecting this reply there.

I am considering if I should invest in learning R. Based on the
language definition and introductory documents, it seems nice. But now
I am faced with a problem: I want to be able to run R programs easily
from the unix shell, and write scripts that can automatically select R
as the interpreter:

#!/usr/bin/R
cat("Hello world.\n")

This of course doesn't work, because /usr/bin/R is a shell script.

I have been able to create a binary wrapper that calls R with the
correct arguments, which is documented here:

http://kavaro.fi/mediawiki/index.php/Using_R_from_the_shell

This still lacks eg. standard input (but I have no idea how I can
implement it in R) and full command line argument passing (can be
done), but am I on the right track, or is there already something that
does what I need?

I'm often using something like:

   #!/bin/sh
   R --slave --vanilla <<EOF

   # Your R source code goes here!

   EOF

Within your script, shell substitution for $1, etc., will occur.  So 
with a bit of imagination, you can do about anything :-).  Simple 
enough!  Make sure you `cat' or `print' explicitly whatever has to be 
written on standard output: for one, I usually prefer full control in 
scripts over automatic printing of given expressions.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] incomplete final line found by readLines on ...

2006-06-30 Thread François Pinard

[Taka Matzmoto]

Is there any way to prevent [this] warning message.

Hi, Taka.  The easiest might be using the suppressWarnings wrapper.
See ?suppressWarnings for more information.
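
As a small sketch of that wrapper at work, the following writes a temporary file whose last line has no terminating newline (the situation that triggers the warning) and reads it back quietly.  The file and its contents are invented for the example.

```r
path <- tempfile(fileext = ".txt")
cat("first line\nsecond line", file = path)   # no newline after the last line

# Without the wrapper, readLines(path) warns about the incomplete final line.
lignes <- suppressWarnings(readLines(path))
unlink(path)
```

Note that the wrapper hides every warning raised by the call, not just this one.  In current versions of R, readLines(path, warn = FALSE) is a narrower alternative.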

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] how to rotate a triangle image(ZMAT) ?

2006-06-27 Thread François Pinard

[Cleber N.Borges]

 How can I align this zmat (triangle image) with the X axis?  I would like 
 the triangle's base to lie along the X axis and the triangle's height 
 along the Y axis.  Is there some trick to do this?

I'm not fully sure of what is the base and the height of the triangle, 
but if I guess correctly, you may peek at ?image, the last paragraph 
of the Details: section, and also in the Examples: section, where it 
says "Need to transpose and flip matrix horizontally."  Maybe you'll 
find some explanations or ideas in there.

 f <- function(x, y) {
   z = 1-x-y
   z[z < (-1e-15)] <- NA
   return(-100*x+0*y+100*z)
 }

 x = seq(1, 0, by = -0.01)
 y = seq(1, 0, by = -0.01)
 zmat = outer(x, y, f)

 image(zmat, col=terrain.colors(10))
 contour(zmat, add=T)

Another idea is to exchange x, y in the outer call, and maybe also 
use rev() on one of them.
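
Here is a rough sketch of what those two manipulations do to the matrix, on a coarser grid than the one in your message so it is easy to inspect.  Which combination gives the orientation you are after is something I did not check; the sketch only shows that swapping the arguments transposes the matrix and that rev() flips it.

```r
f <- function(x, y) {
    z <- 1 - x - y
    z[z < -1e-15] <- NA
    -100 * x + 0 * y + 100 * z
}
x <- seq(1, 0, by = -0.25)
y <- seq(1, 0, by = -0.25)
zmat <- outer(x, y, f)

# Exchanging the roles of x and y transposes the matrix ...
swapped <- outer(y, x, function(a, b) f(b, a))
# ... and reversing one argument flips it along that dimension.
flipped <- outer(rev(x), y, f)

same_as_transpose <- identical(swapped, t(zmat))
same_as_row_flip  <- identical(flipped, zmat[length(x):1, ])
```

Either result can then be passed to image() and contour() in place of zmat.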

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Install R 2.3.1 on SUSE Linux

2006-06-15 Thread François Pinard

[EMAIL PROTECTED]

I am new to Linux, and I am trying to install R 2.3.1 on SUSE Linux
10.0. The RPM installer, YAST, states that I need libgfortran.so.0.

There is a SuSE 10.0 machine somewhere.  Yes: I installed R on it,
and it works well there:

   $ rpm -qf /usr/lib/libgfortran.so
   gcc-fortran-4.0.2_20050901-3

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] flipping a plot vertically?

2006-06-15 Thread François Pinard

[Tim Brown]

This seems like an obvious question but I can't find the answer in the 
par help document --- I'd like to make a plot where the 0,0 point is 
in the top left of the screen rather than bottom left... .  [...] Any 
suggestions?

You might retry your plot, adding an ylim=c(HIGHEST, LOWEST) argument,
that is, listing the maximum before the minimum.  For example:

   plot(1:10, ylim=c(10, 1))

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Re-binning histogram data

2006-06-08 Thread François Pinard

[Berton Gunter]

 I would argue that histograms are outdated relics and that density  
 plots (whatever your favorite flavor is) should **always** be used 
 instead these days.

When a now retired researcher paid us a visit, I showed him a density 
plot produced by R over some data he had worked on a lot before he left.
I, too, find them rather sexy, and I wanted to impress him with some of 
the pleasures of R, especially knowing he had been a dedicated user of 
SAS in his time.  Yet, this old and wise man _immediately_ caught that 
the density curve was leaking a tiny bit through the extrema.

Not a big deal of course -- and he did like what he saw.  Nevertheless, 
this reminded me that we should be careful not to dismiss too lightly 
years of accumulated knowledge, experience and know-how, merely because 
we give in to joyful enthusiasm for more recent things.

Let me make a comparison, looking at the R mailing lists themselves.  
Some would much like sending HTML email in here: they would get colours,
use various fonts, offer links, and have indentation which dynamically 
adapts on the receiving end to the window size of the reading guy.  But 
the collective wisdom is to stick to non-HTML email, which is quite 
proven and still very functional, after all.  Some impatient people or 
dubious tools use other things than fixed-width fonts while presenting
text/plain email, or merely ignore the usual 79-column limit and other 
oldish etiquette issues while sending it: in the last analysis, they kibitz 
the community more than they help it, and deep down, are a bit selfish.  
There is a long way to go before HTML email is really ubiquitous and 
correctly supported.  Consider the long time MIME took to establish 
itself: even now, email readers correctly supporting MIME are hard to 
find -- most are fond of gadgets much more than they know standards.

Another comparison which pops to my mind is how some people fanatically 
try to impose UTF-8 all around, saying that ASCII or ISO-8859-1 (and 
many others) are part of the prehistory of computers.  When mere users,
they can always talk without doing too much damage.  But I've seen 
a few maintainers going overboard on such matters, consciously breaking 
software to force their convictions forward: "Crois ou meurs!" as we say 
in French (approximately: "Believe or perish!").  Here, just like for 
HTML mail or nicer bitmapped R graphics, Unicode does have technical 
merit; the truth is that we are _far_ from mastering everything about 
it, and there are lots of open issues that are not strictly technical.

Many proponents of these various things are tempted to say that they want 
to clean out the planet of "outdated relics" (I liked your expression!)
and have the honest feeling they do trigger overall progress.  Moreover, 
new good things do not necessarily make older things wrong.  In a word, 
we should rather wait for progress with calm, and with respectful care 
of what already exists.  Progress will impose itself slowly over time, 
and is not so much in need of forceful evangelists. :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Edit function

2006-06-07 Thread François Pinard

[Pikounis, Bill [CNTUS]]

 view <- function(x) {
   warnopt <- options()$warn
   options(warn=-1)
   on.exit({sink(); options(warn=warnopt)})
   edit(x)
   invisible()
 }

I'm surprised by the necessity of sink().  Presuming it is necessary 
indeed, the above could be simplified a bit like this (untested) code:

  view <- function(x) {
      on.exit(sink())
      invisible(suppressWarnings(edit(x)))
  }

The documentation for suppressWarnings is not overly clear about whether 
the warn option is restored or not in case of error.  It says:

 'suppressWarnings' evaluates its expression in a context that
 ignores all warnings.

My exegesis :-) for that sentence would be that the context does not 
survive the error, and so, the warn option is not changed.
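
That reading can be checked with a short experiment, sketched below: an expression that warns and then fails is run under suppressWarnings, and the warn option is compared before and after.  As far as I know, current versions of R implement suppressWarnings with calling handlers rather than by touching the option, which is consistent with the result.

```r
warn_before <- getOption("warn")

# The expression warns and then fails; try() keeps the error from
# stopping the script.
outcome <- try(suppressWarnings({
    warning("this warning is ignored")
    stop("this error still propagates")
}), silent = TRUE)

warn_after <- getOption("warn")
option_untouched <- identical(warn_before, warn_after)
failed <- inherits(outcome, "try-error")
```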

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] How can you buy R?

2006-05-18 Thread François Pinard

[Damien Joly]

 The entity has a policy of only using software that has been 
 purchased and properly licensed (whatever that means).  [...]
 Any ideas?

[Rogerio Porto]

I think there isn't such a vendor.

A while ago, the Cygnus organisation was created to address this 
kind of need, betting on the fact that they could live well on support 
contracts for free software, mainly GPL'ed software, which R is.  Since 
then, Cygnus has been bought by Red Hat, and I do not know if the 
original vocation survived, or was plain lost.  With enough luck, 
it could be useful to check on this side, who knows... :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] text plots?

2006-05-12 Thread François Pinard

[Robert Citek]

Is there a way to do text plots in R?  I'd like to do some simple XY 
plots in R with the output in text  (ascii).  Since I connect to 
a remote Linux machine using SSH, being able to generate a rough idea 
of what a plot will look like in text would be of benefit.

Note that it is easy with SSH to open a graphics connection: you may use 
`ssh -X' to force it.  Then, R will show you nice graphics even if run 
remotely.

For example, with gnuplot I can do the following:
  echo 'set terminal dumb ; plot sin(x)' | gnuplot
to generate a simple sin wave.  Regards,

Amusing for me that you mention this: I wrote that code, many years ago.
Although gnuplot was aiming at higher graphic output quality on average, my 
contribution was readily accepted, and considered useful.

While it is possible to attach images within an email, rough graphics
is sometimes simpler and sufficient.  I do not know how easy (or not) it 
would be writing a dumb device for R, but I wish that if someone ever 
contributes it, it will be accepted by the core team.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] command completion?

2006-05-10 Thread François Pinard

[Duncan Murdoch]
[Robert Citek]

 Does R have command or object name completion?

[...] I don't think it would be a welcome change to the console 
versions; some of them use readline's filename completion which would 
almost certainly be broken by this.

We have to put things in perspective, here.  In my opinion, object name 
completion would be a lot more useful than filename completion, because 
in R, we name R objects much more often than we name files.

Others need to run under ESS.

While this is a good thing for Emacs lovers, the requirement is rather 
unwelcome for pagans!  :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] On the speed of apply and alternatives?

2006-05-09 Thread François Pinard

[Monty B. ]

I have to handle a large matrix (1000 x 10001) where in the
last column I have a value that all the preceding values in the same row
have to be compared to.  I have made the following code :

# generate a (1000 x 10001) matrix, testm
# generate statistics matrix 1000 x 4:

qnt <- c(0.01, 0.05)
cmp_fun <- function(x)
{
  LAST <- length(x)
  smpls <- x[1:(LAST-1)]
  real  <- x[LAST]

  ret <- vector(length=length(qnt)*2)
  for (i in 1:length(qnt))
  {
    q_i <- quantile(smpls, qnt[i])     # the quantile i
    m_i <- mean(smpls[smpls < q_i])    # mean of obs less than q_i
    ret[i] <- ifelse(real < q_i, 1, 0)
    ret[length(qnt)+i] <- ifelse(real < q_i, real - m_i, 0)
  }
  ret
}
hcvx <- apply(testm, 1, cmp_fun)

Can anyone advise as to how I can optimize the runtime of this problem?  
All suggestions are welcome!

You may speed it up a bit, not so much, with the following:

stats.testm <- function (testm, qnt=c(0.01, 0.05)) {
    quants <- apply(testm[, 1:(ncol(testm)-1)], 1, quantile, qnt)
    smpls <- testm[rep(1:nrow(testm), each=length(qnt)), 1:(ncol(testm)-1)]
    reals <- testm[rep(1:nrow(testm), each=length(qnt)), ncol(testm)]
    keeps <- smpls < rep(quants, ncol(smpls))
    means <- rowSums(smpls * keeps) / rowSums(keeps)
    matrix(rbind((reals < quants) + 0,
                 (reals < quants) * (reals - means)),
           length(qnt) * 2)
}

Try it with something like:

gen.testm <- function (n, m) {
    matrix(sample(0:99, n * (m + 1), TRUE), n)
}

testm <- gen.testm(100, 100)
stats.testm(testm)

Without checking, I would suspect that quantile is the big consumer.
If you could make it without quantile interpolation, maybe some more
vectorisation could be possible, but in any case, I do not think you can 
avoid sorting each row separately, in one way or another (currently done 
within quantile).

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] screen wrapping

2006-05-04 Thread François Pinard

[Robert Citek]

How can I increase/decrease the line length for screen wrapping?

Check ?options, and within it, "width"; it might be what you want.
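
For instance, a minimal sketch (the value 60 is arbitrary):

```r
old <- options(width = 60)     # options() returns the previous values
narrow <- getOption("width")
print(seq(1000, 30000, by = 1000))   # wraps at about 60 columns

options(old)                   # restore the earlier setting
```

Keeping the value returned by options() makes it easy to go back to the former line length afterwards.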

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] install R under suse: packages dependency

2006-05-03 Thread François Pinard

[zhihua li]

I'm trying to install R 2.3.0 under Suse 10.0.   As I'm using SSH to 
login into the SUSE server, I can't use YAST2,

I presume this is because you cannot remotely mount the CD's or DVD's?  
The next time you visit your server, if possible, copy your distribution 
media to your hard disks, you'll find out that this is really a useful 
thing to do.  You can later use YaST2 to install from the copies you 
made, even remotely.  There is no problem using YaST2 over SSH, either 
in graphical mode (if you used `ssh -X') or in text mode.

In my experience, R 2.3.0 installs painlessly under SuSE 10.0, and needs 
nothing which is not already available on the distribution media.  
Should I say, I'm still impressed (even astonished) that R installation 
succeeds so easily, given the size and complexity of the distribution.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] plot cdf

2006-04-27 Thread François Pinard

[Romain Francois]

[...] it would be useful to add an option 'ask' in 'example', maybe 
with a default to TRUE in interactive mode

Seconded.  `example(...)' would be more friendly for the average user.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] deleting rows with the same ID if any meet a condition

2006-04-27 Thread François Pinard

[gblevins]

If x2 equals 2 then I want to delete all the rows for that person from 
the dataframe--see Before and After below.

Before
x1 <- c(1,1,1,2,2,3,3,3)
x2 <- c(2,3,3,1,1,4,4,2)
x3 <- data.frame(x1,x2)
 x3
  x1 x2
1  1  2
2  1  3
3  1  3
4  2  1
5  2  1
6  3  4
7  3  4
8  3  2

After
  x1 x2
1  2  1
2  2  1

You might try:

  subset(x3, !x1 %in% x1[which(x2==2)])
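
As a minimal sketch of why this works (the names bad_ids and kept are only illustrative, they are not part of the original post), the same selection can be spelled out in two steps: first collect the IDs having any x2 equal to 2, then keep only the rows whose ID is not among them.

```r
x1 <- c(1, 1, 1, 2, 2, 3, 3, 3)
x2 <- c(2, 3, 3, 1, 1, 4, 4, 2)
x3 <- data.frame(x1, x2)

# IDs (x1 values) for which at least one row has x2 == 2.
bad_ids <- unique(x3$x1[x3$x2 == 2])

# Keep only the rows whose ID was not flagged.
kept <- x3[!(x3$x1 %in% bad_ids), ]
print(kept)
```

The surviving rows keep their original row names (4 and 5 here); rownames(kept) <- NULL would renumber them from 1, if the numbering matters.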

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Is there a way....

2006-04-26 Thread François Pinard

[Levent TERLEMEZ]

I would like to get rid of counting lines in fix() when I made 
a mistake in coding. Is there an easy way to add line numbers to the editor?

You may configure R to use the editor you want, the way you want.  For 
example, if you want fix() to start Vim in graphical mode, highlighting 
file contents using R syntax, and numbering source lines, you might use
the following R command:

options(editor = "gvim -c \"set nu\" -c \"set ft=r\"")

Of course, how to do this much depends on each editor.  Some simpler 
editors may not have options for displaying the line number on each 
line.  Yet, most keep the line counter for the cursor position updated 
in some mode or status bar, so if nothing better, you can learn to 
position the cursor while keeping an eye on that. :-)

Once you find the proper incantation for your editor, and if you always 
want it activated, save the R command within your ~/.Rprofile file.


P.S. - By the way, much congratulations and thanks to the R Core team 
for the recent publication of R 2.3.0.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] how to draw a circle

2006-04-22 Thread François Pinard

[Jian Zhang]

how to draw a circle (e.g. radius=10cm) of one point?
And how to choose these points in the circle?

There also are ellipse functions in both packages car and ellipse.
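
Without any extra package, a circle may also be sketched in base R by computing points along its circumference, and points may be selected with a plain distance test.  The helper names circle_points and inside_circle below are made up for this illustration, and the units are plot units rather than centimetres.

```r
# Points along a circle of radius r centred at (cx, cy).
circle_points <- function(cx, cy, r, n = 200) {
  theta <- seq(0, 2 * pi, length.out = n)
  data.frame(x = cx + r * cos(theta), y = cy + r * sin(theta))
}

# TRUE for points lying within (or on) the circle.
inside_circle <- function(px, py, cx, cy, r) {
  (px - cx)^2 + (py - cy)^2 <= r^2
}

circ <- circle_points(0, 0, 10)
pts <- data.frame(x = c(0, 5, 9, 12), y = c(0, 5, 9, 0))
pts$inside <- inside_circle(pts$x, pts$y, 0, 0, 10)

if (interactive()) {
  # asp = 1 keeps the circle round on screen.
  plot(circ$x, circ$y, type = "l", asp = 1, xlab = "x", ylab = "y")
  points(pts$x, pts$y, col = ifelse(pts$inside, "blue", "red"))
}
```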

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] I am surprised (and a little irritated)

2006-04-19 Thread François Pinard

[Tom Backer Johnsen]
 [...] I've understood that RPM's are somewhat like installing 
 programs on Windows, so that was downloaded and started with YAST.  
 [...] Then I discover to my big surprise that the readme file says 
 that I need to have eight installed packages.  Then it says "Most of 
 them are included in a standard install."  [...] someone should get 
 the OpenSuse people to include R in the installation.

[Gabor Csardi]

I'm irritated as well. Your email should go to some suse mailing list, 
this is a suse problem, it has (almost) nothing to do with R.

We are running regular (Pro?) SuSE systems at various distribution 
levels on a flurry of machines, and install R from sources on these 
machines wherever needed.  We have no experience with OpenSuse, however.
My notes say that *I* should pay attention to have the following 
packages pre-installed, besides those which are already usual for us:

   gcc-fortran, libjpeg-devel, readline-devel, tcl-devel, tk-devel

I'm not sure about tk-devel.  But these are all available on the CDs.

R installation from sources goes surprisingly well for us, using SuSE.  
"Surprisingly" is a euphemism here, "astonishingly" is more proper, 
given the size and complexity of R sources, components, and all release 
engineering.  I'm always quite impressed that such software works!  
There is a tremendous amount of work behind a successful distribution, 
which many of us do not suspect enough! :-)  It forces admiration.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] R and ViM

2006-04-18 Thread François Pinard

[Michael Graber]

[...] I'd like to be able to use R together with ViM.  [...]  My 
question now is, whether there are already people out there knowing how 
to do this in a similar easy way as with Emacs [...]

I've been an Emacs user for a very long time, and then, switched to Vim.
See http://pinard.progiciels-bpi.ca/opinions/editors.html, if you feel 
curious, for a few personal thoughts on Emacs.

For R, I tried sticking to a mere interactive shell, taking advantage of 
the GNU readline interface built into R, with Vim as an external 
editor.  For sending R code from Vim to R, one merely selects the code 
to send within Vim using the mouse, and pastes it directly with the mouse 
in the interactive shell window running R.  Simple and comfortable!  :-)

Emacs offers ESS, which has many interesting features.  However, despite 
being quite attractive, it did not fully seduce me: a bit because I try to 
avoid returning to Emacs keystroke habits, a bit because ESS is 
heavyweight compared to the Vim + R-in-a-shell solution, a bit because ESS 
adds distracting idiosyncrasies, like scrolling differently or opening 
extra windows at times.  R already offers enough options I could customize 
if I want to read help in a browser or a pager, and at good speed.  (Of 
course, if you use a heavy browser, you feel it; but "links -g" is OK!)

An ESS nicety that my current setup does not really replace is the 
automatic highlighting of R output.  One of the advantages of this 
output highlighting is visually spotting R requests and replies.  As 
a compromise, I'm using this bit of a kludge in my Rprofile file:

if (interactive()) {
    local({
        # Editor used by edit() and fix().
        options(editor='vim -c "set ft=r"')
        if (Sys.getenv('TERM') %in% c('rxvt', 'xterm')) {
            # Width, in columns, of the coloured marker ending each line.
            onglet = 2
            options(prompt=paste(sep='',
                formatC('', width=80-onglet), '\033[;30;45m',
                formatC('', width=onglet), '\033[0m\n',
                options('prompt')))
        }
    })
}

The "set ft=r" bit ensures proper highlighting and coloration within 
Vim, whenever edit() or fix() are used.  Here "vim" could be replaced by 
"gvim" or "gvim -f", say.  (In my Vim configuration, "vim" uses the GUI 
automatically if started within X; or uses the console mode otherwise.)

Then, the R prompt is modified to visually mark each request-reply 
interaction with a white separating line holding a small violet marker 
at the right.  It works nicely for me in almost all circumstances (there 
are a few, uncommon exceptions).  Usual scrolling of the shell window 
allows me to quickly find R commands and replies, even if much less 
colourful than with ESS.  I'm ready to pay that price for simplicity.

A last trick which is convenient in my case.  My X window manager allows 
customization of keystrokes.  (I'm using Openbox, but surely many other 
window managers offer that possibility too.)  For all 26 of 
Ctrl-Alt-Letter, the same small openbox-helper (Python) script of mine 
is called with the Letter given as an option, which may launch 
applications in turn.  This is how Ctrl-Alt-R opens a shell window 
running R, and Ctrl-Alt-M opens a shell window running Maxima.  In both 
these shells, Ctrl-D closes the application and the window.  This is 
convenient for quick mathematical jobs, and quite in the spirit of Vim 
(fast and easy start/exit, instead of long running like Emacs).

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] What does rbind(iris[,,1], iris[,,2], iris[,,3]) do?

2006-04-13 Thread François Pinard

[Gabor Grothendieck]
What you are referring to as iris is called iris3 in R, so just replace
iris with iris3.   iris3 is a 3d array in R whereas iris is a data frame.

Thanks for this calm and simple reply.  Some could learn from you! :-)
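
For the record, here is a small sketch of what the call from the subject line does once iris3 is used; the name stacked is only illustrative.  Each slice iris3[, , k] is a 50 x 4 matrix for one species, and rbind() piles the three slices on top of each other.

```r
dim(iris3)           # 50 flowers x 4 measurements x 3 species

# Stack the three per-species matrices into one 150 x 4 matrix.
stacked <- rbind(iris3[, , 1], iris3[, , 2], iris3[, , 3])
dim(stacked)

is.array(iris3)      # iris3 is a plain 3d array...
is.data.frame(iris)  # ...whereas iris is a data frame
```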

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] strange matrix behaviour: is there a matrix with one row?

2006-04-07 Thread François Pinard

[EMAIL PROTECTED]

> y <- matrix(1:8, ncol=2)
> is.matrix(y[-c(1,2),])
[1] TRUE
> is.matrix(y[-c(1,2,3),])
[1] FALSE
> is.matrix(y[-c(1,2,3,4),])
[1] TRUE

It seems like an inconsistent behaviour:
- with 2 or more rows we have a matrix
- with 1 row we do not have a matrix and
- with 0 rows we have a matrix again

?'[' explains it.  Using your example:

> is.matrix(y[-c(1, 2), , drop=FALSE])
[1] TRUE

> is.matrix(y[-c(1, 2, 3), , drop=FALSE])
[1] TRUE

> is.matrix(y[-c(1, 2, 3, 4), , drop=FALSE])
[1] TRUE
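
A short sketch of the underlying rule, with illustrative variable names: by default, `[` drops any dimension of extent 1, so only the one-row case degenerates into a vector; a zero-row result has no extent-1 dimension and stays a matrix.

```r
y <- matrix(1:8, ncol = 2)

one_row_default <- y[-c(1, 2, 3), ]                # extent-1 dimension dropped
one_row_kept    <- y[-c(1, 2, 3), , drop = FALSE]  # stays a 1 x 2 matrix
zero_rows       <- y[-c(1, 2, 3, 4), ]             # 0 x 2, nothing to drop

dim(one_row_default)  # NULL: a plain vector
dim(one_row_kept)
dim(zero_rows)
```

Using drop = FALSE systematically in code that must handle any number of rows avoids the special case.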

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] argv[0] --- again

2006-04-03 Thread François Pinard

[ivo welch]
how about people on [...] linux or unix [...]

See ?commandArgs.
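
A hedged sketch of how this may be used, assuming a reasonably recent R where scripts are started through Rscript (which passes the script as a --file= argument); the variable names are illustrative only.

```r
# Full command line, including the name of the R executable itself.
args <- commandArgs(trailingOnly = FALSE)
executable <- args[1]

# When started as "Rscript myscript.R", the script path shows up
# as a "--file=..." argument; otherwise there is none.
file_arg <- grep("^--file=", args, value = TRUE)
script <- if (length(file_arg) > 0) {
  sub("^--file=", "", file_arg[1])
} else {
  NA_character_
}

# Only the arguments meant for the script itself (those after --args).
user_args <- commandArgs(trailingOnly = TRUE)
```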

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Remove [1] ... from output

2006-03-28 Thread François Pinard

[Gregor Gorjanc]

I am writing some numbers and character vectors to an ascii file and
would like to get rid of "[1] ..." as shown below (a dummy example)

R> runif(20)
 [1] 0.653574 0.164053 0.036031 0.127208 0.134274 0.103252 0.506480 0.547759
 [9] 0.912421 0.584382 0.987208 0.996846 0.666760 0.053637 0.327590 0.370737
[17] 0.505706 0.412316 0.887421 0.812151

I have managed to work up to remove quotes and all "[*]" except "[1]" as
shown below.

R> print(paste(runif(20), collapse = " "), quote = FALSE)
[1] 0.790620362851769 0.45603066496551 0.563822037540376
0.812907998682931 0.726162418723106 0.37031230609864 0.681147597497329
0.29929908295162 0.209858040558174 0.304300333140418 0.105796672869474
0.743657597573474 0.409294542623684 0.825012607965618 0.282235795632005
0.21159387845546 0.620056127430871 0.337449935730547 0.754527133889496
0.280175548279658

Any hints how to solve my task?

You may use cat instead of print.  No need to paste then.
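
For instance, a minimal sketch (the file name comes from tempfile() and is purely illustrative):

```r
set.seed(1)
x <- runif(5)

# All values on one line, separated by spaces, with no "[1]" prefix.
cat(x, "\n")

# One value per line.
cat(x, sep = "\n")

# Straight into a file, then terminate the line.
out_file <- tempfile(fileext = ".txt")
cat(x, file = out_file, sep = " ")
cat("\n", file = out_file, append = TRUE)
```

cat() accepts character vectors just as well, and writes them without quotes.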

-- 
François Pinard   http://pinard.progiciels-bpi.ca


[R] Replies on this list [was: removing NA from a data frame]

2006-03-20 Thread François Pinard

[Berton Gunter]
[Sam Steingold]

 PPS. how do I figure out the number of rows in a data.frame?
  is length(attr(X,"row.names")) the right way?

help.search("number of rows") immediately gets you your answer!

Hi, people.  Here, I get:

  Help files with alias or concept or title matching ‘number of rows’
  using fuzzy matching:

  nrow(base)  The Number of Rows/Columns of an Array

and '?nrow' says that it is meant for arrays: nothing about data.frame, and 
not a generic method either.  Even if it was a class method, we should 
not expect a new user to be very familiar with R (both!) class systems 
from the start.

What might a new user think, reading the documentation?   Sam Steingold 
is surely an experienced and competent computer guy.  He might guess, 
who knows, that some automatic array to data.frame conversion occurs 
(however inefficient it could be).  Yet this would not match other 
knowledge nor experimentation, as a data.frame is hardly an array:

  > x = data.frame(a=1:3, b=c(TRUE, TRUE, FALSE), c=letters[1:3])
  > as.array(x)
  Error in `dimnames<-.data.frame`(`*tmp*`, value = list(c("a", "b", "c" :
  invalid 'dimnames' given for data frame

Although help.search("number of rows") provides an answer that happens to 
be right, it might not be recognised as such by an intelligent reader, 
and so, it is not really satisfactory.  The documentation for nrow 
could be improved by saying that it applies to any kind of structure for 
which dim() is meaningful.  And even then, ?dim is silent about data 
frames.  One clue (yet a pretty weak one) that nrow may be applied to 
a data.frame comes from the fact that ?dim.data.frame lists the same 
documentation as ?dim.
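
As a quick check, sketched on the same small data frame as above, nrow() and dim() do work on a data.frame, and agree with the attribute-based count from the original question:

```r
x <- data.frame(a = 1:3, b = c(TRUE, TRUE, FALSE), c = letters[1:3])

nrow(x)                        # number of rows
ncol(x)                        # number of columns
dim(x)                         # both at once
length(attr(x, "row.names"))   # works too, but nrow() says it better
```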


Why do I say all this?  Because it happens, not necessarily in this 
case, but a bit too often nevertheless, that answers given to users are 
uselessly harsh or haughty.  Especially when they imply that the 
documentation is perfect.  One problem is that some people enjoy reading 
such replies.  As an example of this strange kind of pleasure, here is 
an excerpt from the R archives, which I find especially enlightening on 
the mentality of a few members:

  From: [EMAIL PROTECTED] (Steve Wisdom)
  Date: 2003-12-26 17:04
  Subject: [R] re| Dr Ward on List protocol 

  Andrew C. Ward [EMAIL PROTECTED] :

  With respect to 'tone' and 'friendliness', perhaps all that is meant or
  needed is that people be polite and respectful.

  I shake my head as often at rude answers

  Oh, by gosh, by golly.

  I don't think an occasional dose of 'real life', via a jab from the
  Professor, will cause any lasting harm to the cosseted & emolumated students
  and academics on the List.

  On a Wall St trading desk, for example, every day one is kicked in the head
  more brutally by clients, superiors, counterparts, the markets & etc, than
  ever one would be by the Professor.

  Plus, the Professor's jabs are good Schadenfreudic fun for the rest of us.

  Regards,

  Steve Wisdom
  Westport CT US

The truth is that not everybody around here is "cosseted & emolumated 
students and academics".  Moreover, behaviour at trading desks is fully 
irrelevant, and for most of us, this is not the kind of life we chose to 
live.  Wrong behaviour elsewhere is hardly an excuse for not behaving 
properly, here.

Moreover, what is mere good fun for some may be perceived as highly 
inelegant by others.  While some competent members may inspire 
admiration and charism by their knowledge and dedication, they sometimes
damage beyond repair what they inspire, when showing poor humanity.

I'm aware of the constant fear some have of seeing this list abused.  
There are ways for not being abused, which do not require becoming 
abusive ourselves.  We should deepen such ways in our own habits.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] error function

2006-03-06 Thread François Pinard

[Kjetil Brinchmann Halvorsen]

erf [in] package (CRAN) NORMT3, as help.search("error function") could 
have told [you]

It does not for me.  I would presume one needs NORMT3 installed first, 
and NORMT3 is seemingly not part of standard base R installation.
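
For real arguments, no extra package is strictly needed: the error function can be written in terms of pnorm(), an identity that ?pnorm itself mentions.  A minimal sketch, which does not cover the complex-argument case NORMT3 is meant for:

```r
# Error function and its complement, through the normal distribution.
erf  <- function(x) 2 * pnorm(x * sqrt(2)) - 1
erfc <- function(x) 2 * pnorm(x * sqrt(2), lower.tail = FALSE)

erf(c(-1, 0, 0.5, 1))
erf(1) + erfc(1)   # the two always sum to 1
```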

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Remove gray grid from levelplot

2006-03-06 Thread François Pinard

[Brian Ripley]
On Mon, 6 Mar 2006, Martin Sandiford wrote:

[...]

 P.S. To me, the png() device does not appear to do sub-pixel
 rendering.  The postscript() and pdf() devices do.

What could you possibly mean by that?

I would think the original poster refers to aliasing issues.

The png device writes on a bitmap.  It outputs a rectangular grid of 
either pre-defined colour indices or RGB values.  There is nothing in 
the PNG standard to allow anything finer.

Granted.  Yet, there are nuances.  Anti-aliasing techniques may be 
applied to bit-mapped images like PNGs, and a carefully computed alpha 
channel could be included in the PNG as a way to acknowledge sub-pixel 
rendering matters.

If the background of the generated image is opaque instead of 
transparent, the graphics and the background might be combined at PNG 
generation, swallowing what would have been an alpha channel and so, 
sparing the need of including any in the generated PNG.

However, on this Linux system, if I understood correctly, R goes through 
X11 for generating PNGs, and so, does no better than X11 itself (at 
least as currently driven by R) in the area of anti-aliasing.

Anti-aliasing libraries exist (which I never really studied or used 
myself) that could likely provide better PNG quality.  Has some decision 
been reached among developers on this topic?  I would guess, without 
really knowing, that developers favor vector-to-raster rendering to be 
done outside R, whenever quality is required.

Using an anti-aliasing library for higher output quality within R would 
mean, besides the obvious trouble of selecting one of those libraries 
and programming the interface, adding yet another dependency at 
R build-time (likely autoconfigured, of course), and an observable 
slowdown for graphics which are more heavily loaded, especially in 
interactive mode.  For one, I do not need more than draft quality so far 
when using R interactively for plots.  Maybe some "draft", "quality" or 
"aa" flag could be added to control anti-aliasing behaviour? (I know that 
"quality" is already used to mean something else for JPEG images).

  Just a few thoughts.  Keep happy, all!

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Newton-Raphson with analytical derivatives

2006-03-03 Thread François Pinard

[m p]

Can someone point to a package with Newton-Raphson method using 
analytic derivatives. This is just for a 1D problem. Could not find 
easily anything suitable.

You may check:

  http://pinard.progiciels-bpi.ca/plaisirs/animations/NRart/R/nr.image.R

If you remove the graphics-related lines, the few remaining lines are
Newton-Raphson over an expression.  Don't take this too seriously, I was 
merely toying with this to get an initial feel of the R language.  :-)
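
Independently of that file, here is a minimal sketch of 1D Newton-Raphson with an analytic derivative, obtained symbolically through D().  The function name newton_raphson and its arguments are made up for this example; it is not the code found at the URL above.

```r
# Newton-Raphson on an expression in x, with the derivative
# computed symbolically by D() rather than numerically.
newton_raphson <- function(expr, x0, tol = 1e-10, max_iter = 100) {
  dexpr <- D(expr, "x")
  x <- x0
  for (i in seq_len(max_iter)) {
    fx  <- eval(expr,  list(x = x))
    dfx <- eval(dexpr, list(x = x))
    if (dfx == 0) stop("zero derivative at x = ", x)
    x_new <- x - fx / dfx
    if (abs(x_new - x) < tol) return(x_new)
    x <- x_new
  }
  warning("no convergence after ", max_iter, " iterations")
  x
}

root <- newton_raphson(quote(x^2 - 2), x0 = 1)
root
```

As usual with this method, convergence depends on the starting point, so x0 should be chosen reasonably close to the root sought.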

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] prehistoric versions of R -- 1995!

2006-02-09 Thread François Pinard

[Martin Maechler]

2) The oldest stuff that I have is all from 1995;

Mailing lists seem to go back into 1995 too.  I found a few messages 
from around 1994 on topics to be later found within R, but I'm not sure 
where I got these old messages from.  I did find a message really 
related to R-pre-alpha, which itself quotes a message written in 1994.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] qqplot

2006-02-07 Thread François Pinard

[Vincent Negre]

[...] I do not understand how qqplot() compute quantiles.

Just type ``qqplot`` (without the parentheses) at the R prompt, to see 
the source code.  ``qqplot`` does not especially compute quantiles, 
which are rather obtained directly through sorting its arguments.
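
This can be checked without even plotting.  In the sketch below (made-up data, two samples of equal length), the coordinates returned by qqplot() are just the sorted inputs; when lengths differ, the longer sample is interpolated down to the length of the shorter one.

```r
set.seed(42)
x <- rnorm(50)
y <- rexp(50)

# plot.it = FALSE only returns the coordinates that would be plotted.
qq <- qqplot(x, y, plot.it = FALSE)

same_x <- isTRUE(all.equal(qq$x, sort(x)))
same_y <- isTRUE(all.equal(qq$y, sort(y)))
c(same_x, same_y)
```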

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Difficulty with qqline in logarithmic context

2006-02-03 Thread François Pinard

[Brian Ripley]
Is there a good reason to use qqnorm in a single-log context?

Yes.  Googling around reveals this is not so uncommon.

 Should one not rather use

qqnorm(log(freq))
qqline(log(freq))

In the display produced by qqnorm, the y-axis would then show 
log(value) labels, while the user (me!) expects value labels.

since you are (I guess) looking at log-normality of freq?

Once again, I was merely toying with qqplot.  I found it intriguing that, 
while shuffling messages around between folders, for a good while, the 
distribution of log(number of messages) per folder appears vaguely 
normal, as I do not quickly see a reasonable justification for this.

Another way to look at that is

qqplot(qlnorm(ppoints(length(freq))), freq, log="xy")

the same plot, different scales.

Interesting, thanks for teaching me about ppoints.  Yet, I stay more 
happy with the abscissa scale produced by qqnorm.  Besides, how would 
one use qqline with the above?

(I believe a QQ plot should always have comparable scales on the two 
axes.)

While comparable scales are somewhat simpler to compare, this is not 
necessarily what is most adequate for the user.  Proof is that while 
quantiles are being compared here, scales do not show quantiles, but 
units as meaningful to the user.  One might want to compare variables 
scaled very differently, maybe because of different units from the same 
distribution, or from different but similar distributions using 
different scales and shifted to different means.  Or even, why not, if 
this is what is meaningful for users, a log scale.

The point is that qqline is tied to normality, not to log-normality.

As it stands, yes.  As a convenience, it could be extended (probably 
easily) to log-normality.  qqnorm already does something sensible in 
log-context, so a user might expect qqline to do equally well.

The real point might be that qqline is tied to abline a bit too 
blindly.  What is the meaning of intercept and slope of a straight line 
on a graphic in log context?  First, the intercept might not even exist.  
Second, abline interpretation depends on the clipping, and possibly 
on the extrema of the pretty breakpoints chosen for scales, which makes 
it hard to predict in average use.   There ought to be some reason for the 
log-aware code in abline, yet I did not find documentation for it.

The wisest for abline, in my very humble opinion, would be for it to 
complain if ever called in log context.  Then, qqline would indirectly 
complain through abline, if qqline is not modified to do something 
more proper.  Moreover, if it is definitely out of question that 
qqline be ever meaningfully called in log context, then so qqnorm, 
which should then complain as well.

Currently, qqline misbehaves, in that it silently produces 
a meaningless result, while it could either diagnose that the result is 
meaningless, or produce a meaningful result.



[R] Difficulty with qqline in logarithmic context

2006-02-01 Thread François Pinard

Hi, R friends.  I had some difficulty with the following code:

   qqnorm(freq, log='y')
   qqline(freq)

as the line drawn was seemingly random.  The exact data I used appears 
below.  After wandering a bit within the source code for abline, 
I figured out I should rather write:

   qqnorm(freq, log='y')
   par(ylog=FALSE)
   qqline(log10(freq))
   par(ylog=TRUE)

I'm proposing that this little stunt rather be hidden and 
automatically effected within qqline proper, whenever par('ylog') is 
TRUE.  I thought about providing a patch, as qqline is so small.  Yet 
it would be more noise than useful, as I'm not familiar with the datax 
argument usage, which should probably be addressed as well.
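
A rough sketch of what such a log-aware line might compute, under assumptions of mine: the helper name qq_log_line is hypothetical, it ignores datax altogether, and it mimics qqline only in that the line goes through the first and third quartiles.  The line is fitted on log10(y), then drawn back in data coordinates with lines(), which sidesteps abline and the par(ylog=...) juggling.

```r
# Intercept and slope, in log10 units, of the line through the
# quartiles of log10(y) against the matching normal quantiles.
qq_log_line <- function(y, probs = c(0.25, 0.75)) {
  ly <- quantile(log10(y), probs, names = FALSE)
  x <- qnorm(probs)
  slope <- diff(ly) / diff(x)
  intercept <- ly[1] - slope * x[1]
  c(intercept = intercept, slope = slope)
}

# Illustrative sample, log-normal by construction (not the data below).
y <- 10^qnorm(ppoints(99))
coefs <- qq_log_line(y)

if (interactive()) {
  qqnorm(y, log = "y")
  xs <- seq(-3, 3, length.out = 50)
  # Back-transform so the straight line appears on the log-scaled axis.
  lines(xs, 10^(coefs[["intercept"]] + coefs[["slope"]] * xs))
}
```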



Here is the data, in case useful:

freq <-
as.integer(c(33, 79, 21, 436, 58, 18, 1106, 498, 1567, 393, 2, 
104, 50, 67, 113, 76, 327, 331, 196, 145, 86, 59, 12, 215, 293, 
154, 500, 314, 246, 587, 85, 23, 323, 3, 13, 576, 29, 37, 24, 
21, 1230, 137, 13, 93, 3, 101, 72, 218, 59, 17, 2, 8, 86, 143, 
150, 22, 19, 234, 119, 157, 4, 255, 146, 126, 76, 15, 271, 170, 
4, 6, 16, 3048, 2175, 3350, 5017, 5706, 1610, 665, 322, 1, 16, 
47, 51, 168, 94, 66, 154, 99, 11, 547, 953, 1, 1071, 80, 184, 
168, 52, 187, 103, 187, 361, 46, 85, 135, 597, 121, 283, 26, 
12, 20, 169, 9, 79, 15, 114, 75, 30, 111, 556, 173, 32, 99, 438, 
2, 2, 1, 117, 5, 3, 51, 8, 41, 12, 23, 2, 13, 5, 1, 9, 4, 1, 
7, 15, 5, 48, 16, 112, 6, 1, 39, 60, 5, 23, 5, 19, 1, 8, 32, 
4, 13, 1, 14, 71, 5, 1, 35, 30, 100, 389, 22, 8, 1, 192, 40, 
6, 3, 17, 2, 14, 71, 14, 1, 5, 4, 32, 21, 18, 13, 2, 2, 45, 342, 
46, 144, 18, 131, 188, 112, 37, 85, 90, 8, 195, 173, 5, 53, 96, 
37, 16, 16, 281, 64, 50, 92, 336, 31, 744, 4, 134, 74, 1, 227, 
6, 48, 418, 64, 66, 59, 20, 45, 20, 370, 148, 22, 7, 30, 601, 
29, 82, 113, 938, 252, 65, 137, 72, 22, 98, 12, 152, 212, 13, 
8, 35, 3, 77))

Yet this really is the value of courriel$freq after data(courriel), 
with a file .../R/data/courriel.R here, holding:

courriel <- read.table(pipe('grep -c \'^From \' ../courriel/*'),
   sep=':', as.is=TRUE, row.names=1,
   col.names=c('fichier', 'freq'))

My goal, which is nothing serious, was merely to toy with the number of 
messages per folder, for folders massaged out of R archives.



Version:
 platform = i686-pc-linux-gnu
 arch = i686
 os = linux-gnu
 system = i686, linux-gnu
 status = 
 major = 2
 minor = 2.1
 year = 2005
 month = 12
 day = 20
 svn rev = 36812
 language = R

Locale:
LC_CTYPE=fr_CA.UTF-8;LC_NUMERIC=C;LC_TIME=fr_CA.UTF-8;LC_COLLATE=fr_CA.UTF-8;LC_MONETARY=fr_CA.UTF-8;LC_MESSAGES=fr_CA.UTF-8;LC_PAPER=C;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C;LC_IDENTIFICATION=C

Search Path:
 .GlobalEnv, package:methods, package:stats, package:graphics, 
package:grDevices, package:utils, package:datasets, fp.etc, Autoloads, 
package:base


-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] R-help Digest, Vol 35, Issue 24

2006-01-25 Thread François Pinard

[Gabor Grothendieck]

[...] this list is inhabited by some rather rude participants but
everyone puts up with them in the hope that they do have some useful
remarks.

I've been witnessing this list for about one year, and also read *lots* 
of archived messages.  While it is true that a few members do not use 
white gloves, are rather fond of concise replies, and do express strong 
opinions at times, they never went overboard insulting people and always 
kept a reasonable measure, at least as far as I could see (yet who 
knows, outliers might happen! :-).

(*) Our whole society is a bit shy and shivers easily when opinions are 
expressed nowadays.  I have often observed that people quickly get insecure,
feel attacked, and overreact (by running away or starting a fight).

there is even a group of thought that feels it is a justifiable way to
keep the list volume under control.

This may work because of the starred paragraph above, that is, for wrong 
reasons.  Best is, and this often occurs on the R list, when everything 
(facts, opinions) is being shared efficiently, without useless arguing.  
Then, threads quickly fade out.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] R-help Digest, Vol 35, Issue 24

2006-01-24 Thread François Pinard

[EMAIL PROTECTED], addressing Brian Ripley]

First of all, unless you are an english professor, then I do not think
you have any business policing language.

We all make mistakes (English or otherwise).  I'm very grateful that 
people forgive my own errors, and I try to be tolerant of others.  (Yet, 
it happens that people lacking good will ask for stronger reactions.)

This is the business of everybody, really, building a better community 
in every possible aspect, and the means for this go through interaction 
and collaboration.  Let's all be humble enough to ponder the criticism 
of others, improve ourselves, and so increase the value of our share.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] [Rd] Display an Image on a Plane

2006-01-20 Thread François Pinard

[Barry Rowlingson]
[Ben Bolker]
 [Labbe, Vincent]

I am new to R and I would like to display an image on a plane in a
3D plot, i.e. I would like to be able to specify a theta and a phi
parameters like in the function persp to display a 2D image on an
inclined plane.

 what do you mean by "image" exactly?

I think once you get into doing fancy visualisations like this then you
may find a solution outside of R. [good references deleted]

Bonjour, Vincent.

I'm not fully sure I understand your request; what I get is that you 
want to transform an image on a plane as if one was looking at it in 
space, from an angle.   If I had this problem, I would probably produce 
the image using regular R machinery for this like png() or postscript(), 
then interactively process the result within Gimp, using trapezoidal 
deformations (I think they call it "Perspective" transformation).  For 
example, I used this simple trick in the following picture:

   http://pinard.progiciels-bpi.ca/plaisirs/dessins/cd-back.jpg

for the KWIC listing being part of the composition.  However, if 
I needed a precise phi and theta for transformations beyond what 
trans3d() can offer, I would likely use Python or R for computing the 
projection of the rectangle enclosing the image, then PIL (Python 
Imaging Library) for producing that precise trapezoidal deformation.  
Just sharing ideas, of course.  Most likely, if I knew R better, 
I would use it more fully -- but that's a tautology! :-)
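
On the R side, the projection of the enclosing rectangle might be sketched as follows.  The unit square, the angles and the throw-away pdf(NULL) device are assumptions of this example; persp() is used only for the viewing matrix it returns, and trans3d() then maps the four corners to 2D.

```r
# persp() needs an open device; pdf(NULL) creates no file.
pdf(NULL)
pmat <- persp(x = c(0, 1), y = c(0, 1), z = matrix(0, 2, 2),
              zlim = c(-1, 1), theta = 30, phi = 30)

# Project the corners of the unit square lying in the z = 0 plane.
corners <- trans3d(x = c(0, 1, 1, 0), y = c(0, 0, 1, 1), z = 0,
                   pmat = pmat)
invisible(dev.off())

print(as.data.frame(corners))
```

The four projected points describe the quadrilateral onto which an external tool would then have to warp the image.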

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Dynamic Programming in R

2006-01-19 Thread François Pinard

[Arnab mukherji]

A concern that has been cited that may discourage R use for solving
dynamic programs is its memory handling abilities.

For a dynamic programming problem defined over N steps, one usually 
needs an N*N matrix, so problems should be tractable as long as N is 
not too big.  In those I studied, CPU time was usually the scarce 
resource.  As extreme paths were known to be very unlikely, this (and 
memory as well) could be alleviated somewhat by limiting the solution 
search to bands (more or less wide) following the diagonal of the 
solution matrix.  I also had some success in splitting big problems 
into a sequence of smaller subproblems, recursively; such 
approximations are likely not acceptable in the general case.

I would guess that most dynamic programming problems have their own 
specific artifacts and speed-up techniques, so a universal solution 
might not be easy.  Who knows (I'm not sure): R might well offer 
a powerful environment for building a dynamic programming framework.
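
The band idea can be sketched as follows.  This is only an illustration
on one particular dynamic program (aligning two numeric sequences, in
the style of dynamic time warping), which is an assumption of this
example and not the kind of problem discussed in the thread; banded_cost
and its band argument are made-up names.  Cells farther than band from
the diagonal are never computed.

```r
# Hypothetical example: alignment cost of two numeric sequences, with
# the search limited to a band around the diagonal.
banded_cost <- function(a, b, band) {
    n <- length(a)
    m <- length(b)
    # cost[i + 1, j + 1] is the best cost of aligning a[1:i] with b[1:j];
    # cells outside the band keep the value Inf.
    cost <- matrix(Inf, n + 1, m + 1)
    cost[1, 1] <- 0
    for (i in seq_len(n)) {
        lo <- max(1, i - band)
        hi <- min(m, i + band)
        if (lo > hi) next
        for (j in lo:hi) {
            step <- abs(a[i] - b[j])
            cost[i + 1, j + 1] <- step +
                min(cost[i, j], cost[i, j + 1], cost[i + 1, j])
        }
    }
    cost[n + 1, m + 1]
}

a <- c(1, 3, 4, 9)
b <- c(1, 3, 7, 8, 9)
full <- banded_cost(a, b, band = 10)    # band wide enough: full search
narrow <- banded_cost(a, b, band = 1)   # only cells near the diagonal
```

For simplicity the sketch still allocates the full matrix, so it saves
time only; storing just the band would be needed to save memory too.
A band narrower than the difference in lengths cannot reach the final
cell, and the result is then Inf.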

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-09 Thread François Pinard

[hadley wickham]

[François Pinard]

 Selecting a sample is easy.  Yet, I'm not aware of any SQL device for
 easily selecting a _random_ sample of the records of a given table.
 On the other hand, I'm no SQL specialist, others might know better.

There are a number of such devices, which tend to be rather SQL variant
specific.  Try googling for "select random rows mysql", "select random
rows pgsql", etc.

Thanks as well for these hints.  Googling around as you suggested (yet 
keeping my eyes in the MySQL direction, because this is what we use), 
getting MySQL itself to do the selection looks a bit discouraging: 
according to the comments I've read, MySQL does not seem to scale well 
with the database size, especially when records have to be decorated 
with random numbers and later sorted.

Yet, I did not run any benchmark myself, and would not blindly take 
everything I read for granted, given that MySQL developers have speed in 
mind, and there are ways to interrupt a sort before running it to full 
completion, when only a few sorted records are wanted.

Another possibility is to generate a large table of randomly
distributed ids and then use that (with randomly generated limits) to
select the appropriate number of records.

I'm not sure I understand your idea (what confuses me is the "randomly 
generated limits" part).  If the large table is much larger than the 
size of the wanted sample, we might not be gaining much.

Just for fun: here, sample(1, 10) in R is slowish already :-).

All in all, if I ever have such a problem, a practical solution probably 
has to be outside of R, and maybe outside SQL as well.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-08 Thread François Pinard

[Brian Ripley]
[François Pinard]
[Brian Ripley]

One problem [...] is that R's I/O is not line-oriented but
stream-oriented.  So selecting lines is not particularly easy in R.

I understand that you mean random access to lines, instead of random
selection of lines.

That was not my point. [...] Skipping lines you do not need will take 
longer than you might guess (based on some limited experience).

Thanks for telling (and also for the expression "reservoir sampling").
OK, then.  All summarized, if I ever need this for bigger datasets, 
selection might better be done outside of R.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-08 Thread François Pinard

[Martin Maechler]

FrPi Suppose the file (or tape) holds N records (N is not known
FrPi in advance), from which we want a sample of M records at
FrPi most. [...] If the algorithm is carefully designed, when
FrPi the last (N'th) record of the file will have been processed
FrPi this way, we may then have M records randomly selected from
FrPi N records, in such a way that each of the N records had an
FrPi equal probability to end up in the selection of M records.  I
FrPi may seek out for details if needed.

[...] I'm also intrigued about the details of the algorithm you
outline above.

I went into my old SPSS books and related references to find it for you, 
to no avail (yet I confess I did not try very hard).  I vaguely remember 
it was related to Spearman's correlation computation: I did find notes 
about the severe memory limitation of this computation, but nothing 
about the implemented workaround.  I did find other sampling devices, 
but not the very one I remember having read about, many years ago.

On the other hand, Googling tells that this topic has been much studied, 
and that Vitter's algorithm Z seems to be popular nowadays (even if not 
the simplest) because it is more efficient than others.  Google found 
a copy of the paper:

   http://www.cs.duke.edu/~jsv/Papers/Vit85.Reservoir.pdf

Here is an implementation for Postgres: 

   http://svr5.postgresql.org/pgsql-patches/2004-05/msg00319.php

yet I do not find it very readable -- but this is only an opinion: I'm 
rather demanding in the area of legibility, while many or most people 
are more courageous than me! :-).

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-08 Thread François Pinard

[hadley wickham]

 [...] according to the comments I've read, MySQL does not seem to scale
 well with the database size, especially when records have to be
 decorated with random numbers and later sorted.

With SQL there is always a way to do what you want quickly, but you
need to think carefully about what operations are most common in your
database.  For example, the problem is much easier if you can assume
that the rows are numbered sequentially from 1 to n.  This could be
enforced using a trigger whenever a record is added/deleted.  This would
slow insertions/deletions but speed selects.

Sure (to take a caricature example), if database records are already 
decorated with random numbers, and an index is built over the 
decoration, random sampling may indeed be done quicker :-). The fact is 
that (at least our) databases are not especially designed for random 
sampling, and people in charge would resist redesigning them merely 
because there would be a few needs for random sampling.

What would be ideal is being able to build random samples out of any big 
database or file, with equal ease.  The fact is that it's doable.  
(Brian Ripley points out that R textual I/O has too much overhead for 
being usable, so one should rather say, sadly: It's doable outside R.)

 Just for fun: here, sample(1, 10) in R is slowish already
 :-).

This is another example where greater knowledge of problem can yield
speed increases.  Here (where the number of selections is much smaller
than the total number of objects) you are better off generating 10
numbers with runif(10, 0, 100) and then checking that they are
unique

Of course, my remark about sample() is related to the previous 
discussion.  If sample(N, M) was more on the O(M) side than being on 
the O(N) side (both memory-wise and cpu-wise), it could be used for
preselecting which rows of a big database to include in a random sample, 
so building on your idea of using a set of IDs.  As the sample of 
M records will have to be processed in-memory by R anyway, computing 
a vector of M indices does not (or should not) increase complexity.

However, sample(N, M) is likely less usable for randomly sampling 
a database, if it is O(N) to start with.  About your suggestion of using 
runif and later checking uniqueness, sample() could well be 
implemented this way, when the arguments are proper.  The "greater 
knowledge of the problem" could be built right into the routine meant 
to solve it.  sample(N, M) could even know how to take advantage of 
some simplified case of a reservoir sampling technique :-).
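
A minimal sketch of the runif-and-check-uniqueness idea, assuming the
wanted sample size M is much smaller than N.  The name sample_sparse is
made up for this example; it is not how sample() is actually
implemented.  Its memory and time depend on M, not on N.

```r
# Hypothetical helper: draw m distinct indices in 1..n without ever
# building a vector of length n.
sample_sparse <- function(n, m) {
    stopifnot(m <= n)
    picked <- numeric(0)
    while (length(picked) < m) {
        # Draw only as many candidates as are still missing, then drop
        # any value already picked (or repeated within this draw).
        draw <- floor(runif(m - length(picked)) * n) + 1
        picked <- unique(c(picked, draw))
    }
    picked
}

set.seed(42)
ids <- sample_sparse(1e9, 10)
```

When M gets close to N, most candidates are rejected and the loop
becomes slow, so this shortcut only makes sense for the sparse case.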

 [...] a large table of randomly distributed ids [...] (with randomly
 generated limits) to select the appropriate number of records.

[...] a table of random numbers [...] pregenerated for you, you just
choose a starting and ending index.  It will be slow to generate the
table the first time, but then it will be fast.  It will also take up
quite a bit of space, but space is cheap (and time is not!)

Thanks for the explanation.

In the case under consideration here (random sampling of a big file or 
database), I would be tempted to guess that the time required for 
generating pseudo-random numbers is negligible when compared to the 
overall input/output time, so it might be that pregenerating randomized 
IDs is not worth the trouble.  Also, whenever the database size 
changes, the list of pregenerated IDs is no longer valid.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] A comment about R:

2006-01-08 Thread François Pinard

[Uwe Ligges]
François Pinard wrote:
[David Forrest]

[...] A few end-to-end tutorials on some interesting analyses would
be helpful.

I'm in the process of learning R.  While tutorials are undoubtedly
very useful, and understanding that working and studying methods vary
between individuals, what I (for one) would like to have is a fairly
complete reference manual to the library [...] organised by topics.

Have a look at  help.start() -- Search Engine  Keywords -- Section 
Keywords by Topic.

Yes, thanks.  This is quite in the spirit, or direction, of what I was 
proposing.  Is that resource exhaustive?  (I'm asking out of laziness, 
as it might take me several months to really check.)

One serious drawback (for me) is that it requires a heavyweight
browser to be used, with Javascript enabled.  I do not find this very
practical.  Another point is that the presentation, while useful, is
rather dry.  In another message, I suggested the Emacs Lisp Reference 
Manual as a good example of a fluid presentation of a voluminous 
library.  There might be some workable compromise between the current 
situation with R, even through the Keywords by Topic, and that 
fluidity.  (Wikis also have the drawback of requiring heavy machinery,
and the editor they force us into is usually unbearable.)

I may be back with this subject, but only in a good while.  I'm slowly 
building a kind of documentation plan I want (yet in French), as I learn 
R, and guess I may complete my base learning in one or two years from 
now (hoping I'll stay courageous enough).  If I then get something 
usable or shareable enough, I'll offer it -- because I like returning 
a little something for the nice tools given to me!  :-)

In any case, thanks for listening!

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] A comment about R:

2006-01-05 Thread François Pinard

[Jonathan Baron]

 [the current reference manual] is organised by library and, within 
 each library, by function name: this organisation means that the 
 manual is mainly used as a reference, or else, that it ought to be 
 studied from cover to cover, dauntingly.

I think that many search facilities are helpful here: [...]
help.search() [...] 2. RSiteSearch() [...]

Sure they are!  Yet, we do not all learn or work the same way.  Given
full choice, I prefer reading a reference to fishing for information,
as this tends to build stronger information nets within my brain :-).

I doubt that the sort of manual you describe is possible given the very
rapid growth of CRAN, and it would be really inadequate if it did not
include those packages.

The current reference manual does not cover CRAN, and even so, 
I would not be tempted to qualify it as inadequate (at least for 
the novice I am).  There seems to be a lot to know about R, initially 
as a language, and then, for learning to shuffle and organise data in 
preparation for later processing.  I would guess every new R user has to 
learn his way in there.  The current reference says a lot, but is big to 
grasp as it stands; its organisation is not as helpful as it could be 
for learning and retaining.

The kind of manual I described seems possible to me, because it could be
mechanically derived out of a plan, and the derivation mechanics could
diagnose what is being forgotten (this could even yield some "Unsorted
functions" chapter or appendix).  The mechanics could be made general
enough to accept glue text at appropriate places.  [Not completely
dissimilar to, for those who happen to remember it, the way C code was
mechanically derived out of Pascal, initially, for Knuth's TeX.]
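
The diagnosing part can be sketched in a few lines of R.  The plan
below is a made-up, tiny example (topic names mapped to function
names), and check_plan is a hypothetical helper; a real plan would
cover a whole package and carry glue text as well.

```r
# A made-up plan: topics mapped to the functions filed under them.
plan <- list(
    "String handling" = c("paste", "substr", "nchar"),
    "Input/Output"    = c("readLines", "writeLines", "no_such_function")
)

# Hypothetical helper: compare a plan against what a package offers.
check_plan <- function(plan, package = "base") {
    available <- ls(paste0("package:", package))
    planned <- unlist(plan, use.names = FALSE)
    list(unsorted = setdiff(available, planned),   # forgotten by the plan
         unknown  = setdiff(planned, available))   # in the plan, not in R
}

report <- check_plan(plan)
report$unknown            # plan entries that do not exist in the package
length(report$unsorted)   # how many functions the plan has not filed yet
```

The unsorted element is what could feed an "Unsorted functions"
appendix until the plan is complete.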

Many of [CRAN packages] are designed for people in particular fields
and turn out to be extremely useful.

Undoubtedly!  I envy you all, who know already! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


[R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread François Pinard

[ronggui]

R is weak when handling large data files.  I have a data file: 807 vars,
118519 obs., in CSV format.  Stata can read it in 2 minutes, but on
my PC, R can hardly handle it.  My PC's CPU is 1.7G; RAM 512M.

Just (another) thought.  I used to use SPSS, many, many years ago, on 
CDC machines, where the CPU had limited memory and no kind of paging 
architecture.  Files did not need to be very large for being too large.

SPSS had a feature that was useful then: the capability of sampling 
a big dataset directly at file read time, well before processing 
starts.  Maybe something similar could help in R (that is, instead of 
reading the whole data in memory, _then_ sampling it.)

One can read records from a file, up to a preset amount of them.  If the 
file happens to contain more records than that preset number (the number 
of records in the whole file is not known beforehand), already read 
records may be dropped at random and replaced by other records coming 
from the file being read.  If the random selection algorithm is properly 
chosen, it can be made so that all records in the original file have 
equal probability of being kept in the final subset.

If such a sampling facility was built right within usual R reading 
routines (triggered by an extra argument, say), it could offer 
a compromise for processing large files, and also sometimes accelerate 
computations for big problems, even when memory is not at stake.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Suggestion for big files [was: Re: A comment about R:]

2006-01-05 Thread François Pinard

[Brian Ripley]

I rather thought that using a DBMS was standard practice in the 
R community for those using large datasets: it gets discussed rather 
often.

Indeed.  (I tried RMySQL even before speaking of R to my co-workers.)

Another possibility is to make use of the several DBMS interfaces already 
available for R.  It is very easy to pull in a sample from one of those, 
and surely keeping such large data files as ASCII is not good practice.

Selecting a sample is easy.  Yet, I'm not aware of any SQL device for 
easily selecting a _random_ sample of the records of a given table.  On 
the other hand, I'm no SQL specialist, others might know better.

We do not have a need yet for samples where I work, but if we ever need 
such, they will have to be random, or else, I will always fear biases.

One problem with Francois Pinard's suggestion (the credit has got lost) 
is that R's I/O is not line-oriented but stream-oriented.  So selecting 
lines is not particularly easy in R.

I understand that you mean random access to lines, instead of random 
selection of lines.  Once again, this chat comes out of reading someone 
else's problem; this is not a problem I actually have.  SPSS was not 
randomly accessing lines, as data files could well be held on magnetic 
tapes, where random access is not practical.  SPSS reads (or was 
reading) lines sequentially from beginning to end, and the _random_ 
sample is built while the reading goes.

Suppose the file (or tape) holds N records (N is not known in advance), 
from which we want a sample of M records at most.  If N <= M, then we 
use the whole file; no sampling is possible or necessary.  Otherwise, 
we first initialise M records with the first M records of the file.  
Then, for each record in the file after the M'th, the algorithm has to 
decide if the record just read will be discarded or if it will replace 
one of the M records already saved, and in the latter case, which of 
those records will be replaced.  If the algorithm is carefully designed, 
when the last (N'th) record of the file has been processed this way, 
we have M records randomly selected from N records, in such a way that 
each of the N records had an equal probability to end up in the 
selection of M records.  I may seek out details if needed.
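
One simple way to make the decision described above is the classic
reservoir scheme: the k'th record (k > M) replaces a uniformly chosen
saved record with probability M/k.  The sketch below assumes that
scheme, which is not necessarily what SPSS did; reservoir_sample is
a made-up name, and the connection must already be open so that
successive reads continue where the previous one stopped.

```r
# Hypothetical helper: keep at most m lines, uniformly at random, from
# an open connection whose number of lines is not known in advance.
reservoir_sample <- function(con, m, chunk = 1000L) {
    kept <- character(0)
    seen <- 0
    repeat {
        lines <- readLines(con, n = chunk)
        if (length(lines) == 0) break
        for (line in lines) {
            seen <- seen + 1
            if (seen <= m) {
                kept[seen] <- line           # fill the reservoir first
            } else {
                # Pick a slot in 1..seen; the new line is kept only if
                # the slot falls inside the reservoir.
                slot <- floor(runif(1) * seen) + 1
                if (slot <= m) kept[slot] <- line
            }
        }
    }
    kept
}

set.seed(1)
con <- textConnection(as.character(1:1000))   # stands in for a big file
chosen <- reservoir_sample(con, 5)
close(con)
```

For a real file, something like file(path, "r") would take the place
of the text connection.  Memory use depends on M and the chunk size
only, whatever N turns out to be.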

This is my suggestion, or in fact, more a thought than a suggestion.  It 
might represent something useful either for flat ASCII files or even for 
a stream of records coming out of a database, if those effectively do 
not offer ready random sampling devices.


P.S. - In the (rather unlikely, I admit) case the gang I'm part of would 
have the need described above, and if I then dared implementing it 
myself, would it be welcome?

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] A comment about R:

2006-01-04 Thread François Pinard

[David Forrest]

[...] A few end-to-end tutorials on some interesting analyses would be
helpful.

I'm in the process of learning R.  While tutorials are undoubtedly very 
useful, and understanding that working and studying methods vary between 
individuals, what I (for one) would like to have is a fairly complete 
reference manual to the library.

Of course, we already have one, and that's marvellous already.  Yet, it 
is organised by library and, within each library, by function name: this
organisation means that the manual is mainly used as a reference, or 
else, that it ought to be studied from cover to cover, dauntingly.

The very same material could be organised by topics.  Chapters could be 
named like General Help, Language features, Data types, Data 
Handling, Input/Output, Graphics, Statistics, and such.  The 
chapter Language features, to take one example, could hold sections 
like Expressions, Statements, Functions, Environments, 
Packages, Execution and Debugging.  Sections could then hold 
current reference pages.  References by library and/or by function name 
could be stated either in appendices or as a general index at the end.

For those who happen to know it, I find the Emacs Lisp Reference 
Manual to be a good example for organising, in a very usable way,
a comprehensive reference to a flurry of library functions.  When one 
needs string handling functions, they are likely grouped together in the 
manual, and are likely all present.  A tutorial, by comparison, usually 
presents a subset, or even a tiny subset, of what is available.

Any volunteers?

Not me, or at least, not before quite a long while.  The overall 
organisation of a reference should not be handled by beginners.  On the 
contrary, it rather requires someone who has comprehensive knowledge of 
all the material to be considered.

Just an idea.  A good work plan would be to establish a new structure 
for a reference manual, and once competent people (or this community as 
a whole) agree on a structure, to develop mechanical means for 
generating a reference manual out of the current material.  The 
mechanism should likely allow for added glue text, about everywhere 
reasonable, and for diagnosing any lone, unreachable page in the current 
reference.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] use of tapply?

2005-12-29 Thread François Pinard

[tom wright]

 I'm still learning how to program with R and I was hoping someone 
 could take the time to show me how I can rewrite this code?

I'll try! :-)

data.intersects<-data.frame(
    x=c(0.230,0.411,0.477,0.241,0.552,0.230),
    y=c(0.119,0.515,0.261,0.431,0.304,0.389),
    angle=vector(length=6),
    length=vector(length=6),
    row.names=c('tbr','trg','dbr','dbg','pbr','pbg'))


calcDist<-function(x,y){
    #calculates distance from origin (C)
    origin<-data.frame(x=0.34,y=0.36)
    dx<-origin$x-x
    dy<-origin$y-y

    length<-sqrt(dx^2+dy^2)
    angle<-asin(dy/length)
    return(list('length'=length,'angle'=angle))
}

for(iLoc in 1:length(data.intersects[,1])){
    result<-calcDist(data.intersects[iLoc,]$x,data.intersects[iLoc,]$y)
    data.intersects[iLoc,]$angle<-result$angle
    data.intersects[iLoc,]$length<-result$length
}

Using `di' instead of `data.intersects' for short:

di <- data.frame(x=c(0.230, 0.411, 0.477, 0.241, 0.552, 0.230),
                 y=c(0.119, 0.515, 0.261, 0.431, 0.304, 0.389),
                 row.names=c('tbr', 'trg', 'dbr', 'dbg', 'pbr', 'pbg'))
di.c <- with(di, data.frame(x=x-0.34,  y=y-0.36))
di$length <- with(di.c, sqrt(x^2 + y^2))
di$angle <- with(di.c, atan2(y, x))

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] HOW to Create Movies with R with repeated plot()?

2005-07-27 Thread François Pinard

[Jan Verbesselt]

 Is it possible to create a type of 'movie' in R based on the output
 of several figures (e.g., jpegs) via the plot() function.  I obtained
 dynamic results with the plotting function and would like to save
 these as a movie (e.g., avi or other formats)?

You may also peek at an actual example of using R for mini-movies:

   http://pinard.progiciels-bpi.ca/plaisirs/animations/index.html

I wrote this toy about the same week I started to learn R, and it was a
hell of a good exercise for the poor little me! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] R: to the power

2005-07-16 Thread François Pinard

[Thomas Lumley]

 It would be nice if R could realize that you meant the cube root
 of -8, but that requires either magical powers or complicated and
 unreliable heuristics.  The real solution might be a function like
 root(x,a,b) to compute x^(a/b), where a and b could then be exactly
 representable integers. If someone wants to write one...

While this could be done with moderate difficulty for the simpler cases,
one cannot reasonably ask R to be and do everything. :-)
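
For the simpler cases, a root(x, a, b) function of the kind suggested
above could look like the sketch below.  It assumes real arguments,
integer a and b, and a fraction a/b already in lowest terms; the
handling of negative x for odd b is this example's choice, not
anything R provides.

```r
# Hypothetical sketch of root(x, a, b), computing x^(a/b) over the
# reals; a and b are integers and a/b is assumed to be in lowest terms.
root <- function(x, a, b) {
    stopifnot(a == round(a), b == round(b), b != 0)
    if (b %% 2 == 0) {
        return(x^(a / b))      # even denominator: negative x gives NaN
    }
    # Odd denominator: take the root of the magnitude, restore the sign.
    sign(x)^a * abs(x)^(a / b)
}

cube_root <- root(-8, 1, 3)    # the case from the quoted message
even_root <- root(-4, 1, 2)    # no real answer
```

Results are still floating point numbers, so they are best compared
with all.equal() rather than with ==.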

So far, I see R more on the numerical side of things.  If you want
precise, exact solutions to various mathematical problems, you might
consider installing a Computer Algebra System on your machine, next to
R, for handling the symbolic side of things.

One such system which is both free and very capable might be Maxima.
Its convoluted story is rooted 40 years in the past.  Some may say it
lacks some chrome and be misled; don't be, the engine is pretty solid.
Peek at http://maxima.sourceforge.net if you think you need such a
beast.  Beware: to use it, you need either GCL or Clisp pre-installed.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] Rpy and RSPython

2005-06-10 Thread François Pinard

[Weiwei Shi]

 I am thinking to use one of them but not sure which one is better. I
 think Rpy cannot call python from R while the PRPython can in
 two-directional calling.  Am I right?

s/PRPython/RSPython/ ? :-)

This is also what I understood.  Yet, despite the uni-directionality of
RPy, this is what I chose for my personal usage (probably more handy to
use, or easier to install -- but the main point was that RPy is
guaranteed to be more stable and never crash!).  I think I recently read
somewhere that there were plans for dusting off RSPython, and then said
to myself: "Should re-evaluate once done."

Surely, for now, RPy is quite sufficient for my simple needs.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] R annoyances

2005-05-20 Thread François Pinard

[Barry Rowlingson]

 Even my great dream that R and Python eventually merge into the same
 language?  R gets Python's syntax and Object-oriented functions and
 Python gets access to all R's statistical functions?

R is more than a statistical library.  I'm coming to R with a strong
Python background, and first thought I would mainly use R through
Python.  But soon, the R language revealed a few interesting features
that Python does not offer, and which are very appropriate in R context.

For example, vectorisation is built-in (yet available on the Python
side through Numeric or Numarray extensions).  R also holds interesting
(useful and flexible) ideas about argument passing and matching, lazy
evaluation, and environments.  And surely other things as well.
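
A few lines of R showing the features just listed.  All function and
variable names here are made up for the illustration.

```r
# Vectorisation: arithmetic applies to whole vectors at once.
squares <- (1:5)^2

# Argument matching: arguments may be given by (partial) name, in any
# order, and a default may refer to another argument.
area <- function(width, height = width) width * height
by_name <- area(h = 3, w = 2)
square  <- area(4)

# Lazy evaluation: an argument that is never used is never evaluated.
twice <- function(x, unused) 2 * x
lazy <- twice(21, stop("this is never evaluated"))

# Environments: a function remembers the environment it was created in.
make_counter <- function() {
    count <- 0
    function() {
        count <<- count + 1
        count
    }
}
counter <- make_counter()
first <- counter()
second <- counter()
```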

-- 
François Pinard   http://pinard.progiciels-bpi.ca


Re: [R] R_LIBS difficulty ?

2005-04-14 Thread François Pinard

[Prof Brian Ripley]
 [François Pinard]

 Now using this line within `~/.Renviron':
   R_LIBS=/home/pinard/etc/R
 my tiny package is correctly found by R.  However, R does not seem to
 see any library within that directory if I rather use either of:
   R_LIBS=$HOME/etc/R
   R_LIBS=$HOME/etc/R

 Correct, and as documented.  See the description in ?Startup,
 which says things like ${foo-bar} are allowed but not $HOME, and
 not ${HOME}/bah or even ${HOME}.  But R_LIBS=~/etc/R will work in
 .Renviron since ~ is interpreted by R in paths.

Hello, Brian (or should I rather write Prof Ripley?).

Thanks for having replied.  I was not sure how to read "but not", which
could be associated either with "which says" or with "are allowed".  My
English knowledge is not fully solid, and I initially read that you
meant the latter, but it seems the former association is probably the
correct one.

The fact is that the documentation never says that `$HOME' or `${HOME}'
are forbidden.  It is rather silent on the subject, except maybe for
this sentence: "value is processed in a similar way to a Unix shell" in
the Details section, which vaguely but undoubtedly suggests that `$HOME'
and `${HOME}' might be allowed.  Using `~/' is not especially documented
either, except in the Examples section, where it is used.  I probably
thought it was an example of how shell-alike R processes `~/.Renviron'.

 The last writing (I mean, something similar) is suggested somewhere in
 the R manuals (but I do not have the manual with me right now to give
 the exact reference, I'm in another town).

 It is not mentioned in an R manual, but it is mentioned in the FAQ.

I tried checking in the FAQ.  By the way, http://www.r-project.org
presents a menu on the left, and there is a group of items under the
title `Documentation'.  `FAQs' is shown under that title, but is not
clickable.  I would presume it was meant to be?  However, the `Other'
item is itself clickable, and offers a link to what appears to be an
FAQs page.

The only thing I saw, in item 5.2 of the FAQ (How can add-on packages be
installed?) says that one may use `$HOME/' while defining `R_LIBS' in a
Bourne shell profile, or _preferably_ use `~/' while defining `R_LIBS'
within file `~/.Renviron'.  The FAQ does not really say that `$HOME' is
forbidden.  The FAQ then refers to `?Startup' for more information, and
`?Startup' is not clear on this thing, in my opinion at least.
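
When in doubt, one can ask R what it actually received.  The sketch
below is a small diagnostic, with check_r_libs being a made-up name: it
splits an R_LIBS value, applies the same `~' expansion R uses for
paths, and reports which directories exist.  It does not show how
`~/.Renviron' itself is parsed.

```r
# Hypothetical diagnostic: show how each entry of an R_LIBS value
# resolves, and whether the resulting directory exists.
check_r_libs <- function(value = Sys.getenv("R_LIBS")) {
    dirs <- strsplit(value, .Platform$path.sep, fixed = TRUE)[[1]]
    dirs <- dirs[nzchar(dirs)]
    expanded <- path.expand(dirs)    # expands a leading ~ only
    data.frame(dir = dirs, expanded = expanded,
               exists = file.exists(expanded),
               stringsAsFactors = FALSE)
}

check_r_libs()                   # what this session got, if anything
check_r_libs("$HOME/etc/R")      # a literal $HOME is left untouched
check_r_libs("~/etc/R")          # ~ is expanded before the check
```

Comparing this with the output of .libPaths() tells whether a
directory named in R_LIBS was actually taken into account.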

 R_LIBS=$HOME/etc/R will work in a shell (and R_LIBS=~/etc/R may not).

 Another hint that it could be expected to work is that the same
 `~/.Renviron' once contained the line:

   R_BROWSER=$HOME/bin/links

 which apparently worked as expected.  (This `links' script launches
 the real program with `-g' appended whenever `DISPLAY' is defined.)

 Yes, but that was not interpreted by R, rather a shell script called by R.

Granted, thanks for pointing this out.

The documentation does not really say either (or else I missed it) if
the value of R_BROWSER is given to exec, or given to an exec'ed shell.
If a shell is called, it means in particular that we can use options,
and this is a useful feature, worth being known I guess.

Once again, thanks for having replied, and for caring.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


[R] R_LIBS difficulty ?

2005-04-10 Thread François Pinard

Hi, R people.

I'm shy reporting this, as a bug in this area sounds very unlikely.  Did
I make my tests wrongly?  I'm still flaky at all this.  Let me dare
nevertheless, who knows, just in case...  Please don't kill me! :-)

Not so long ago, I wrote to this list:

 (For now, [the library code] works only for me when I do _not_ use `-l
 MY/OWN/LIBDIR' at `R CMD INSTALL' time, I surely made a simple blunder
 somewhere.  Hopefully, I'll figure it out.)

Now using this line within `~/.Renviron':

   R_LIBS=/home/pinard/etc/R

my tiny package is correctly found by R.  However, R does not seem to
see any library within that directory if I rather use either of:

   R_LIBS=$HOME/etc/R
   R_LIBS=$HOME/etc/R

The last writing (I mean, something similar) is suggested somewhere in
the R manuals (but I do not have the manual with me right now to give
the exact reference, I'm in another town).

Another hint that it could be expected to work is that the same
`~/.Renviron' once contained the line:

   R_BROWSER=$HOME/bin/links

which apparently worked as expected.  (This `links' script launches the
real program with `-g' appended whenever `DISPLAY' is defined.)

This is R 2.0.1, installed on SuSE 9.2.

-- 
François Pinard   http://pinard.progiciels-bpi.ca


[R] R-generated animation of a polynomiograph

2005-04-08 Thread François Pinard

Hi, people.

Two days ago, I sent to this list a little toy for exploring
polynomiographs (yet, the mathematical formulas were not polynomials
anymore, so the name is not really appropriate).

After studying R calls, expressions and functions a bit more, I gave
myself the homework of producing an animation out of my recent toy.  The
resulting animation, and also the sources, are available at:

http://pinard.progiciels-bpi.ca/plaisirs/nr-anim-01.html

P.S. - My intent was studying R, much more than producing art :-). I'm
sure anyone could do nicer, playing a few hours with this!

-- 
François Pinard   http://pinard.progiciels-bpi.ca


[R] Polynomiographic function in R :-)

2005-04-06 Thread François Pinard

Hi, people.  Nothing too serious in this message.  Nevertheless, all
criticism or advice is welcome :-).

Yesterday, I attended a talk by Bahman Kalantari (Rutgers
University) about Polynomiography (the Fine Art and Science of
Visualizing Polynomials).  Since I am just starting to learn R, I decided
to try using it for computing some (any!) polynomiograph.  I was
surprised at how easy and quick it was to get some results.  Then I
thought it would be interesting to extend the drawing to any function,
not necessarily a polynomial, yielding the function below:


polygraph <- function(expression, xrange=c(-1, 1), yrange=c(-1, 1),
                      points=200, steps=20, display=image)
{
    # Capture the unevaluated expression and its single variable.
    expression <- substitute(expression)
    variable <- all.vars(expression)
    stopifnot(length(variable) == 1)
    # Build the Newton-Raphson step:  x - f(x) / f'(x).
    derivative <- D(expression, variable)
    name <- as.name(variable)
    expression <- substitute(name - expression / derivative)
    # Start from a grid of complex numbers covering the given ranges.
    assign(variable, outer(seq(xrange[1], xrange[2], length=points),
                           seq(yrange[1], yrange[2], length=points) * 1i,
                           '+'))
    # Display the current state, then refine every grid point by one step.
    for (step in 1:steps) {
        display(Arg(eval(name)))
        assign(variable, eval(expression))
    }
}


which can be used this way, picking a function almost at random, say:


polygraph(x^3 - sqrt(x) - 1, points=300)
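
To see the symbolic preparation in isolation, here is a minimal sketch
for a single starting point.  It assumes a simpler function, x^3 - 1,
than the call above, and the names `f', `fprime' and `step' are picked
only for this illustration:

```r
f <- quote(x^3 - 1)           # expression whose roots are sought
fprime <- D(f, "x")           # symbolic derivative of f with respect to x

# Splice both parse trees into the Newton-Raphson update  x - f / f'.
step <- substitute(x - f / fprime, list(f = f, fprime = fprime))
print(step)

x <- complex(real = 1, imaginary = 1)   # arbitrary complex starting point
for (i in 1:50) x <- eval(step)         # each eval() performs one refinement
print(x)
```

No text is parsed or deparsed anywhere: `D' and `substitute' work
directly on the call objects.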


Here are a few random thoughts or remarks:

* Each pixel's colour shows which root the chosen root-finding algorithm
converges to when started from that particular point, or complex number.
Once fully converged, there should be only one colour per root.

* Another nice choice for `display' could be `filled.contour', yet it
computes more slowly.

* The successive plots (20 by default) show the progressive refinement
while finding equation roots, making a kind of animation.  One might
prefer moving the `display' call out of the loop, and show only the last
refinement.

* I did not know that root finding through Newton-Raphson extends so
simply to complex numbers, fun to see that it works! :-)

* The speaker told us that there are a _lot_ of root-finding algorithms,
and they may yield different styles of art.  I only picked the simplest
one to play with.  But you might do better!  (There are also many other
approaches than root finding for producing graphs out of polynomials.)

* Really, the one thing that most amused me in this experiment is how I
could use R for symbolically preparing the computation to do, without
resorting to parsing and deparsing (which I'm instinctively tempted to
avoid.)  I'm quite far from understanding all I should about functions,
expressions, calls and parse trees, but even knowing very little, it was
satisfying being able to rather quickly debug the above function.

* There are likely better ways than those I used.  For example, even if
unlikely, there might be clashes between the variables making up the
expression given, and local variables of the function.  I wonder if the
expression variable could have been more fully abstracted.

* Vectorisation worked surprisingly well on that problem, speed-wise.
However, because some regions of the plane converge faster than others
(use `display=plot' and such while calling `polygraph' to study this),
maybe there would be ways towards significant speed-ups.  But since it is
likely that one would lose a good part of vectorisability by doing
so, and add a lot of complexity (with unavoidable bugs in the process),
I wonder how worthwhile it would be in practice.

* Given a matrix of complex results, they should ideally be turned
into N groups, each group being related to one of the N roots of the
equation.  I tried producing factors out of these results, but numerical
approximation made that impractical.  I would guess that clustering,
which I do not know, may be seen as a way to produce factors fuzzily.

* As a counter-measure to the above difficulty, I used `Arg()' as a way
to produce levels out of the results.  Could have used `Im()' instead.
It seems that `Mod()' and `Re()' are less productive. `image' is kind
enough to turn those levels into colours without any effort from me!
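
On the grouping question raised above, one possible sketch, which is
not what `polygraph' does: when the function is a true polynomial, its
roots can be obtained from `polyroot', and each converged value can be
labelled with the index of its nearest root.  The polynomial x^3 - 1,
the grid size and the iteration count are assumptions chosen for the
illustration:

```r
# Newton iteration for z^3 - 1 over a small grid of complex starting points.
z <- outer(seq(-1, 1, length.out = 50),
           seq(-1, 1, length.out = 50) * 1i, "+")
for (i in 1:40) z <- z - (z^3 - 1) / (3 * z^2)

roots <- polyroot(c(-1, 0, 0, 1))   # coefficients in increasing order

# One column of distances per root, one row per grid point.
distance <- sapply(roots, function(r) Mod(z - r))

# Index of the nearest root for each point, reshaped like the grid.
label <- matrix(max.col(-distance, ties.method = "first"), nrow = nrow(z))
print(table(label))
# image(label) would then draw one flat colour per root.
```

This sidesteps the numerical fuzz, since values are compared with known
roots instead of with each other, but it does not apply to non-polynomial
expressions such as the one containing `sqrt' above.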


All in all, it is a fun way to explore R capabilities, and it also opens
up all kinds of ideas to toy with! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca


[R] R mailing list archive difficulty

2005-04-01 Thread François Pinard

Hi, people!  This is my first babble on this list, please be kind! :-)

Last Tuesday, I wrote to the (likely) Webmaster of the R site to report
a little problem, but also to ask for advice about how to get a bulk
copy of the mailing list archives, from 2002 to now.

While I quite understand that from Tuesday to now, there has been little
time, and it is only normal that I did not receive a reply yet, I dare to
resubmit the same question (message appended below) to this list,
in hope someone will reply sooner.  I see some bits of free time this
week-end, and would like to use them, if possible, for the tedious work
of getting those archives, so it sooner gets behind me, instead of ahead...

  Thanks to all.  Enjoy the spring! :-)

-- 
François Pinard   http://pinard.progiciels-bpi.ca
---BeginMessage---
Robert King, hello!

The page `http://www.r-project.org' gives your name as a contact.


The link `http://www.r-project.org/doc/FAQ/R-FAQ.html', near the end of
the page, labelled `Frequently Asked Questions', does not resolve, giving:

Not Found
The requested URL /doc/FAQ/R-FAQ.html was not found on this server.

Apache/1.3.26 Server at www.r-project.org Port 80


I would like to get hold of a copy of the R mailing list archives, for
local, off-line, progressive perusal (I find Web-based browsing of
email extremely inefficient).  So, I recursively got archives from
`ftp://ftp.stat.math.ethz.ch/Mail-archives/'.  The format used in these
files is quite usable locally.  However, the problem is that these
archives do not go beyond 2002.  (Maybe the `http://www.r-project.org'
Web page should mention this.)

Would you be kind enough to advise me of a (simple) way by which I
could get in bulk all R archives from 2003 up to now?  The simpler the
format, the better, of course, yet I feel ready to locally reformat HTML
if this is the only format you have at your end.


I'm discovering R with a lot of pleasure, and some fear as well :-).
There is in there a wholly impressive amount of work and knowledge.

  Thanks, and keep happy!

-- 
François Pinard   http://pinard.progiciels-bpi.ca
---End Message---
