Re: [Rd] Request: Documentation of formulae

2008-06-03 Thread Martin Maechler
 MP == Mike Prager [EMAIL PROTECTED]
 on Mon, 02 Jun 2008 16:29:16 -0400 writes:

MP Mike Prager [EMAIL PROTECTED] wrote:
 I was at a loss to understand the use of / until I looked in
 An Introduction [!] to R, where I found the explanation. 
 
 My request is that more complete material on model formulae be
 lifted from Introduction to R (or elsewhere) and put into the
 main online help files. 

MP I also request that R Core consider renaming "An Introduction to
MP R" to something like "R User's Guide". It spans 100 pages and
MP treats many topics far beyond the introductory level. I was
MP surprised at the wealth of information it contains, and I expect
MP that I would have checked it first, not last, among available
MP resources had it been more accurately named.

Hi Mike, you make very worthy suggestions; but I assume the word
"request" is really putting off almost all of us R corers.
You *have* heard that R is a volunteer project, and that much of its
development has happened in the unpaid time of core team members.
Why are you not using "suggestion" instead?

Also, since it's a volunteer project, and you are a capable R
user, you could even consider sending patches, e.g. to the 
formula.Rd  help page.

At last, to the renaming suggestion: I see your point, but

1) in an Intro one is allowed to be non-comprehensive,
   whereas a user's guide is supposed to touch on almost
   everything relevant.  That is not the case here, since R has
   evolved a great deal since "An Introduction to R" was compiled,
   and most new features are not mentioned in it.

2) I don't know your background, but in the mathematical sciences
   there exist quite a few comprehensive textbooks called "An
   Introduction to ...".
   One example from my bookshelf is
   @BOOK{AndT84,
     author =    {T. W. Anderson},
     title =     {An Introduction to Multivariate Statistical Analysis},
     publisher = {Wiley},
     address =   {NY},
     year =      1984
   }


3) An Introduction to R is mentioned in many many pieces of
   documentation about R; from instructions to students, to
   online guides, to books, etc.

   For this reason, a new title should really still contain the
   term.

   Maybe simply
An Introduction to R --- User's guide
?
   
Best regards,
Martin Maechler, ETH Zurich

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread pmc1
To reply to my own message, that function wasn't quite right. I think
this one works better:

signif.string <- function(signum, sigdigs){
  test <- abs(signum)
  left <- nchar(trunc(test))
  right <- nchar(test) - left - 1
  if (test < 1) {left <- left - 1}
  if (right < 0) {right <- 0}
  if (sigdigs < left) {out <- as.character(signif(signum, digits = sigdigs))}
  else if (sigdigs == left && trunc(signum) %% 10 == 0)
    {out <- paste(round(signum), ".", sep = "")}
  else if (sigdigs <= left + right) {out <- format(signum, digits = sigdigs)}
  else {out <- sprintf(paste("%.", sigdigs - left, "f", sep = ""), signum)}
  return(noquote(out))
}
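
(For illustration, a few calls with the function as reconstructed above --
a hypothetical session, expected output shown in comments:)

  signif.string(0.90, 2)     ## 0.90
  signif.string(18.423, 2)   ## 18
  signif.string(12345, 3)    ## 12300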

But it should still have error checking and vector capability, yadda
yadda. Also, I forgot what year it was, so sorry, Scott, for spamming
you with something you're hopefully not still stuck on.

Pat Carr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] significant digits (PR#9682)

2008-06-03 Thread pmc1
I came to report this same bug and found it already in the trash, but
I slightly disagree with that assessment. If it's not a bug, then
perhaps it's a feature request. Comments at the end.

On Mon, May 14, 2007, Duncan Murdoch wrote:
On 13/05/2007 8:46 PM, [EMAIL PROTECTED] wrote:

 In the example below round() does not report to the specified number of
 digits when the last digit to be reported is zero: Compare behaviour for
 0.897575 and 0.946251. Ditto for signif(). The number of sigfigs is
 ambiguous unless the reader knows this behaviour. Is this a bug or
 intended behaviour? Is there a work-around?

 It's not a bug.  It has nothing to do with round(), it is the way R
 prints numbers by default.  If you ask to print 0.90, you'll get

 [1] 0.9

 because 0.9 and 0.90 are the same number.  If you want trailing zeros to
 print, you need to specify a format to do that, e.g.

  noquote(format(0.9, nsmall=2))
 [1] 0.90

 The noquote stops the quotes ("") from printing.  You could also use sprintf() or
 formatC() for more C-like format specifications.

All of those options require you to specify the number of digits after
the decimal, don't they? Unless sprintf would default to the number of
decimal places passed to it, but it doesn't:

 sprintf("%f", signif(0.90, digits=2))
[1] "0.900000";

it defaults to 6. Although %g behaves differently,

 sprintf("%g", signif(0.90, digits=2))
[1] "0.9",

this clearly still isn't the desired behavior.

To continue that vein, the same issue with rounding versus printing
occurs with vectors:

   sapply(c(1:6),function(a){signif(c(18.423,0.90),digits=a)})
  [,1] [,2] [,3]  [,4]   [,5]   [,6]
 [1,] 20.0 18.0 18.4 18.42 18.423 18.423
 [2,]  0.9  0.9  0.9  0.90  0.900  0.900

Trying to get that and more complicated tables to print the correct
number of significant digits gets pretty hairy with sprintf(). I could
be wrong, but I would view the primary purpose of rounding to
significant digits as controlling the printed output of the number.
That there doesn't seem to be a way to do this without later having to
specify the number of decimal places would seem to render signif() as
it's written not particularly useful.

There are two solutions I can think of off the top of my head. The
first is to create a new data type of fixed-length real number, but
the easier way would be to have a function that returns a string,
something like this:

signif.string <- function(signum, sigdigs){
  left <- nchar(trunc(signum))
  right <- nchar(signum - trunc(signum)) - 2
  if (abs(signum) < 1 | signum < 0) {left <- left - 1}
  if (right < 0) {right <- 0}
  if (sigdigs < left) {return(as.character(signif(signum, digits = sigdigs)))}
  else if (sigdigs == left) {return(paste(round(signum), ".", sep = ""))}
  else if (sigdigs <= left + right) {return(format(signum, digits = sigdigs))}
  else {return(sprintf(paste("%.", sigdigs - left, "f", sep = ""), signum))}
}

This is just a skeleton that I think suits my needs for the moment and
might also cover the original poster's. One for production would need
to handle scientific notation and would probably want to fix the
rounding-of-5s behavior on Windows.

Pat Carr

version.string R version 2.7.0 (2008-04-22)

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] benchmarking R installations

2008-06-03 Thread Ludo Pagie
recently there was a post on R-help/R-devel (?) with this link on benchmarking 
different 'number crunching packages'. They used a series of tests, 
although I didn't check whether they used all the types you mentioned. I couldn't 
find the test code at first glance, but maybe it is available on request?


http://www.sciviews.org/benchmark/

Ludo





On Mon, 2 Jun 2008, Mark Kimpel wrote:


Recently I posted to this list with a question about using the Intel 10.1
compilers in building R and one response was basically, why in the heck
would you want to do that? The answer is that my sysadmin believes that
there will be a performance boost with the Intel vs. Gnu compilers on our
Linux cluster, of which I am one of many users. Wanting to be a good citizen
and use my machine time wisely, I'd of course like to use the right tool to
build the most efficient installation of R and associated packages. BTW, we
got R to compile nicely using the settings at the end of this post.

Looking back on previous posts, however, it seems that there is no consensus
as to how to benchmark R. I realize such a task is not trivial, nor
controversial, but does anyone have a set of time-consuming tasks that can
be used to compare R installations? It would seem logical that such a
benchmark would include sub-benchmarks on file access, interpreted intensive
tasks, C intensive tasks, BLAS intensive tasks, etc. You developers know
more about this than I do, but I know enough to realize that there won't be
one simple answer. Nevertheless, I'd like to make my usage decisions on
something rather than anecdotal claims.

So, does anyone know of a good benchmarking script or would be willing to
contribute one?
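
(For concreteness, a few throwaway timings in the spirit of the categories
above -- purely illustrative R one-liners, not the benchmark being asked for:)

  system.time(for (i in 1:1e5) x <- rnorm(10))                ## interpreter-heavy loop
  system.time(sort(runif(1e7)))                               ## C-level computation
  m <- matrix(rnorm(1000^2), 1000)
  system.time(m %*% m)                                        ## BLAS-bound
  tmp <- tempfile()
  system.time({write.csv(m, tmp); invisible(read.csv(tmp))})  ## file access
  unlink(tmp)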

And here are the settings we used to compile R with Intel 10.1 compilers:

../configure --prefix=/N/u/mkimpel/R_HOME/R-patched/R-build \
--with-system-zlib=/usr/lib64 --with-system-bzlib=/usr/lib64 \
--with-mpi=/N/soft/linux-rhel4-x86_64/openmpi/1.2.5/intel-64 --with-tcltk \
--with-tcl-config=/N/soft/linux-rhel4-x86_64/tcl8.4.16/lib64/tclConfig.sh \
--with-tk-config=/N/soft/linux-rhel4-x86_64/tk8.4.16/lib64/tkConfig.sh \
--without-x --without-readline --without-iconv \
CC=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icc \
CFLAGS="-O3 -no-prec-div -unroll" \
F77=/N/soft/linux-rhel4-x86_64/intel/fce/10.1.013/bin/ifort \
FFLAGS="-O3 -no-prec-div -unroll" \
CXX=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icpc \
CXXFLAGS="-O3 -no-prec-div -unroll" \
FC=/N/soft/linux-rhel4-x86_64/intel/fce/10.1.013/bin/ifort \
FCFLAGS="-O3 -no-prec-div -unroll" \
OBJC=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icc \
OBJCFLAGS="-O3 -no-prec-div -unroll" \
--disable-R-profiling --disable-memory-profiling
##
make all
make install

Mark

--
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN 46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)

**

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R CMD check in R 2.8.0 checks also .svn folder

2008-06-03 Thread Matthias Kohl

Dear developers,

we develop our packages via r-forge and svn. Under "R version 2.8.0 Under 
development (unstable) (2008-06-02 r45826)" I now observe the following 
warning for our package distrEx:


* checking for executable files ... WARNING
Found the following executable file(s):
 src/.svn/text-base/distrEx.dll.svn-base
Source packages should not contain undeclared executable files.
See section 'Package structure' in manual 'Writing R Extensions'.

Hence, R CMD check also checks the .svn folder. I don't mind getting this 
warning; I would just like to know whether this is intended and whether it 
might have an influence on package building. (This warning doesn't occur 
under "R version 2.7.0 Patched (2008-06-03 r45828)".)


My session info is:
R version 2.8.0 Under development (unstable) (2008-06-02 r45826)
i686-pc-linux-gnu

locale:
LC_CTYPE=de_DE.UTF-8;LC_NUMERIC=C;LC_TIME=de_DE.UTF-8;LC_COLLATE=de_DE.UTF-8;LC_MONETARY=C;LC_MESSAGES=de_DE.UTF-8;LC_PAPER=de_DE.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=de_DE.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

Best regards,
Matthias

--
Dr. Matthias Kohl
www.stamats.de

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Request: Documentation of formulae

2008-06-03 Thread S Ellison
Rather than transport quantities of the Introduction to R (a perfectly
sensible title for a very good starting point, IMHO), would it not be
simpler, and involve less maintenance, to include a link or
cross-reference in the 'formula' help page to the relevant part of the
Introduction? If nothing else, that might get folk to look at the
Introduction if they've bypassed it ...

Steve E

 Martin Maechler [EMAIL PROTECTED] 03/06/2008 07:48:46

 MP == Mike Prager [EMAIL PROTECTED]
MP Mike Prager [EMAIL PROTECTED] wrote:
 I was at a loss to understand the use of / until I looked in
 An Introduction [!] to R, where I found the explanation. 
 
 My request is that more complete material on model formulae be
 lifted from Introduction to R (or elsewhere) and put into the
 main online help files. 


***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread Duncan Murdoch

[EMAIL PROTECTED] wrote:

I came to report this same bug and found it already in the trash, but
I slightly disagree with that assessment. If it's not a bug, then
perhaps it's a feature request. Comments at the end.

On Mon, May 14, 2007, Duncan Murdoch wrote:
  

On 13/05/2007 8:46 PM, [EMAIL PROTECTED] wrote:

In the example below round() does not report to the specified number of
digits when the last digit to be reported is zero: Compare behaviour for
0.897575 and 0.946251. Ditto for signif(). The number of sigfigs is
ambiguous unless the reader knows this behaviour. Is this a bug or
intended behaviour? Is there a work-around?
  

It's not a bug.  It has nothing to do with round(), it is the way R
prints numbers by default.  If you ask to print 0.90, you'll get

[1] 0.9

because 0.9 and 0.90 are the same number.  If you want trailing zeros to
print, you need to specify a format to do that, e.g.



noquote(format(0.9, nsmall=2))
  

[1] 0.90

The noquote stops the quotes ("") from printing.  You could also use sprintf() or
formatC() for more C-like format specifications.



All of those options require you to specify the number of digits after
the decimal, don't they? Unless sprintf would default to the number of
decimal places passed to it, but it doesn't:
  


That specification doesn't make sense.  There is no number of decimal 
places passed to it.  What sprintf() sees below is identical to what it 
would see if you called


sprintf("%f", 0.9)

because signif(0.90, digits=2) == 0.9.  Those two objects are identical.
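
(A minimal check of that identity in a session -- hypothetical output,
assuming ordinary IEEE-754 doubles:)

  identical(signif(0.90, digits = 2), 0.9)   ## TRUE: the very same double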

 sprintf("%f", signif(0.90, digits=2))
[1] "0.900000";

it defaults to 6. Although %g behaves differently,

 sprintf("%g", signif(0.90, digits=2))
[1] "0.9",

this clearly still isn't the desired behavior.
  


Maybe not what you desired, but certainly reasonable behaviour.  

To continue that vein, the same issue with rounding versus printing
occurs with vectors:

   sapply(c(1:6),function(a){signif(c(18.423,0.90),digits=a)})
  [,1] [,2] [,3]  [,4]   [,5]   [,6]
 [1,] 20.0 18.0 18.4 18.42 18.423 18.423
 [2,]  0.9  0.9  0.9  0.90  0.900  0.900

Trying to get that and more complicated tables to print the correct
number of significant digits gets pretty hairy with sprintf(). I could
be wrong, but I would view the primary purpose of rounding to
significant digits as controlling the printed output of the number.
That there doesn't seem to be a way to do this without later having to
specify the number of decimal places would seem to render signif() as
it's written not particularly useful.

There are two solutions I can think of off the top of my head. The
first is to create a new data type of a fixed length real number but
the easier way would be to have a function that returns a string
something like this:

signif.string <- function(signum, sigdigs){
  left <- nchar(trunc(signum))
  right <- nchar(signum - trunc(signum)) - 2
  if (abs(signum) < 1 | signum < 0) {left <- left - 1}
  if (right < 0) {right <- 0}
  if (sigdigs < left) {return(as.character(signif(signum, digits = sigdigs)))}
  else if (sigdigs == left) {return(paste(round(signum), ".", sep = ""))}
  else if (sigdigs <= left + right) {return(format(signum, digits = sigdigs))}
  else {return(sprintf(paste("%.", sigdigs - left, "f", sep = ""), signum))}
}

  
This is just a skeleton that I think suits my needs for the moment and

might also cover the original poster's. One for production would need
to handle scientific notation and would probably want to fix the 5
round behavior on windows.
  


As far as I know, rounding is fine in Windows:

 round(1:10 + 0.5)
[1]  2  2  4  4  6  6  8  8 10 10

looks okay to me.  If you want biased rounding instead (getting 2:11 as 
output), simply use trunc(x + 0.5) instead of round(x).
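
(A quick side-by-side of the two behaviours just described -- illustrative
session only:)

  x <- 1:10 + 0.5
  round(x)         ## 2  2  4  4  6  6  8  8 10 10   (round-half-to-even)
  trunc(x + 0.5)   ## 2  3  4  5  6  7  8  9 10 11   (always rounds .5 up)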


Duncan Murdoch

Pat Carr

version.string R version 2.7.0 (2008-04-22)

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Request: Documentation of formulae

2008-06-03 Thread Duncan Murdoch

S Ellison wrote:

Rather than transport quantities of the Introduction to R (a perfectly
sensible title for a very good starting point, IMHO) would it not be
simpler and involve less maintenance to include a link or
cross-reference in the 'formula' help page to the relevant part of the
Introduction? If nothing else, that might get folk to look at the
Introduction if they've bypassed it 
  


Yes, that would be much simpler.  I am hoping that help.search() will 
soon be able to look through the indices and tables of contents of the 
manuals, so that will be another way to discover the documentation.


Changing the title of the manual would be an unreasonable solution: it 
affects authors of dozens of books, not just R Core, who all have to 
track down every reference to the old title and write some ugly 
disjunction instead.


Duncan Murdoch

Steve E

  

Martin Maechler [EMAIL PROTECTED] 03/06/2008 07:48:46



MP == Mike Prager [EMAIL PROTECTED]


MP Mike Prager [EMAIL PROTECTED] wrote:
 I was at a loss to understand the use of / until I looked in
 An Introduction [!] to R, where I found the explanation. 
 
 My request is that more complete material on model formulae be

 lifted from Introduction to R (or elsewhere) and put into the
 main online help files. 



***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] benchmarking R installations

2008-06-03 Thread Dirk Eddelbuettel
On Tue, Jun 03, 2008 at 11:09:28AM -0400, Mark Kimpel wrote:
 Dirk,
 
 At the moment, our emphasis is getting an installation that will run Rmpi in
 batch mode. I imagine my sysadmin put that line in to minimize potential
 problems. To be honest, I didn't catch it, I was just glad to get a compile
 :)
 
 As to the line that doesn't apply to the R install, I think you are
 referring to the with mpi, which I think he also slipped in not realizing
 it should more properly go with the Rmpi install.

Right. I wasn't implying it would do harm.

As for the x11 choice: I prefer to keep the change vectors
minimal. Otherwise you go nuts trying to debug things.

But it's good to see that you now understand that you have to start
'at the top' with Intel icc, and once you have a working R build you
can start working towards packages.  It's like a port to a different
platform, as you completely switch the toolchain.

Good luck and keep us posted. Looking at icc is on my TODO list
too (for my C++ code, though; given R's interactive nature I think
there are lower-hanging fruits elsewhere...)

Lastly, as to the benchmarking: It's difficult. Ripley once snarled
that it is probably the application you want to run the most so
there you go ...

Dirk

 
 Mark
 
 On Tue, Jun 3, 2008 at 12:18 AM, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:
 
  On Mon, Jun 02, 2008 at 11:56:16PM -0400, Mark Kimpel wrote:
   ../configure --prefix=/N/u/mkimpel/R_HOME/R-patched/R-build \
   --with-system-zlib=/usr/lib64 --with-system-bzlib=/usr/lib64 \
   --with-mpi=/N/soft/linux-rhel4-x86_64/openmpi/1.2.5/intel-64 --with-tcltk
  \
 
  There is no such option for R's configure.
 
   --with-tcl-config=/N/soft/linux-rhel4-x86_64/tcl8.4.16/lib64/tclConfig.sh
  \
   --with-tk-config=/N/soft/linux-rhel4-x86_64/tk8.4.16/lib64/tkConfig.sh \
   --without-x --without-readline --without-iconv \
 
  So you never want your compute cluster to be able to do an interactive
  plot under x11 ?
 
  Dirk
 
  --
  Three out of two people have difficulties with fractions.
 
 
 
 
 -- 
 Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
 Indiana University School of Medicine
 
 15032 Hunter Court, Westfield, IN  46074
 
 (317) 490-5129 Work,  Mobile  VoiceMail
 (317) 663-0513 Home (no voice mail please)
 
 **

-- 
Three out of two people have difficulties with fractions.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] savePlot() no longer automatically adds an extension to the filename.

2008-06-03 Thread S Ellison
Plaintive squeak: Why the change? 

Some OS's and desktops use the extension, so forgetting it causes
trouble. The new default filename keeps a filetype (as before) but the
user now has to type a filetype twice (once as the type, once as
extension) to get the same effect for their own filenames. And the
extension isn't then checked for consistency with valid file types, so
it can be mistyped and saved with no warning. Hard to see the advantage
of doing away with it...

Suggestion: Revert to the previous default (extension as type) and
include an 'extension' in the parameter list so that folk who don't want
it can change it and folk who did want it get it automatically.

 
The code would then look something like 

savePlot <- function (filename = "Rplot",
    type = c("wmf", "emf", "png", "jpg", "jpeg", "bmp", "tif",
             "tiff", "ps", "eps", "pdf"),
    device = dev.cur(),
    restoreConsole = TRUE,
    extension)   # Added extension
{
    type <- match.arg(type)
    if (missing(extension))
        extension <- type                ## added
    devlist <- dev.list()
    devcur <- match(device, devlist, NA)
    if (is.na(devcur))
        stop("no such device")
    devname <- names(devlist)[devcur]
    if (devname != "windows")
        stop("can only copy from 'windows' devices")
    if (filename == "clipboard" && type == "wmf")
        filename <- ""
    else
        fullname <- paste(filename, extension,
                          sep = ifelse(extension == "", "", "."))  ## added
    invisible(.External(CsavePlot, device, fullname, type,
                        restoreConsole))                            ## modified
}

Steve E

PS Yes, I took a while to upgrade from 2.6.x. Otherwise I'd have
squeaked the day I upgraded - like I just did - 'cos I use savePlot a
LOT.



***
This email and any attachments are confidential. Any use...{{dropped:8}}

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Request: Documentation of formulae

2008-06-03 Thread Mike Prager
Martin Maechler [EMAIL PROTECTED] wrote:

 Hi Mike, you make very worthy suggestions; but I assume the word
 request is really putting off almost all of us R corers.
 You *have* heard that R is a volunteer project, that much of its
 development has happened in unpaid time of core team mates.
 Why are you not using Suggestion instead?

Dear Martin,

Thank you for alerting me that my comments could so easily be
taken differently from my intent.  The comments were indeed put
forth as suggestions. 

Thanks also for the characterization as a competent R user.  I
may be that, or nearly so, but time and other resources don't
allow me to contribute changes now. Perhaps in future I will be
in a better position to do so.

With appreciation for R and R-Core,

Mike

-- 
Mike Prager, NOAA, Beaufort, NC
* Opinions expressed are personal and not represented otherwise.
* Any use of tradenames does not constitute a NOAA endorsement.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] benchmarking R installations

2008-06-03 Thread Mark Kimpel
Dirk,

At the moment, our emphasis is getting an installation that will run Rmpi in
batch mode. I imagine my sysadmin put that line in to minimize potential
problems. To be honest, I didn't catch it, I was just glad to get a compile
:)

As to the line that doesn't apply to the R install, I think you are
referring to the with mpi, which I think he also slipped in not realizing
it should more properly go with the Rmpi install.

Mark

On Tue, Jun 3, 2008 at 12:18 AM, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

 On Mon, Jun 02, 2008 at 11:56:16PM -0400, Mark Kimpel wrote:
  ../configure --prefix=/N/u/mkimpel/R_HOME/R-patched/R-build \
  --with-system-zlib=/usr/lib64 --with-system-bzlib=/usr/lib64 \
  --with-mpi=/N/soft/linux-rhel4-x86_64/openmpi/1.2.5/intel-64 --with-tcltk
 \

 There is no such option for R's configure.

  --with-tcl-config=/N/soft/linux-rhel4-x86_64/tcl8.4.16/lib64/tclConfig.sh
 \
  --with-tk-config=/N/soft/linux-rhel4-x86_64/tk8.4.16/lib64/tkConfig.sh \
  --without-x --without-readline --without-iconv \

 So you never want your compute cluster to be able to do an interactive
 plot under x11 ?

 Dirk

 --
 Three out of two people have difficulties with fractions.




-- 
Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN  46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)

**

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] benchmarking R installations

2008-06-03 Thread Simon Urbanek


On Jun 3, 2008, at 3:58 AM, Ludo Pagie wrote:

recently there was a post on R-help/Rd ?? with this link on  
benchmarking different 'number crunching packages'. They used a  
series of tests, although I didn't check they used all the types you  
mentioned. I couldn't find the test code at first glance but maybe it  
is available on request?


http://www.sciviews.org/benchmark/



It's quite outdated and doesn't work with the current R versions, but  
I have an updated version that works. I have put some benchmarks I'm  
aware of at

http://r.research.att.com/benchmarks/

Cheers,
Simon





On Mon, 2 Jun 2008, Mark Kimpel wrote:

Recently I posted to this list with a question about using the  
Intel 10.1
compilers in building R and one response was basically, why in the  
heck
would you want to do that? The answer is that my sysadmin believes  
that
there will be a performance boost with the Intel vs. Gnu compilers  
on our
Linux cluster, of which I am one of many users. Wanting to be a  
good citizen
and use my machine time wisely, I'd of course like to use right  
tool to
build the most efficient installation of R and associated packages.  
BTW, we

got R to compile nicely using the settings at the end of this post.

Looking back on previous posts, however, it seems that there is no  
consensus

as to how to benchmark R. I realize such a task is not trivial, nor
controversial, but does anyone have a set of time-consuming tasks  
that can

be used to compare R installations? It would seem logical that such a
benchmark would include sub-benchmarks on file access, interpreted  
intensive
tasks, C intensive tasks, BLAS intensive tasks, etc. You developers  
know
more about this than I do, but I know enough to realize that there  
won't be
one simple answer. Nevertheless, I'd like to make my usage  
decisions on

something rather than anecdotal claims.

So, does anyone know of a good benchmarking script or would be  
willing to

contribute one?

And here are the settings we used to compile R with Intel 10.1  
compilers:


../configure --prefix=/N/u/mkimpel/R_HOME/R-patched/R-build \
--with-system-zlib=/usr/lib64 --with-system-bzlib=/usr/lib64 \
--with-mpi=/N/soft/linux-rhel4-x86_64/openmpi/1.2.5/intel-64 --with-tcltk \
--with-tcl-config=/N/soft/linux-rhel4-x86_64/tcl8.4.16/lib64/tclConfig.sh \
--with-tk-config=/N/soft/linux-rhel4-x86_64/tk8.4.16/lib64/tkConfig.sh \
--without-x --without-readline --without-iconv \
CC=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icc \
CFLAGS="-O3 -no-prec-div -unroll" \
F77=/N/soft/linux-rhel4-x86_64/intel/fce/10.1.013/bin/ifort \
FFLAGS="-O3 -no-prec-div -unroll" \
CXX=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icpc \
CXXFLAGS="-O3 -no-prec-div -unroll" \
FC=/N/soft/linux-rhel4-x86_64/intel/fce/10.1.013/bin/ifort \
FCFLAGS="-O3 -no-prec-div -unroll" \
OBJC=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icc \
OBJCFLAGS="-O3 -no-prec-div -unroll" \
--disable-R-profiling --disable-memory-profiling
##
make all
make install

Mark

--
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN 46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)

**

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] savePlot() no longer automatically adds an extension to the filename.

2008-06-03 Thread Gabor Grothendieck
Not sure if this is sufficient but note that if you leave the filename
off entirely then the extension does default to the type.

savePlot() # wmf
savePlot(type = "jpg")

 args(savePlot)
function (filename = paste("Rplot", type, sep = "."), type = c("wmf",
    "emf", "png", "jpg", "jpeg", "bmp", "tif", "tiff", "ps",
    "eps", "pdf"), device = dev.cur(), restoreConsole = TRUE)

https://svn.r-project.org/R/trunk/src/library/grDevices/R/windows/windows.R
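
(A small sketch of the behaviour described above, based on that default
argument -- filenames are illustrative and the calls need an open windows()
device, so they are shown as comments only:)

  # savePlot()                         ## writes "Rplot.wmf" - extension from type
  # savePlot(type = "jpg")             ## writes "Rplot.jpg"
  # savePlot("myplot", type = "jpg")   ## writes a file literally named "myplot"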

On Tue, Jun 3, 2008 at 11:22 AM, S Ellison [EMAIL PROTECTED] wrote:
 Plaintive squeak: Why the change?

 Some OS's and desktops use the extension, so forgetting it causes
 trouble. The new default filename keeps a filetype (as before) but the
 user now has to type a filetype twice (once as the type, once as
 extension) to get the same effect for their own filenames. And the
 extension isn't then checked for consistency with valid file types, so
 it can be mistyped and saved with no warning. Hard to see the advantage
 of doing away with it...

 Suggestion: Revert to the previous default (extension as type) and
 include an 'extension' in the parameter list so that folk who don't want
 it can change it and folk who did want it get it automatically.


 The code would then look something like

 savePlot <- function (filename = "Rplot",
     type = c("wmf", "emf", "png", "jpg", "jpeg", "bmp", "tif",
              "tiff", "ps", "eps", "pdf"),
     device = dev.cur(),
     restoreConsole = TRUE,
     extension)   #Added extension
 {
     type <- match.arg(type)
     if (missing(extension))
         extension <- type                ## added
     devlist <- dev.list()
     devcur <- match(device, devlist, NA)
     if (is.na(devcur))
         stop("no such device")
     devname <- names(devlist)[devcur]
     if (devname != "windows")
         stop("can only copy from 'windows' devices")
     if (filename == "clipboard" && type == "wmf")
         filename <- ""
     else
         fullname <- paste(filename, extension,
                           sep = ifelse(extension == "", "", "."))  ## added
     invisible(.External(CsavePlot, device, fullname, type,
                         restoreConsole))                            ## modified
 }

 Steve E

 PS Yes, I took a while to upgrade from 2.6.x. Otherwise I'd have
 squeaked the day I upgraded - like I just did - 'cos I use savePlot a
 LOT.



 ***
 This email and any attachments are confidential. Any use...{{dropped:8}}

 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread Duncan Murdoch

On 6/3/2008 11:43 AM, Patrick Carr wrote:

On 6/3/08, Duncan Murdoch [EMAIL PROTECTED] wrote:


 because signif(0.90, digits=2) == 0.9.  Those two objects are identical.


My text above that is poorly worded. They're identical internally,
yes. But in terms of the number of significant digits, 0.9 and 0.90
are different. And that matters when the number is printed, say as an
annotation on a graph. Passing it through sprintf() or format() later
requires you to specify the number of digits after the decimal, which
is different than the number of significant digits, and requires case
testing for numbers of different orders of magnitude.

The original complainant (and I) expected this behavior from signif(),
not merely rounding. As I said before, I wrote my own workaround so
this is somewhat academic, but I don't think we're alone.


 As far as I know, rounding is fine in Windows:

  round(1:10 + 0.5)
 [1]  2  2  4  4  6  6  8  8 10 10



It might not be the rounding, then. (windows xp sp3)

signif(12345,digits=4)
   [1] 12340
signif(0.12345,digits=4)
   [1] 0.1235


It's easy to make mistakes in this, but a little outside-of-R 
experimentation suggests those are the right answers.  The number 12345 
is exactly representable, so it is exactly half-way between 12340 and 
12350, so 12340 is the right answer by the unbiased round-to-even rule. 
 The number 0.12345 is not exactly representable, but (I think) it is 
represented by something slightly closer to 0.1235 than to 0.1234.  So 
it looks as though Windows gets it right.
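
(One way to peek at what is actually stored, for anyone following along --
a minimal sketch assuming IEEE-754 doubles; exact printed digits may vary
by platform:)

  sprintf("%.20f", 0.12345)        ## prints a value slightly above 0.12345
  12345 - 12340 == 12350 - 12345   ## TRUE: 12345 sits exactly half-way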




OS X (10.5.2/intel) does not have that problem.


Which would seem to imply OS X gets it wrong.  Both are supposed to be 
using the 64 bit floating point standard, so they should both give the 
same answer:  but the actual arithmetic is being done by run-time 
libraries that are outside our control to a large extent, and it looks 
as though the one on the Mac is less accurate than the one on Windows.



 But (on both windows and OS X):


signif(12345.12345,digits=10)
   [1] 12345.12


This is a different problem.  The number is correctly computed as 
12345.12345 (or at least a representable number quite close to that), 
and then the default display rounds it some more.  Set 
options(digits=19) to see it in its full glory.
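
(Or, as a minimal sketch, raise the printed precision for just that value:)

  x <- signif(12345.12345, digits = 10)
  print(x, digits = 12)    ## 12345.12345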


Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] savePlot() no longer automatically adds an extension to the filename.

2008-06-03 Thread Duncan Murdoch

On 6/3/2008 11:22 AM, S Ellison wrote:
Plaintive squeak: Why the change? 


Some OS's and desktops use the extension, so forgetting it causes
trouble. The new default filename keeps a filetype (as before) but the
user now has to type a filetype twice (once as the type, once as
extension) to get the same effect for their own filenames. And the
extension isn't then checked for consistency with valid file types, so
it can be mistyped and saved with no warning. Hard to see the advantage
of doing away with it...

Suggestion: Revert to the previous default (extension as type) and
include an 'extension' in the parameter list so that folk who don't want
it can change it and folk who did want it get it automatically.

 
The code would then look something like 

savePlot <- function (filename = "Rplot",
    type = c("wmf", "emf", "png", "jpg", "jpeg", "bmp", "tif",
             "tiff", "ps", "eps", "pdf"),
    device = dev.cur(),
    restoreConsole = TRUE,
    extension)   # Added extension
{
    type <- match.arg(type)
    if (missing(extension))
        extension <- type                ## added
    devlist <- dev.list()
    devcur <- match(device, devlist, NA)
    if (is.na(devcur))
        stop("no such device")
    devname <- names(devlist)[devcur]
    if (devname != "windows")
        stop("can only copy from 'windows' devices")
    if (filename == "clipboard" && type == "wmf")
        filename <- ""
    else
        fullname <- paste(filename, extension,
                          sep = ifelse(extension == "", "", "."))  ## added
    invisible(.External(CsavePlot, device, fullname, type,
                        restoreConsole))                            ## modified
}

Steve E

PS Yes, I took a while to upgrade from 2.6.x. Otherwise I'd have
squeaked the day I upgraded - like I just did - 'cos I use savePlot a
LOT.


Another way to avoid surprises like this is to pay attention to the 
daily announcements of changes.  This one was announced back in February:


http://developer.r-project.org/blosxom.cgi/R-devel/CHANGES/2008/02/26#c2008-02-26

Personally, I would have chosen to add an extension if the name 
contained no dot, but I didn't make this change, and don't really see a 
problem with it.  However, if you want to submit a patch that does this 
(including changes to src/library/grDevices/R/windows/windows.R, 
src/library/grDevices/man/windows/savePlot.Rd, and to 
src/gnuwin32/CHANGES) before Friday, I'll commit it.


Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread Simon Urbanek


On Jun 3, 2008, at 2:48 PM, Duncan Murdoch wrote:


On 6/3/2008 11:43 AM, Patrick Carr wrote:

On 6/3/08, Duncan Murdoch [EMAIL PROTECTED] wrote:


because signif(0.90, digits=2) == 0.9.  Those two objects are  
identical.

My text above that is poorly worded. They're identical internally,
yes. But in terms of the number of significant digits, 0.9 and 0.90
are different. And that matters when the number is printed, say as an
annotation on a graph. Passing it through sprintf() or format() later
requires you to specify the number of digits after the decimal, which
is different than the number of significant digits, and requires case
testing for numbers of different orders of magnitude.
The original complainant (and I) expected this behavior from  
signif(),

not merely rounding. As I said before, I wrote my own workaround so
this is somewhat academic, but I don't think we're alone.

As far as I know, rounding is fine in Windows:

 round(1:10 + 0.5)
[1]  2  2  4  4  6  6  8  8 10 10


It might not be the rounding, then. (windows xp sp3)
   signif(12345,digits=4)
  [1] 12340
   signif(0.12345,digits=4)
  [1] 0.1235


It's easy to make mistakes in this, but a little outside-of-R  
experimentation suggests those are the right answers.  The number  
12345 is exactly representable, so it is exactly half-way between  
12340 and 12350, so 12340 is the right answer by the unbiased round- 
to-even rule.  The number 0.12345 is not exactly representable, but  
(I think) it is represented by something slightly closer to 0.1235  
than to 0.1234.  So it looks as though Windows gets it right.




OS X (10.5.2/intel) does not have that problem.


Which would seem to imply OS X gets it wrong.


This has nothing to do with OS X, you get that same answer on pretty  
much all other platforms (Intel/Linux, MIPS/IRIX, Sparc/Sun, ...).  
Windows is the only one delivering the incorrect result here.



 Both are supposed to be using the 64 bit floating point standard,  
so they should both give the same answer:


Should, yes, but Windows doesn't. In fact 10000.0 is exactly  
representable and so is 1234.5, which is the correct result that all  
except Windows get. I don't have a Windows box handy, so I can't tell  
why - but if you go through fprec this is what you get on the  
platforms I tested (log10 may vary slightly but that's irrelevant here):


x = 0.123450000000000004174439
l10 = -0.908508905732048899217546, e10 = 4
pow10 = 10000.
x*pow10 = 1234.5000

Cheers,
Simon


 but the actual arithmetic is being done by run-time libraries that  
are outside our control to a large extent, and it looks as though  
the one on the Mac is less accurate than the one on Windows.



But (on both windows and OS X):

   signif(12345.12345,digits=10)
  [1] 12345.12


This is a different problem.  The number is correctly computed as  
12345.12345 (or at least a representable number quite close to  
that), and then the default display rounds it some more.  Set  
options(digits=19) to see it in its full glory.


Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread simon . urbanek

On Jun 3, 2008, at 2:48 PM, Duncan Murdoch wrote:

 On 6/3/2008 11:43 AM, Patrick Carr wrote:
 On 6/3/08, Duncan Murdoch [EMAIL PROTECTED] wrote:

 because signif(0.90, digits=2) == 0.9.  Those two objects are  
 identical.
 My text above that is poorly worded. They're identical internally,
 yes. But in terms of the number of significant digits, 0.9 and 0.90
 are different. And that matters when the number is printed, say as an
 annotation on a graph. Passing it through sprintf() or format() later
 requires you to specify the number of digits after the decimal, which
 is different than the number of significant digits, and requires case
 testing for numbers of different orders of magnitude.
 The original complainant (and I) expected this behavior from  
 signif(),
 not merely rounding. As I said before, I wrote my own workaround so
 this is somewhat academic, but I don't think we're alone.
 As far as I know, rounding is fine in Windows:

  round(1:10 + 0.5)
 [1]  2  2  4  4  6  6  8  8 10 10

 It might not be the rounding, then. (windows xp sp3)
signif(12345,digits=4)
   [1] 12340
signif(0.12345,digits=4)
   [1] 0.1235

 It's easy to make mistakes in this, but a little outside-of-R  
 experimentation suggests those are the right answers.  The number  
 12345 is exactly representable, so it is exactly half-way between  
 12340 and 12350, so 12340 is the right answer by the unbiased round- 
 to-even rule.  The number 0.12345 is not exactly representable, but  
 (I think) it is represented by something slightly closer to 0.1235  
 than to 0.1234.  So it looks as though Windows gets it right.


 OS X (10.5.2/intel) does not have that problem.

 Which would seem to imply OS X gets it wrong.

This has nothing to do with OS X, you get that same answer on pretty  
much all other platforms (Intel/Linux, MIPS/IRIX, Sparc/Sun, ...).  
Windows is the only one delivering the incorrect result here.


  Both are supposed to be using the 64 bit floating point standard,  
 so they should both give the same answer:

Should, yes, but Windows doesn't. In fact 10000.0 is exactly  
representable and so is 1234.5, which is the correct result that all  
except Windows get. I don't have a Windows box handy, so I can't tell  
why - but if you go through fprec this is what you get on the  
platforms I tested (log10 may vary slightly but that's irrelevant here):

x = 0.123450000000000004174439
l10 = -0.908508905732048899217546, e10 = 4
pow10 = 10000.
x*pow10 = 1234.5000

Cheers,
Simon


  but the actual arithmetic is being done by run-time libraries that  
 are outside our control to a large extent, and it looks as though  
 the one on the Mac is less accurate than the one on Windows.


 But (on both windows and OS X):
signif(12345.12345,digits=10)
   [1] 12345.12

 This is a different problem.  The number is correctly computed as  
 12345.12345 (or at least a representable number quite close to  
 that), and then the default display rounds it some more.  Set  
 options(digits=19) to see it in its full glory.

 Duncan Murdoch

 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel



__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread murdoch
On 6/3/2008 4:36 PM, Simon Urbanek wrote:
 On Jun 3, 2008, at 2:48 PM, Duncan Murdoch wrote:
 
 On 6/3/2008 11:43 AM, Patrick Carr wrote:
 On 6/3/08, Duncan Murdoch [EMAIL PROTECTED] wrote:

 because signif(0.90, digits=2) == 0.9.  Those two objects are  
 identical.
 My text above that is poorly worded. They're identical internally,
 yes. But in terms of the number of significant digits, 0.9 and 0.90
 are different. And that matters when the number is printed, say as an
 annotation on a graph. Passing it through sprintf() or format() later
 requires you to specify the number of digits after the decimal, which
 is different than the number of significant digits, and requires case
 testing for numbers of different orders of magnitude.
 The original complainant (and I) expected this behavior from  
 signif(),
 not merely rounding. As I said before, I wrote my own workaround so
 this is somewhat academic, but I don't think we're alone.
 As far as I know, rounding is fine in Windows:

  round(1:10 + 0.5)
 [1]  2  2  4  4  6  6  8  8 10 10

 It might not be the rounding, then. (windows xp sp3)
signif(12345,digits=4)
   [1] 12340
signif(0.12345,digits=4)
   [1] 0.1235

 It's easy to make mistakes in this, but a little outside-of-R  
 experimentation suggests those are the right answers.  The number  
 12345 is exactly representable, so it is exactly half-way between  
 12340 and 12350, so 12340 is the right answer by the unbiased round- 
 to-even rule.  The number 0.12345 is not exactly representable, but  
 (I think) it is represented by something slightly closer to 0.1235  
 than to 0.1234.  So it looks as though Windows gets it right.


 OS X (10.5.2/intel) does not have that problem.

 Which would seem to imply OS X gets it wrong.
 
 This has nothing to do with OS X, you get that same answer on pretty  
 much all other platforms (Intel/Linux, MIPS/IRIX, Sparc/Sun, ...).  
 Windows is the only one delivering the incorrect result here.
 
 
  Both are supposed to be using the 64 bit floating point standard,  
 so they should both give the same answer:
 
 Should, yes, but Windows doesn't. In fact 10000.0 is exactly  
 representable and so is 1234.5, which is the correct result that all  
 except Windows get. 

I think you skipped a step.  The correct answer is either 0.1234 or 
0.1235, not something 10000 times bigger.  The first important question 
is whether 0.12345 is exactly representable, and the answer is no.  The 
second question is whether it is represented by a number bigger or 
smaller than the real number 0.12345.  If it is bigger, the answer 
should be 0.1235, and if it is smaller, the answer is 0.1234.  My 
experiments suggest it is bigger.  Yours don't look relevant.  It 
certainly isn't exactly equal to 1234.5/10000, because that number is 
not representable.  It's equal to x/2^y, for some x and y, and it's a 
pain to figure out exactly what they are.

However, I am pretty sure R is representing it (at least on Windows) as 
the binary expansion

0.0001100110100110101101011011001001

while the true binary expansion (using exact rational arithmetic) starts 
out

0.000110011010011010110101101100100011101...

If you line those up, you'll see that the first number is bigger than 
the second.  (Ugly code to derive these is down below.)

Clearly the top representation is the correct one to that number of 
binary digits, so I think Windows got it right, and all those other 
systems didn't.  This is probably because R on Windows is using extended 
precision (64 bit mantissas) for intermediate results, and those other 
systems stick with 53 bit mantissas.

Duncan Murdoch

# Convert number to binary expansion; add the decimal point manually

x <- 0.12345
while (x != 0) {
   cat(trunc(x))
   x <- x - trunc(x)
   x <- x * 2
}

# Do the same thing in exact rational arithmetic

num <- 12345
denom <- 100000
for (i in 1:60) {
   cat(ifelse(num >= denom, 1, 0))
   num <- num %% denom
   num <- 2*num
}

# Manually cut and paste the results to get these:

0.0001100110100110101101011011001001
0.000110011010011010110101101100100011101

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread Patrick Carr
On 6/3/08, Duncan Murdoch [EMAIL PROTECTED] wrote:

  because signif(0.90, digits=2) == 0.9.  Those two objects are identical.

My text above that is poorly worded. They're identical internally,
yes. But in terms of the number of significant digits, 0.9 and 0.90
are different. And that matters when the number is printed, say as an
annotation on a graph. Passing it through sprintf() or format() later
requires you to specify the number of digits after the decimal, which
is different than the number of significant digits, and requires case
testing for numbers of different orders of magnitude.

The original complainant (and I) expected this behavior from signif(),
not merely rounding. As I said before, I wrote my own workaround so
this is somewhat academic, but I don't think we're alone.

  As far as I know, rounding is fine in Windows:

   round(1:10 + 0.5)
  [1]  2  2  4  4  6  6  8  8 10 10


It might not be the rounding, then. (windows xp sp3)

signif(12345,digits=4)
   [1] 12340
signif(0.12345,digits=4)
   [1] 0.1235

OS X (10.5.2/intel) does not have that problem. But (on both windows and OS X):

signif(12345.12345,digits=10)
   [1] 12345.12

Pat Carr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] bug in sockconn.c: invalid memory allocation (PR#11565)

2008-06-03 Thread leydold
Dear R developers,

The following patch should fix a memory allocation bug:

Index: src/modules/internet/sockconn.c
===
--- src/modules/internet/sockconn.c (revision 45828)
+++ src/modules/internet/sockconn.c (working copy)
@@ -174,7 +174,7 @@
 
     new = (Rconnection) malloc(sizeof(struct Rconn));
     if(!new) error(_("allocation of socket connection failed"));
-    new->class = (char *) malloc(strlen("socket") + 1);
+    new->class = (char *) malloc(strlen("sockconn") + 1);
     if(!new->class) {
 	free(new);
 	error(_("allocation of socket connection failed"));


In revision 45780 a strcpy command was changed without
changing the size of the allocated character array.
The following diff also contains the above change to show the problem:

Index: src/modules/internet/sockconn.c
===
--- src/modules/internet/sockconn.c (revision 45779)
+++ src/modules/internet/sockconn.c (working copy)
@@ -174,12 +174,12 @@
 
     new = (Rconnection) malloc(sizeof(struct Rconn));
     if(!new) error(_("allocation of socket connection failed"));
-    new->class = (char *) malloc(strlen("socket") + 1);
+    new->class = (char *) malloc(strlen("sockconn") + 1);
     if(!new->class) {
 	free(new);
 	error(_("allocation of socket connection failed"));
     }
-    strcpy(new->class, "socket");
+    strcpy(new->class, "sockconn");
     new->description = (char *) malloc(strlen(host) + 10);
     if(!new->description) {
 	free(new->class); free(new);

$ svn info
Path: .
URL: https://svn.R-project.org/R/trunk
Repository Root: https://svn.R-project.org/R
Repository UUID: 00db46b3-68df-0310-9c12-caf00c1e9a41
Revision: 45828
Node Kind: directory
Schedule: normal
Last Changed Author: hornik
Last Changed Rev: 45827
Last Changed Date: 2008-06-03 08:44:26 +0200 (Tue, 03 Jun 2008)



Josef

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread Patrick Carr
On 6/3/08, Duncan Murdoch [EMAIL PROTECTED] wrote:

  It's easy to make mistakes in this, but a little outside-of-R
 experimentation suggests those are the right answers.  The number 12345 is
 exactly representable, so it is exactly half-way between 12340 and 12350, so
 12340 is the right answer by the unbiased round-to-even rule.  The number
 0.12345 is not exactly representable, but (I think) it is represented by
 something slightly closer to 0.1235 than to 0.1234.  So it looks as though
 Windows gets it right.

Well, right within the limitations of binary floating-point
arithmetic. Not right right.

In the grander scheme, this is a nicety which is largely
inconsequential--if I need a real measure of precision (precise
precision?) I'll use a +/- notation of a propagated error and/or edit
the typography of the numbers by hand immediately before the final
output. But again, final printed output of the number is basically the
main use I see for a function that returns significant digits. And
for that purpose I think it should be right right, and actually output
the number of significant digits requested.

 signif(12345.12345,digits=10)
[1] 12345.12

  This is a different problem.  The number is correctly computed as
 12345.12345 (or at least a representable number quite close to that), and
 then the default display rounds it some more.  Set options(digits=19) to see
 it in its full glory.

Aha, my mistake; I missed that setting.

Pat Carr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread Simon Urbanek


On Jun 3, 2008, at 5:12 PM, Duncan Murdoch wrote:


On 6/3/2008 4:36 PM, Simon Urbanek wrote:

On Jun 3, 2008, at 2:48 PM, Duncan Murdoch wrote:

On 6/3/2008 11:43 AM, Patrick Carr wrote:

On 6/3/08, Duncan Murdoch [EMAIL PROTECTED] wrote:


because signif(0.90, digits=2) == 0.9.  Those two objects are   
identical.

My text above that is poorly worded. They're identical internally,
yes. But in terms of the number of significant digits, 0.9 and 0.90
are different. And that matters when the number is printed, say  
as an
annotation on a graph. Passing it through sprintf() or format()  
later
requires you to specify the number of digits after the decimal,  
which
is different than the number of significant digits, and requires  
case

testing for numbers of different orders of magnitude.
The original complainant (and I) expected this behavior from   
signif(),

not merely rounding. As I said before, I wrote my own workaround so
this is somewhat academic, but I don't think we're alone.

As far as I know, rounding is fine in Windows:

 round(1:10 + 0.5)
[1]  2  2  4  4  6  6  8  8 10 10


It might not be the rounding, then. (windows xp sp3)
  signif(12345,digits=4)
 [1] 12340
  signif(0.12345,digits=4)
 [1] 0.1235


It's easy to make mistakes in this, but a little outside-of-R   
experimentation suggests those are the right answers.  The number   
12345 is exactly representable, so it is exactly half-way between   
12340 and 12350, so 12340 is the right answer by the unbiased  
round- to-even rule.  The number 0.12345 is not exactly  
representable, but  (I think) it is represented by something  
slightly closer to 0.1235  than to 0.1234.  So it looks as though  
Windows gets it right.




OS X (10.5.2/intel) does not have that problem.


Which would seem to imply OS X gets it wrong.
This has nothing to do with OS X, you get that same answer on  
pretty  much all other platforms (Intel/Linux, MIPS/IRIX, Sparc/ 
Sun, ...).  Windows is the only one delivering the incorrect result  
here.
Both are supposed to be using the 64 bit floating point standard,   
so they should both give the same answer:
Should, yes, but Windows doesn't. In fact 10000.0 is exactly
representable and so is 1234.5, which is the correct result that
all except Windows get.


I think you skipped a step.


I didn't - I was just pointing out that what you are trying to show is  
irrelevant. We are dealing with FP arithmetics here, so although your  
reasoning is valid algebraically, it's not in FP world. You missed the  
fact that FP operations are used to actually get the result (*1.0,  
round and divide again) and thus those operation will influence it as  
well.



 The correct answer is either 0.1234 or 0.1235, not something 10000  
times bigger.  The first important question is whether 0.12345 is  
exactly representable, and the answer is no.  The second question is  
whether it is represented by a number bigger or smaller than the  
real number 0.12345.  If it is bigger, the answer should be 0.1235,  
and if it is smaller, the answer is 0.1234.


No. That was what I was trying to point out. You can see clearly from  
my post that 0.12345 is not exactly representable and that the  
representation is slightly bigger. This is, however, irrelevant,  
because the next step is to multiply that number by 10000 (see fprec  
source) and this is where your reasoning breaks down - the result is  
exact representation of 1234.5, because the imprecision gets lost in  
the operation on all platforms but Windows. The result is that Windows  
is inconsistent with others, whether that is a bug or feature I don't  
care. All I really wanted to say is that this has nothing to do with  
OS X - if anything then it's a Windows issue.
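
(A rough check of that claim in plain double arithmetic -- a sketch only,
not the actual fprec() C code; behaviour can differ under x87 extended
precision, which is the Windows case discussed here:)

  sprintf("%.20f", 0.12345)     ## stored value is slightly above 0.12345
  0.12345 * 10000 == 1234.5     ## TRUE with plain 53-bit doubles: the tiny
                                ## excess is absorbed when the product is rounded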




 My experiments suggest it is bigger.


I was not claiming otherwise.



 Yours don't look relevant.


Vice versa as it turns out.

Cheers,
Simon


It certainly isn't exactly equal to 1234.5/10000, because that  
number is not representable.  It's equal to x/2^y, for some x and y,  
and it's a pain to figure out exactly what they are.


However, I am pretty sure R is representing it (at least on Windows)  
as the binary expansion


0.0001100110100110101101011011001001

while the true binary expansion (using exact rational arithmetic)  
starts out


0.000110011010011010110101101100100011101...

If you line those up, you'll see that the first number is bigger  
than the second.  (Ugly code to derive these is down below.)


Clearly the top representation is the correct one to that number of  
binary digits, so I think Windows got it right, and all those other  
systems didn't.  This is probably because R on Windows is using  
extended precision (64 bit mantissas) for intermediate results, and  
those other systems stick with 53 bit mantissas.




However, this means that Windows doesn't conform


Duncan Murdoch

# Convert number to binary expansion; add the decimal point 

Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread simon . urbanek

On Jun 3, 2008, at 5:12 PM, Duncan Murdoch wrote:

 On 6/3/2008 4:36 PM, Simon Urbanek wrote:
 On Jun 3, 2008, at 2:48 PM, Duncan Murdoch wrote:
 On 6/3/2008 11:43 AM, Patrick Carr wrote:
 On 6/3/08, Duncan Murdoch [EMAIL PROTECTED] wrote:

 because signif(0.90, digits=2) == 0.9.  Those two objects are   
 identical.
 My text above that is poorly worded. They're identical internally,
 yes. But in terms of the number of significant digits, 0.9 and 0.90
 are different. And that matters when the number is printed, say  
 as an
 annotation on a graph. Passing it through sprintf() or format()  
 later
 requires you to specify the number of digits after the decimal,  
 which
 is different than the number of significant digits, and requires  
 case
 testing for numbers of different orders of magnitude.
 The original complainant (and I) expected this behavior from   
 signif(),
 not merely rounding. As I said before, I wrote my own workaround so
 this is somewhat academic, but I don't think we're alone.
 As far as I know, rounding is fine in Windows:

  round(1:10 + 0.5)
 [1]  2  2  4  4  6  6  8  8 10 10

 It might not be the rounding, then. (windows xp sp3)
   signif(12345,digits=4)
  [1] 12340
   signif(0.12345,digits=4)
  [1] 0.1235

 It's easy to make mistakes in this, but a little outside-of-R   
 experimentation suggests those are the right answers.  The number   
 12345 is exactly representable, so it is exactly half-way between   
 12340 and 12350, so 12340 is the right answer by the unbiased  
 round- to-even rule.  The number 0.12345 is not exactly  
 representable, but  (I think) it is represented by something  
 slightly closer to 0.1235  than to 0.1234.  So it looks as though  
 Windows gets it right.


 OS X (10.5.2/intel) does not have that problem.

 Which would seem to imply OS X gets it wrong.
 This has nothing to do with OS X, you get that same answer on  
 pretty  much all other platforms (Intel/Linux, MIPS/IRIX, Sparc/ 
 Sun, ...).  Windows is the only one delivering the incorrect result  
 here.
 Both are supposed to be using the 64 bit floating point standard,   
 so they should both give the same answer:
 Should, yes, but Windows doesn't. In fact 10000.0 is exactly
 representable and so is 1234.5, which is the correct result that
 all except Windows get.

 I think you skipped a step.

I didn't - I was just pointing out that what you are trying to show is
irrelevant. We are dealing with FP arithmetic here, so although your
reasoning is valid algebraically, it's not valid in the FP world. You missed
the fact that FP operations are used to actually get the result (multiply
by 10000, round, and divide again) and thus those operations will influence
it as well.


  The correct answer is either 0.1234 or 0.1235, not something 10000
 times bigger.  The first important question is whether 0.12345 is
 exactly representable, and the answer is no.  The second question is  
 whether it is represented by a number bigger or smaller than the  
 real number 0.12345.  If it is bigger, the answer should be 0.1235,  
 and if it is smaller, the answer is 0.1234.

No. That was what I was trying to point out. You can see clearly from  
my post that 0.12345 is not exactly representable and that the  
representation is slightly bigger. This is, however, irrelevant,  
because the next step is to multiply that number by 10000 (see the fprec
source), and this is where your reasoning breaks down - the result is
the exact representation of 1234.5, because the imprecision gets lost in
the operation on all platforms but Windows. The result is that Windows  
is inconsistent with others, whether that is a bug or feature I don't  
care. All I really wanted to say is that this has nothing to do with  
OS X - if anything then it's a Windows issue.
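
As a rough R-level sketch of the multiply/round/divide sequence under
discussion (the real code is C, in R's fprec; the helper name and the
log10-based exponent below are illustrative only):

signif_sketch <- function(x, digits) {
  e <- digits - 1 - floor(log10(abs(x)))   # how far to shift the decimal point
  pow <- 10^e                              # 10000 for x = 0.12345, digits = 4
  round(x * pow) / pow                     # every step here is FP arithmetic
}
signif_sketch(0.12345, 4)

Whether x * pow comes out as exactly 1234.5 or as something a hair above it
is precisely the 53-bit versus extended-precision question argued about here.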


  My experiments suggest it is bigger.

I was not claiming otherwise.


  Yours don't look relevant.

Vice versa as it turns out.

Cheers,
Simon


 It certainly isn't exactly equal to 1234.5/10000, because that
 number is not representable.  It's equal to x/2^y, for some x and y,  
 and it's a pain to figure out exactly what they are.

 However, I am pretty sure R is representing it (at least on Windows)  
 as the binary expansion

 0.0001100110100110101101011011001001

 while the true binary expansion (using exact rational arithmetic)  
 starts out

 0.000110011010011010110101101100100011101...

 If you line those up, you'll see that the first number is bigger  
 than the second.  (Ugly code to derive these is down below.)

 Clearly the top representation is the correct one to that number of  
 binary digits, so I think Windows got it right, and all those other  
 systems didn't.  This is probably because R on Windows is using  
 extended precision (64 bit mantissas) for intermediate results, and  
 those other systems stick with 53 bit mantissas.


However, this means that Windows doesn't conform

 Duncan Murdoch

 # Convert number to 

Re: [Rd] significant digits (PR#9682)

2008-06-03 Thread Duncan Murdoch

On 6/3/2008 4:36 PM, Simon Urbanek wrote:

On Jun 3, 2008, at 2:48 PM, Duncan Murdoch wrote:


On 6/3/2008 11:43 AM, Patrick Carr wrote:

On 6/3/08, Duncan Murdoch [EMAIL PROTECTED] wrote:


because signif(0.90, digits=2) == 0.9.  Those two objects are  
identical.

My text above that is poorly worded. They're identical internally,
yes. But in terms of the number of significant digits, 0.9 and 0.90
are different. And that matters when the number is printed, say as an
annotation on a graph. Passing it through sprintf() or format() later
requires you to specify the number of digits after the decimal, which
is different than the number of significant digits, and requires case
testing for numbers of different orders of magnitude.
The original complainant (and I) expected this behavior from  
signif(),

not merely rounding. As I said before, I wrote my own workaround so
this is somewhat academic, but I don't think we're alone.
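
As an aside, the formatting gap described above can be illustrated with
sprintf's %g conversion, which counts significant digits (the '#' flag keeps
trailing zeros), versus %f and format(..., nsmall=), which count decimal
places and therefore depend on magnitude; none of this changes what signif()
returns:

sprintf("%.2g",  0.90)               # "0.9"  - significant digits, but the trailing zero is dropped
sprintf("%#.2g", 0.90)               # "0.90" - the '#' flag keeps the trailing zero
sprintf("%#.3g", 12345)              # switches to scientific notation for large magnitudes
format(signif(0.90, 2), nsmall = 2)  # "0.90" - but nsmall counts decimal places, so it depends on magnitude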

As far as I know, rounding is fine in Windows:

 round(1:10 + 0.5)
[1]  2  2  4  4  6  6  8  8 10 10


It might not be the rounding, then. (windows xp sp3)
   signif(12345,digits=4)
  [1] 12340
   signif(0.12345,digits=4)
  [1] 0.1235


It's easy to make mistakes in this, but a little outside-of-R  
experimentation suggests those are the right answers.  The number  
12345 is exactly representable, so it is exactly half-way between  
12340 and 12350, so 12340 is the right answer by the unbiased round- 
to-even rule.  The number 0.12345 is not exactly representable, but  
(I think) it is represented by something slightly closer to 0.1235  
than to 0.1234.  So it looks as though Windows gets it right.
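
The integer case is easy to check directly, since every value involved is
exactly representable and round() follows the IEC 60559 round-half-to-even
rule:

12345 - 12340 == 12350 - 12345   # TRUE: 12345 is exactly half-way between them
round(1234.5)                    # 1234 - half rounds to the even neighbour, down here
round(1235.5)                    # 1236 - and up here, again to the even neighbour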




OS X (10.5.2/intel) does not have that problem.


Which would seem to imply OS X gets it wrong.


This has nothing to do with OS X, you get that same answer on pretty  
much all other platforms (Intel/Linux, MIPS/IRIX, Sparc/Sun, ...).  
Windows is the only one delivering the incorrect result here.



 Both are supposed to be using the 64 bit floating point standard,  
so they should both give the same answer:


Should, yes, but Windows doesn't. In fact 10000.0 is exactly
representable and so is 1234.5, which is the correct result that all
except Windows get.


I think you skipped a step.  The correct answer is either 0.1234 or
0.1235, not something 10000 times bigger.  The first important question
is whether 0.12345 is exactly representable, and the answer is no.  The 
second question is whether it is represented by a number bigger or 
smaller than the real number 0.12345.  If it is bigger, the answer 
should be 0.1235, and if it is smaller, the answer is 0.1234.  My 
experiments suggest it is bigger.  Yours don't look relevant.  It 
certainly isn't exactly equal to 1234.5/10000, because that number is
not representable.  It's equal to x/2^y, for some x and y, and it's a 
pain to figure out exactly what they are.


However, I am pretty sure R is representing it (at least on Windows) as 
the binary expansion


0.0001100110100110101101011011001001

while the true binary expansion (using exact rational arithmetic) starts 
out


0.000110011010011010110101101100100011101...

If you line those up, you'll see that the first number is bigger than 
the second.  (Ugly code to derive these is down below.)


Clearly the top representation is the correct one to that number of 
binary digits, so I think Windows got it right, and all those other 
systems didn't.  This is probably because R on Windows is using extended 
precision (64 bit mantissas) for intermediate results, and those other 
systems stick with 53 bit mantissas.


Duncan Murdoch

# Convert number to binary expansion; add the decimal point manually

x <- 0.12345
while (x != 0) {
  cat(trunc(x))
  x <- x - trunc(x)
  x <- x * 2
}

# Do the same thing in exact rational arithmetic

num <- 12345
denom <- 100000          # 12345/100000 == 0.12345 exactly, as a rational number
for (i in 1:60) {
  cat(ifelse(num >= denom, 1, 0))
  num <- num %% denom
  num <- 2*num
}

# Manually cut and paste the results to get these:

0.0001100110100110101101011011001001
0.000110011010011010110101101100100011101
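
As an aside, the same comparison can be wrapped into two small helpers so the
expansions line up without the cut-and-paste step (bin_double() and
bin_exact() are illustrative names, with an arbitrary 60-bit cut-off):

bin_double <- function(x, bits = 60) {          # expansion of the stored double
  out <- numeric(0)
  for (i in 1:bits) { x <- 2 * x; out <- c(out, trunc(x)); x <- x - trunc(x) }
  paste(out, collapse = "")
}
bin_exact <- function(num, denom, bits = 60) {  # expansion of the exact fraction
  out <- numeric(0)
  for (i in 1:bits) { num <- 2 * num; out <- c(out, num %/% denom); num <- num %% denom }
  paste(out, collapse = "")
}
bin_double(0.12345)       # what the double actually stores
bin_exact(12345, 100000)  # what 0.12345 really is, bit for bit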

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] benchmarking R installations

2008-06-03 Thread Mark Kimpel
Dirk,

Thanks for the helpful reply. I agree with the concept of testing an app
that is commonly used. I suppose someone who uses R to interface a lot with a
database could have a very different experience than one who uses it mostly
for matrix manipulations.

Still, as a scientist I like to have some empirical evidence as to why I am
doing something, especially when it entails the extra headache of using a
compiler with which I am not familiar. As a non-computer-scientist,
however, I wasn't sure of the best way to find the answer.

Mark

On Tue, Jun 3, 2008 at 11:16 AM, Dirk Eddelbuettel [EMAIL PROTECTED] wrote:

 On Tue, Jun 03, 2008 at 11:09:28AM -0400, Mark Kimpel wrote:
  Dirk,
 
  At the moment, our emphasis is getting an installation that will run Rmpi
 in
  batch mode. I imagine my sysadmin put that line in to minimize potential
  problems. To be honest, I didn't catch it, I was just glad to get a
 compile
  :)
 
  As to the line that doesn't apply to the R install, I think you are
  referring to the --with-mpi, which I think he also slipped in not
 realizing
  it should more properly go with the Rmpi install.

 Right. I wasn't implying it would do harm.

 As for the x11 choice: I prefer to keep the change vectors
 minimal. Otherwise you go nuts trying to debug things.

 But it's good to see that you now understand that you have to start
 'at the top' with Intel icc, and once you have a working R you
 can start working towards packages.  It's like a port to a different
 platform as you completely switch the toolchain.

 Good luck and keep us posted. I have 'looking at icc' on the TODO list
 too (for my C++ code, though; given R's interactive nature I think
 there are lower-hanging fruits elsewhere...)

 Lastly, as to the benchmarking: It's difficult. Ripley once snarled
 that it is probably 'the application you want to run the most', so
 there you go ...

 Dirk

 
  Mark
 
  On Tue, Jun 3, 2008 at 12:18 AM, Dirk Eddelbuettel [EMAIL PROTECTED]
 wrote:
 
   On Mon, Jun 02, 2008 at 11:56:16PM -0400, Mark Kimpel wrote:
../configure --prefix=/N/u/mkimpel/R_HOME/R-patched/R-build \
--with-system-zlib=/usr/lib64 --with-system-bzlib=/usr/lib64 \
--with-mpi=/N/soft/linux-rhel4-x86_64/openmpi/1.2.5/intel-64
 --with-tcltk
   \
  
   There is no such option for R's configure.
  
   
 --with-tcl-config=/N/soft/linux-rhel4-x86_64/tcl8.4.16/lib64/tclConfig.sh
   \
   
 --with-tk-config=/N/soft/linux-rhel4-x86_64/tk8.4.16/lib64/tkConfig.sh \
--without-x --without-readline --without-iconv \
  
   So you never want your compute cluster to be able to do an interactive
   plot under x11 ?
  
   Dirk
  
   --
   Three out of two people have difficulties with fractions.
  
 
 
 
  --
  Mark W. Kimpel MD  ** Neuroinformatics ** Dept. of Psychiatry
  Indiana University School of Medicine
 
  15032 Hunter Court, Westfield, IN  46074
 
  (317) 490-5129 Work,  Mobile  VoiceMail
  (317) 663-0513 Home (no voice mail please)
 
  **

 --
 Three out of two people have difficulties with fractions.




-- 
Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
Indiana University School of Medicine

15032 Hunter Court, Westfield, IN 46074

(317) 490-5129 Work,  Mobile  VoiceMail
(317) 663-0513 Home (no voice mail please)

**

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] benchmarking R installations

2008-06-03 Thread Martin Maechler
 SU == Simon Urbanek [EMAIL PROTECTED]
 on Tue, 3 Jun 2008 11:52:14 -0400 writes:

SU On Jun 3, 2008, at 3:58 AM, Ludo Pagie wrote:

 recently there was a post on R-help/Rd ?? with this link
 on benchmarking different 'number crunching
 packages'. They used a series of tests, although I didn't
 check they used all the types you mentioned. I couldn't
  find test code at first glance but maybe it is available
 on request???
 
 http://www.sciviews.org/benchmark/
 

SU It's quite outdated and doesn't work with the current R versions, 

Yes, that was the topic of the recent R-help post mentioned
above.  And because of that I did post an updated version back to the
list.
Here's the posting, as found on http://search.r-project.org :
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/130270.html
and the R script here
   
https://stat.ethz.ch/pipermail/r-help/attachments/20080514/0ccea72b/attachment.pl

SU but I have an updated version that works. I have put some
SU benchmarks I'm aware of at

SU http://r.research.att.com/benchmarks/

That's cool!  Thanks, Simon!
Martin

 On Mon, 2 Jun 2008, Mark Kimpel wrote:
 
 Recently I posted to this list with a question about
 using the Intel 10.1 compilers in building R and one
 response was basically, why in the heck would you want
 to do that? The answer is that my sysadmin believes
 that there will be a performance boost with the Intel
 vs. Gnu compilers on our Linux cluster, of which I am
 one of many users. Wanting to be a good citizen and use
 my machine time wisely, I'd of course like to use right
 tool to build the most efficient installation of R and
 associated packages.  BTW, we got R to compile nicely
 using the settings at the end of this post.
 
 Looking back on previous posts, however, it seems that
 there is no consensus as to how to benchmark R. I
 realize such a task is not trivial, nor controversial,
 but does anyone have a set of time-consuming tasks that
 can be used to compare R installations? It would seem
 logical that such a benchmark would include
 sub-benchmarks on file access, interpreted intensive
 tasks, C intensive tasks, BLAS intensive tasks, etc. You
 developers know more about this than I do, but I know
 enough to realize that there won't be one simple
 answer. Nevertheless, I'd like to make my usage
  decisions on something rather than anecdotal claims.
 
 So, does anyone know of a good benchmarking script or
 would be willing to contribute one?
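
As a toy illustration of the kind of sub-benchmarks meant above (the sizes
and tasks are arbitrary; the benchmark scripts linked elsewhere in this
thread are far more complete):

n <- 1000
m <- matrix(rnorm(n * n), n, n)
blas   <- system.time(m %*% m)                          # BLAS-bound
interp <- system.time(for (i in 1:2e5) y <- sqrt(i))    # interpreter-bound
io     <- system.time({ f <- tempfile(); write.csv(m, f); read.csv(f); unlink(f) })  # file access
rbind(blas = blas, interp = interp, io = io)[, 1:3]     # user / system / elapsed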
 
 And here are the settings we used to compile R with
 Intel 10.1 compilers:
 
  ../configure --prefix=/N/u/mkimpel/R_HOME/R-patched/R-build \
    --with-system-zlib=/usr/lib64 --with-system-bzlib=/usr/lib64 \
    --with-mpi=/N/soft/linux-rhel4-x86_64/openmpi/1.2.5/intel-64 --with-tcltk \
    --with-tcl-config=/N/soft/linux-rhel4-x86_64/tcl8.4.16/lib64/tclConfig.sh \
    --with-tk-config=/N/soft/linux-rhel4-x86_64/tk8.4.16/lib64/tkConfig.sh \
    --without-x --without-readline --without-iconv \
    CC=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icc \
    CFLAGS="-O3 -no-prec-div -unroll" \
    F77=/N/soft/linux-rhel4-x86_64/intel/fce/10.1.013/bin/ifort \
    FFLAGS="-O3 -no-prec-div -unroll" \
    CXX=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icpc \
    CXXFLAGS="-O3 -no-prec-div -unroll" \
    FC=/N/soft/linux-rhel4-x86_64/intel/fce/10.1.013/bin/ifort \
    FCFLAGS="-O3 -no-prec-div -unroll" \
    OBJC=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icc \
    OBJCFLAGS="-O3 -no-prec-div -unroll" \
    --disable-R-profiling --disable-memory-profiling
  ##
  make all
  make install
 
 Mark
 
 -- 
 Mark W. Kimpel MD ** Neuroinformatics ** Dept. of
 Psychiatry Indiana University School of Medicine
 
 15032 Hunter Court, Westfield, IN 46074
 
 (317) 490-5129 Work,  Mobile  VoiceMail (317) 663-0513
 Home (no voice mail please)
 
 **
 
 [[alternative HTML version deleted]]
 
 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel
 
 
 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel
 
 

__
SU R-devel@r-project.org mailing list
SU https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] savePlot() no longer automatically adds an extension to the filename.

2008-06-03 Thread Duncan Murdoch

On 03/06/2008 2:35 PM, Mike Prager wrote:

S Ellison [EMAIL PROTECTED] wrote:

Plaintive squeak: Why the change? 


Some OS's and desktops use the extension, so forgetting it causes
trouble. The new default filename keeps a filetype (as before) but the
user now has to type a filetype twice (once as the type, once as
extension) to get the same effect for their own filenames. And the
extension isn't then checked for consistency with valid file types, so
it can be mistyped and saved with no warning. Hard to see the advantage
of doing away with it...


Just for the record. . .

This change broke a *lot* of my code, including code used by
others.  Windows depends on file extensions.  Fortunately, fixes
using getRversion are not too difficult.
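
The kind of getRversion() guard meant here, sketched for a hypothetical png
wrapper (the cut-over version and helper name are illustrative only):

save_png <- function(file) {
  # From 2.7.0 the extension is no longer appended automatically, so add it ourselves
  if (getRversion() >= "2.7.0" && length(grep(".", file, fixed = TRUE)) == 0)
    file <- paste(file, "png", sep = ".")
  savePlot(filename = file, type = "png")
}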



Then you'll be happy to hear that Steve put together a patch and it's 
already committed, so it should make it into 2.7.1.  The patch adds the 
extension if there's no dot in the name, leaves the filename as-is if it 
sees one.  So this should be compatible with the majority of uses, only 
messing up cases where people really don't want an extension (now 
they'll have to add a dot at the end of their filename), or where they 
want an automatic one, but have another dot in the name.
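
In other words, the committed rule behaves roughly like this sketch
(illustrative only, not the actual patch code):

add_ext_if_missing <- function(filename, type) {
  # append the type as extension only when the name contains no dot at all
  if (!grepl(".", filename, fixed = TRUE))
    filename <- paste(filename, type, sep = ".")
  filename
}
add_ext_if_missing("myplot", "png")      # "myplot.png" - extension added
add_ext_if_missing("myplot.png", "png")  # "myplot.png" - left alone
add_ext_if_missing("v1.2-plot", "png")   # "v1.2-plot"  - the corner case just mentioned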


Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] benchmarking R installations

2008-06-03 Thread Mark Kimpel
Thanks, once I get my new Intel installation running in tandem with a
gcc one, I'll report back. mark

On Tue, Jun 3, 2008 at 7:04 PM, Martin Maechler
[EMAIL PROTECTED] wrote:
 SU == Simon Urbanek [EMAIL PROTECTED]
 on Tue, 3 Jun 2008 11:52:14 -0400 writes:

SU On Jun 3, 2008, at 3:58 AM, Ludo Pagie wrote:

 recently there was a post on R-help/Rd ?? with this link
 on benchmarking different 'number crunching
 packages'. They used a series of tests, although I didn't
 check they used all the types you mentioned. I couldn't
  find test code at first glance but maybe it is available
 on request???

 http://www.sciviews.org/benchmark/


SU It's quite outdated and doesn't work with the current R versions,

 Yes, that was the topic of the recent R-help post mentioned
 above.  And because of that I did post an updated version back to the
 list.
 Here's the posting, as found on http://search.r-project.org :
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/130270.html
 and the R script here
   
 https://stat.ethz.ch/pipermail/r-help/attachments/20080514/0ccea72b/attachment.pl

SU but I have an updated version that works. I have put some
SU benchmarks I'm aware of at

SU http://r.research.att.com/benchmarks/

 That's cool!  Thanks, Simon!
 Martin

 On Mon, 2 Jun 2008, Mark Kimpel wrote:

 Recently I posted to this list with a question about
 using the Intel 10.1 compilers in building R and one
 response was basically, why in the heck would you want
 to do that? The answer is that my sysadmin believes
 that there will be a performance boost with the Intel
 vs. Gnu compilers on our Linux cluster, of which I am
 one of many users. Wanting to be a good citizen and use
 my machine time wisely, I'd of course like to use right
 tool to build the most efficient installation of R and
 associated packages.  BTW, we got R to compile nicely
 using the settings at the end of this post.

 Looking back on previous posts, however, it seems that
 there is no consensus as to how to benchmark R. I
 realize such a task is not trivial, nor controversial,
 but does anyone have a set of time-consuming tasks that
 can be used to compare R installations? It would seem
 logical that such a benchmark would include
 sub-benchmarks on file access, interpreted intensive
 tasks, C intensive tasks, BLAS intensive tasks, etc. You
 developers know more about this than I do, but I know
 enough to realize that there won't be one simple
 answer. Nevertheless, I'd like to make my usage
  decisions on something rather than anecdotal claims.

 So, does anyone know of a good benchmarking script or
 would be willing to contribute one?

 And here are the settings we used to compile R with
 Intel 10.1 compilers:

  ../configure --prefix=/N/u/mkimpel/R_HOME/R-patched/R-build \
    --with-system-zlib=/usr/lib64 --with-system-bzlib=/usr/lib64 \
    --with-mpi=/N/soft/linux-rhel4-x86_64/openmpi/1.2.5/intel-64 --with-tcltk \
    --with-tcl-config=/N/soft/linux-rhel4-x86_64/tcl8.4.16/lib64/tclConfig.sh \
    --with-tk-config=/N/soft/linux-rhel4-x86_64/tk8.4.16/lib64/tkConfig.sh \
    --without-x --without-readline --without-iconv \
    CC=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icc \
    CFLAGS="-O3 -no-prec-div -unroll" \
    F77=/N/soft/linux-rhel4-x86_64/intel/fce/10.1.013/bin/ifort \
    FFLAGS="-O3 -no-prec-div -unroll" \
    CXX=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icpc \
    CXXFLAGS="-O3 -no-prec-div -unroll" \
    FC=/N/soft/linux-rhel4-x86_64/intel/fce/10.1.013/bin/ifort \
    FCFLAGS="-O3 -no-prec-div -unroll" \
    OBJC=/N/soft/linux-rhel4-x86_64/intel/cce/10.1.013/bin/icc \
    OBJCFLAGS="-O3 -no-prec-div -unroll" \
    --disable-R-profiling --disable-memory-profiling
  ##
  make all
  make install

 Mark

 --
 Mark W. Kimpel MD ** Neuroinformatics ** Dept. of
 Psychiatry Indiana University School of Medicine

 15032 Hunter Court, Westfield, IN 46074

 (317) 490-5129 Work,  Mobile  VoiceMail (317) 663-0513
 Home (no voice mail please)

 **

 [[alternative HTML version deleted]]

 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel


 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel



 __
SU R-devel@r-project.org mailing list
SU https://stat.ethz.ch/mailman/listinfo/r-devel

 __
 R-devel@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-devel




-- 
Mark W. Kimpel MD ** 

Re: [Rd] savePlot() no longer automatically adds an extension to the filename.

2008-06-03 Thread Simon Urbanek


On Jun 3, 2008, at 8:40 PM, Duncan Murdoch wrote:


On 03/06/2008 2:35 PM, Mike Prager wrote:

S Ellison [EMAIL PROTECTED] wrote:

Plaintive squeak: Why the change?
Some OS's and desktops use the extension, so forgetting it causes
trouble. The new default filename keeps a filetype (as before) but  
the

user now has to type a filetype twice (once as the type, once as
extension) to get the same effect for their own filenames. And the
extension isn't then checked for consistency with valid file  
types, so
it can be mistyped and saved with no warning. Hard to see the  
advantage

of doing away with it...

Just for the record. . .
This change broke a *lot* of my code, including code used by
others.  Windows depends on file extensions.  Fortunately, fixes
using getRversion are not too difficult.


Then you'll be happy to hear that Steve put together a patch and  
it's already committed, so it should make it into 2.7.1.  The patch  
adds the extension if there's no dot in the name, leaves the  
filename as-is if it sees one.  So this should be compatible with  
the majority of uses, only messing up cases where people really  
don't want an extension (now they'll have to add a dot at the end of  
their filename), or where they want an automatic one, but have  
another dot in the name.




AFAICS the savePlot() behavior is now (as of r45830) inconsistent  
across platforms due to the patch (r458229). The inconsistency is IMHO  
a bad thing - you shouldn't expect the same function to behave  
differently across platforms.


I'd strongly recommend against this change for several reasons: it  
changes the behavior of the function between 2.7.0 and 2.7.1, so that  
now you have to special-case three different versions (pre 2.7.0,  
2.7.0 and 2.7.1), there is now no way to specify a file without a dot  
(which is quite common in non-Windows world) and the behavior is  
incompatible with other similar functions.


I think the change of behavior in 2.7.0 was deliberate and in favor of  
consistency, because a filename specification should not be randomly  
mangled by the function (I have made that mistake myself before, so I  
know the pitfalls ;)). Extension is part of the filename, it's not a  
separate concept (also note that .foo is a valid  filename that  
doesn't have an extension). The argument about typos is moot since you  
can always define functions like
saveFoo <- function(prefix) savePlot(filename = paste(prefix, "foo",
sep = "."), type = "foo")
At any rate I don't see how this can realistically be part of 2.7.1  
since it's not a bugfix and it changes the meaning of a function  
parameter. (And I usually don't mind disguising small features as  
bugfixes ;P)


Whether the change in 2.7.0 could be done differently (e.g. using  
another parameter for a full file name) is a different story, but I  
suspect that it should have been discussed before the 2.7.0 release...


Cheers,
Simon

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] savePlot() no longer automatically adds an extension to the filename.

2008-06-03 Thread Gabor Grothendieck
On Tue, Jun 3, 2008 at 10:36 PM, Simon Urbanek
[EMAIL PROTECTED] wrote:

 On Jun 3, 2008, at 8:40 PM, Duncan Murdoch wrote:

 On 03/06/2008 2:35 PM, Mike Prager wrote:

 S Ellison [EMAIL PROTECTED] wrote:

 Plaintive squeak: Why the change?
 Some OS's and desktops use the extension, so forgetting it causes
 trouble. The new default filename keeps a filetype (as before) but the
 user now has to type a filetype twice (once as the type, once as
  extension) to get the same effect for their own filenames. And the
 extension isn't then checked for consistency with valid file types, so
 it can be mistyped and saved with no warning. Hard to see the advantage
 of doing away with it...

 Just for the record. . .
 This change broke a *lot* of my code, including code used by
 others.  Windows depends on file extensions.  Fortunately, fixes
 using getRversion are not too difficult.

 Then you'll be happy to hear that Steve put together a patch and it's
 already committed, so it should make it into 2.7.1.  The patch adds the
 extension if there's no dot in the name, leaves the filename as-is if it
 sees one.  So this should be compatible with the majority of uses, only
 messing up cases where people really don't want an extension (now they'll
 have to add a dot at the end of their filename), or where they want an
 automatic one, but have another dot in the name.


 AFAICS the savePlot() behavior is now (as of r45830) inconsistent across
 platforms due to the patch (r458229). The inconsistency is IMHO a bad thing
 - you shouldn't expect the same function to behave differently across
 platforms.

 I'd strongly recommend against this change for several reasons: it changes
 the behavior of the function between 2.7.0 and 2.7.1, so that now you have
 to special-case three different versions (pre 2.7.0, 2.7.0 and 2.7.1), there
 is now no way to specify a file without a dot (which is quite common in
 non-Windows world) and the behavior is incompatible with other similar
 functions.

 I think the change of behavior in 2.7.0 was deliberate and in favor of
 consistency, because a filename specification should not be randomly mangled
 by the function (I have made that mistake myself before, so I know the
 pitfalls ;)). Extension is part of the filename, it's not a separate concept
 (also note that .foo is a valid  filename that doesn't have an extension).
 The argument about typos is moot since you can always define functions like
  saveFoo <- function(prefix) savePlot(filename = paste(prefix, "foo",
  sep = "."), type = "foo")
 At any rate I don't see how this can realistically be part of 2.7.1 since
 it's not a bugfix and it changes the meaning of a function parameter. (And I
 usually don't mind disguising small features as bugfixes ;P)

 Whether the change in 2.7.0 could be done differently (e.g. using another
 parameter for a full file name) is a different story, but I suspect that it
 should have been discussed before the 2.7.0 release...


One way to fix this so that the filename is a complete name is to derive
the default type from the filename rather than the default filename
from the type.

That is, if type is not specified and the last four characters of the file name
are .wmf, .jpg, etc., then the type would be set to that.
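
Something along these lines, as a hypothetical helper rather than a proposed
patch:

default_type_from_name <- function(filename, type = NULL,
                                   known = c("wmf", "emf", "png", "jpg", "jpeg",
                                             "bmp", "ps", "eps", "pdf")) {
  if (!is.null(type)) return(type)               # an explicit type always wins
  ext <- tolower(sub(".*\\.", "", basename(filename)))
  if (ext %in% known) ext
  else stop("cannot infer plot type from '", filename, "'")
}
default_type_from_name("fig1.PNG")            # "png"
default_type_from_name("fig1", type = "wmf")  # "wmf"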

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel