Re: [Rd] readBin differences on Windows and Linux/mac

2008-01-01 Thread Uwe Ligges
I see. It is either a bug or something related to the following 
paragraph from ?seek:

  We have found so many errors in the Windows implementation of file
  positioning that users are advised to use it only at their own
  risk, and asked not to waste the R developers' time with bug
  reports on Windows' deficiencies.

I will investigate more closely when I am back in the office at the end of this week.

Best,
Uwe




Sean Davis wrote:
 Sorry, Uwe.  Of course:
 
 Both in relatively recent R-devel (one mac, one windows):
 
 ### gunzip pulled from R.utils to be a simple function
 ### In R.utils, implemented as a method
 gunzip <- function(filename, destname=gsub("[.]gz$", "", filename),
                    overwrite=FALSE, remove=TRUE, BFR.SIZE=1e7) {
   if (filename == destname)
     stop(sprintf("Argument 'filename' and 'destname' are identical: %s",
                  filename));
   if (!overwrite && file.exists(destname))
     stop(sprintf("File already exists: %s", destname));
 
   # Read the compressed input through a gzfile connection in binary mode
   inn <- gzfile(filename, "rb");
   on.exit(if (!is.null(inn)) close(inn));
 
   out <- file(destname, "wb");
   on.exit(close(out), add=TRUE);
 
   # Copy the decompressed bytes in chunks of at most BFR.SIZE
   nbytes <- 0;
   repeat {
     bfr <- readBin(inn, what=raw(0), size=1, n=BFR.SIZE);
     n <- length(bfr);
     if (n == 0)
       break;
     nbytes <- nbytes + n;
     writeBin(bfr, con=out, size=1);
   };
 
   if (remove) {
     close(inn);
     inn <- NULL;
     file.remove(filename);
   }
 
   invisible(nbytes);
 }
 download.file('ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE1/GSE1_series_matrix.txt.gz',
               'test.txt.gz')
 gunzip('test.txt.gz')
 
 Under windows, this results in the error reported below.  Under mac and 
 linux, results in test.txt being created in the current working 
 directory.  The actual gunzip function is pretty bare bones, so I don't 
 think it complicates matters much to use it in this example. 
 
 Sean
 
 
 On Dec 31, 2007 1:24 PM, Uwe Ligges [EMAIL PROTECTED] wrote:
 
 Can you give a reproducible example, please?
 
 Uwe Ligges
 
 
 Sean Davis wrote:
   I have been trying to use the gunzip function in the R.utils package.
   It opens a connection to a gzfile, uses readBin to read from that
   connection, and then uses writeBin to write out the raw data to a new
   file.  This works as expected under linux/mac, but under Windows, I get:
  
   Error in readBin(inn, what= raw(0), size = 1, n=BFR.SIZE)  :
     negative length vectors are not allowed
  
   A simple traceback shows the error in readBin.  I wouldn't be surprised
   if this is a programming issue not located in readBin, but I am confused
   about the difference in behaviors on Windows versus mac/linux.  Any
   insight into what I can do to remedy the issue and have a cross-platform
   gunzip()?
  
   Thanks,
   Sean
  
 


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] readBin differences on Windows and Linux/mac

2008-01-01 Thread Henrik Bengtsson
Also make sure the problem is not due to downloading a gzip file in
text mode, because to the best of my understanding that is platform
dependent.  That is, use download.file(..., mode="wb") instead of the
default, which is mode="w".  (This is such a common error that I would
like to suggest mode="wb" to become the default.)
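
To make the failure mode concrete, here is a minimal sketch of what a text-mode write can do to binary data on Windows, assuming the usual LF to CR LF translation. The helper crlf_translate() and the sample bytes are made up for illustration; the only real fix is to pass mode="wb" to download.file().

```r
# Simulate a text-mode write on Windows: every LF byte (0x0a) is
# written out as CR LF (0x0d 0x0a). Hypothetical helper, for illustration.
crlf_translate <- function(bytes) {
  is_lf <- bytes == as.raw(0x0a)
  # Each LF expands to two bytes; every other byte stays one byte.
  out <- raw(length(bytes) + sum(is_lf))
  pos <- seq_along(bytes) + cumsum(is_lf)  # final position of each original byte
  out[pos] <- bytes
  out[pos[is_lf] - 1L] <- as.raw(0x0d)     # CR goes just before each LF
  out
}

# Made-up bytes, not a real gzip stream.
payload <- as.raw(c(0x1f, 0x8b, 0x0a, 0x00, 0x0a, 0xff))
damaged <- crlf_translate(payload)

length(payload)  # 6
length(damaged)  # 8: two LF bytes each gained a CR

# The actual remedy (not run here, needs network access):
# download.file(url, destfile, mode = "wb")
```

A compressed stream contains 0x0a bytes at arbitrary positions, so the damaged file is larger than the original and no longer decompresses cleanly.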

/Henrik



Re: [Rd] readBin differences on Windows and Linux/mac

2008-01-01 Thread Henrik Bengtsson
On 01/01/2008, Henrik Bengtsson [EMAIL PROTECTED] wrote:
 Also make sure the problem is not due to downloading a gzip file in
 text mode, because to the best of my understanding that is platform
 dependent.  That is, use download.file(..., mode="wb") instead of the
 default, which is mode="w".  (This is such a common error that I would
 like to suggest mode="wb" to become the default.)

Ok, that solves the problem with your example file.  On WinXP/R v2.6.1:

> library(R.utils)
> uri <- "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE1/GSE1_series_matrix.txt.gz";

> download.file(uri, "test.txt.gz")  # mode="w"
trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/SeriesMatrix/GSE1/GSE1_series_matrix.txt.gz'
ftp data connection made, file length 918804 bytes
opened URL
downloaded 897 Kb
> file.info("test.txt.gz")$size
[1] 922243

> download.file(uri, "test2.txt.gz", mode="wb")
ftp data connection made, file length 918804 bytes
opened URL
downloaded 897 Kb
> file.info("test2.txt.gz")$size
[1] 918804

> gunzip("test.txt.gz")
Error in readBin(inn, what = raw(0), size = 1, n = BFR.SIZE) :
  negative length vectors are not allowed
> gunzip("test2.txt.gz")
> file.info("test2.txt")$size
[1] 3338362
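
The size comparison in the session above can be turned into a small pre-flight check before decompressing. This is only a sketch: size_matches() and is_gzip() are hypothetical helpers, not part of R.utils. Note that the magic-byte test only catches files that are not gzip at all; a text-mode download usually leaves the header intact, so the size check is the one that mirrors the transcript.

```r
# TRUE if the file on disk has exactly the number of bytes the server reported.
size_matches <- function(path, expected_bytes) {
  actual <- file.info(path)$size
  !is.na(actual) && actual == expected_bytes
}

# TRUE if the file starts with the two gzip magic bytes 0x1f 0x8b.
is_gzip <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  magic <- readBin(con, what = raw(), n = 2L)
  identical(magic, as.raw(c(0x1f, 0x8b)))
}

# Build small local files so the sketch runs without network access.
gz <- tempfile(fileext = ".gz")
con <- gzfile(gz, "wb")
writeBin(charToRaw("hello\n"), con)
close(con)

bin <- tempfile()
writeBin(as.raw(1:10), bin)

is_gzip(gz)            # TRUE
is_gzip(bin)           # FALSE
size_matches(bin, 10)  # TRUE
size_matches(bin, 12)  # FALSE
```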

/H




Re: [Rd] Problem with dyn.load'ed code

2008-01-01 Thread Matt Calder
Andrew,
Thanks! The version script worked like a charm. Specifically I now
build using:

g++ -shared -Wl,--version-script=ver.map to_dyn_load.cc -o to_dyn_load.so -larpack

where ver.map is the file:

{
 global: R_func_*;
 local:*;
};

and any function I want exported to R is named R_func_*. This is going
to be my new SOP. 
Thanks again Andrew, and also Simon. I greatly appreciate you taking
your time to solve this problem for me.
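
For anyone who wants to generate the map file from R as part of a build script, the following is a rough sketch. The helper write_version_script() is hypothetical; the file contents and the g++ command are the ones quoted above, and the command is only assembled as a string, not executed.

```r
# Hypothetical helper: write a GNU ld version script that exports only
# symbols matching `pattern` and keeps everything else local.
write_version_script <- function(path, pattern = "R_func_*") {
  writeLines(c("{",
               sprintf("  global: %s;", pattern),
               "  local: *;",
               "};"), path)
  invisible(path)
}

map <- write_version_script(tempfile(fileext = ".map"))
cat(readLines(map), sep = "\n")

# The compile command from this message, assembled but not run here.
cmd <- sprintf(
  "g++ -shared -Wl,--version-script=%s to_dyn_load.cc -o to_dyn_load.so -larpack",
  shQuote(map))
cmd
```

After dyn.load("to_dyn_load.so"), is.loaded() on one of the R_func_* names is a quick way to confirm that the intended entry points are visible.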

Matt


On Mon, 2007-12-31 at 15:30 -0500, Andrew Piskorski wrote:
 On Sun, Dec 30, 2007 at 10:43:50PM -0500, Matt Calder wrote:
  Simon,
  Thanks for the reply. Indeed, declaring the function static fixes the
  example. Unfortunately, the real problem that gave rise to the example
  arises in a large Fortran library that is not under my control (ARPACK).
  The author is providing BLAS and LAPACK functionality intentionally.
  That may or may not be good practice, but it is a given in this case.
 
 Ok, so R is calling its own dnrm2_ function, let's call this A,
 while ARPACK defines a second, different dnrm2_, which we'll call
 B.  You want to call function B from your own C code, while R keeps
 calling function A as before without any change or interference.  And
 of course, A and B are two C-coded functions with different behaviors
 but the exact same name.  You can make that work, it just requires
 some tricks.
 
  I still feel like the linker ought to be able to solve this problem for
  me. My impression was that the static keyword passed to the linker
 
 It can, you just need to tell it exactly what you want.  I assume you
 are building your own custom C code into a shared library, which you
 then load into R.
 
 Thus, one solution is to statically link the ARPACK library into your
 own shared library, and then carefully tell the linker which symbols
 to export and which to keep private inside your shared library.  As
 long as ARPACK's dnrm2_ symbol (B) is kept private inside
 your own shared library (not exported), R will never see it and will
 happily keep using dnrm2_ A as before.
 
 That's how I've solved this sort of name collision problem in the
 past.  In your src/Makevars, you may want something like this:
 
   PKG_LIBS = -Wl,--version-script=vis.map -Wl,-Bstatic -L/usr/local/lib/ARPACK -lARPACK -Wl,-Bdynamic
 
 You may also need a PG_PKG_LIBS with the same stuff, but I don't
 remember why.  The '--version-script=' and related matters were also
 discussed here back in February:
 
   https://stat.ethz.ch/pipermail/r-devel/2007-February/044531.html




Re: [Rd] readBin differences on Windows and Linux/mac

2008-01-01 Thread Uwe Ligges
Thank you, Henrik! This saves us a lot of time!

Uwe




Re: [Rd] readBin differences on Windows and Linux/mac

2008-01-01 Thread Sean Davis
On Jan 1, 2008 12:36 PM, Uwe Ligges [EMAIL PROTECTED]
wrote:

 Thank you, Henrik! This saves us a lot of time!

 Uwe




Great!!  Thanks, Henrik and Uwe.

Sean



Re: [Rd] Problem with dyn.load'ed code

2008-01-01 Thread Simon Urbanek
Matt,

On Jan 1, 2008, at 12:08 PM, Matt Calder wrote:

 Andrew,
   Thanks! The version script worked like a charm. Specifically I now
 build using:

 g++ -shared -Wl,--version-script=ver.map to_dyn_load.cc -o to_dyn_load.so -larpack


Just a word of warning: this is in no way portable. It is probably ok  
for your private compilation, but it won't work in general. In any  
case, I'd strongly recommend enabling this hack only if you know that  
the target system supports it for the sake of portability. However, if  
you are concerned about the latter, you should probably have a look at  
libtool and --export-symbols - it allows you to control the visibility  
on most systems that support it.
Note that there are systems that don't support it at all.

Cheers,
Simon




[Rd] Wish List

2008-01-01 Thread Gabor Grothendieck
Most of the items on this list have been mentioned before but it
may be useful to see them all together and at any rate every year
I have posted my R wishlist at the beginning of the year.

High priority items pertain to the foundations of R (promises,
environments) since those form the basis of everything
else and the foundation needs to be looked after first.

The medium items are focused on scripting since with a few additional
features R could work more smoothly with other software.

For the Low priority items we listed the rest.  They are not necessarily
low in terms of desirability but I wanted to focus the high and
medium items on foundations and scripting.

There is also a section at the end focusing on addon packages.
These may not be, strictly speaking, part of R but are widely used.

High

1. Some way of inspecting promises.  It is possible to get
the expression associated with a promise using substitute but
not its environment.  Also need a way to copy a promise without
forcing it.  See:
https://stat.ethz.ch/pipermail/r-devel/2007-September/046966.html

2. Fix bug when promises are stored in lists:

f <- function(x) environment()
as.list(f(0))$x == 0 # gives error.  Should be TRUE.

3. If a package uses LazyLoad: true then R changes the class of
certain top level objects.  This does not occur if LazyLoad: false
is used.  For an example see:
https://stat.ethz.ch/pipermail/r-devel/2007-October/047118.html

4. If two environment variables point to the same environment they
cannot have different attributes.  This effectively thwarts subclassing
of environments (contrary to OO principles).
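
Items 1 and 4 can be illustrated with a few lines of base R. This is a sketch only; the wrapper classes at the end are an assumed workaround for illustration, not something proposed in this post.

```r
# Item 1: substitute() recovers the unevaluated expression of a promise,
# but offers no way to ask which environment it would be evaluated in.
show_expr <- function(x) substitute(x)
expr <- show_expr(a + b)   # a and b need not exist; the promise is never forced
expr

# Item 4: environments are reference objects, so an attribute set through
# one name is visible through every other name bound to the same environment.
e1 <- new.env()
e2 <- e1
attr(e1, "tag") <- "first"
attr(e2, "tag")            # "first"

# Assumed workaround: wrap the environment in a list and put the class
# on the wrapper, so two wrappers can differ while sharing one environment.
w1 <- structure(list(env = e1), class = "wrapperA")
w2 <- structure(list(env = e1), class = "wrapperB")
identical(w1$env, w2$env)  # TRUE
class(w1)
class(w2)
```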

Medium

5. Sweave. A common scenario is spawning a Sweave job from
another program (such as from a program controlling a
web site).  The caller needs to pass some information to the
Sweave program such as the file name of a report to produce.
It's possible to spawn R and have R spawn Sweave but given the
existence of R CMD Sweave it would be nice to be able to just
spawn R CMD Sweave directly.  Features that would help here
would be:

- support --args or some other method of passing arguments
  from R CMD Sweave line to the Sweave script

- have a facility whereby R CMD Sweave can directly generate
  the .pdf file and an argument which allows the caller to
  define the name of the resulting pdf file, e.g. -o.  (With
  automated reports one may need to have many different outputs
  from the same Rnw file so it's important to name them differently.)

- an -x argument similar to Perl/Python/Ruby such that if one calls
  R CMD Sweave -x abc myfile.Rnw then all lines up to the first one
  matching the indicated regexp, abc here, are skipped.  This
  facilitates combining the script with a shell or batch file if the
  previous is not enough.

Thus one could spawn this from their program:
R CMD Sweave --pdf myfile.Rnw -o myfile-123.pdf --args 23
and it would generate a pdf file from myfile.Rnw of the
indicated name passing 23 as arg1 to the R code embedded in the
Sweave file.

See:
https://stat.ethz.ch/pipermail/r-devel/2007-October/047195.html
https://stat.ethz.ch/pipermail/r-help/2007-December/148091.html

6. -x flag on Rscript as in perl/python/ruby.  Useful for combining batch
   and R file into a single file on non-UNIX systems.  It would cause all
   lines until a line starting with #!Rscript to be skipped by the R
   processor.  See:
   https://www.stat.math.ethz.ch/pipermail/r-devel/2007-January/044433.html
   Also see
   http://www.datafocus.com/docs/perl/pod/perlwin32.asp#running_perl_scripts
   since the same considerations as for Perl scripts apply.
   There is also some discussion here:
   https://stat.ethz.ch/pipermail/r-help/2007-November/145279.html
   https://stat.ethz.ch/pipermail/r-help/2007-November/145301.html
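
The behaviour requested for -x in items 5 and 6 can be emulated in plain R, which may clarify what is being asked for. In this sketch skip_to_marker() is a hypothetical helper and the batch preamble is made up; no -x flag exists in Rscript or R CMD Sweave.

```r
# Hypothetical helper emulating the proposed -x flag: drop every line
# before the first one that matches `pattern` and keep the rest.
skip_to_marker <- function(lines, pattern = "^#!.*Rscript") {
  hit <- grep(pattern, lines)
  if (length(hit) == 0L) stop("no line matches the marker pattern")
  lines[hit[1L]:length(lines)]
}

# A combined batch/R file as it might look on Windows.
script <- c("@echo off",
            "Rscript -x %0 %*",   # made-up preamble; the flag is only a wish
            "goto :eof",
            "#!Rscript",
            "x <- 6 * 7",
            "x")

r_code <- skip_to_marker(script)
result <- eval(parse(text = r_code))  # the #! line parses as a comment
result  # 42
```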

Low

7. Define Lag <- function(x, k = 1, ...) lag(x, -k, ...)

   so the user has his choice of which orientation he prefers.
   Many packages could make use of it if it were in the core of R including
   zoo, dyn, dynlm, fame and others.  This would also address comments
   such as in ISSUE 4 on this page which is associated with a popular
   book on time series:
   http://www.stat.pitt.edu/stoffer/tsa2/Rissues.htm

8. On Windows, package build tools should check that Cygwin is in
correct position on PATH and issue meaningful error if not.  If
you get this wrong currently it's quite hard to diagnose unless
you know about it.

9. Implement the R shell() and shell.exec() commands on
non-Windows systems.

10. print.function should be improved to make it obvious how to find what
the user is undoubtedly looking for in both the S3 and S4 cases.
That would address one of the criticisms here:

http://www.stat.columbia.edu/~cook/movabletype/archives/2007/08/they_started_me.html
(The other criticisms at this link are worth addressing too -- ggplot2
and several existing or upcoming books on grid, lattice and ggplot
R graphics are presumably addressing the criticism that creating 
[Rd] Error from wilcox.test

2008-01-01 Thread Wolfgang Huber
When one of the two groups has only one member and the other one more 
than 49, wilcox.test will exit with the below error message,

> n=51; wilcox.test(1:n ~ 1:n==1, conf.int=TRUE)

Error in uniroot(wdiff, c(mumin, mumax), tol = 1e-04, zq = qnorm(alpha/2,  :
  f() values at end points not of opposite sign


whereas with n=50 a result is returned:

> n=50; wilcox.test(1:n ~ 1:n==1, conf.int=TRUE)

        Wilcoxon rank sum test

data:  1:n by 1:n == 1
W = 49, p-value = 0.04
alternative hypothesis: true location shift is not equal to 0
95 percent confidence interval:
  1 49
sample estimates:
difference in location
                    25


I wonder whether it would be worthwhile to make the wilcox.test function 
handle such (admittedly pathologic) cases more gracefully.


Happy New Year to all -
   Wolfgang

--
Wolfgang Huber  EBI/EMBL  Cambridge UK  http://www.ebi.ac.uk/huber



Re: [Rd] large netCDF files under Windows.

2008-01-01 Thread Prof Brian Ripley
For the record: Thomas provided a test case and both ncdf and RNetCDF have 
been rebuilt with support for files > 2Gb.  Updated versions are now on 
the CRANextras repository.

On Sun, 30 Dec 2007, Prof Brian Ripley wrote:

 The netcdf code is rather silly: it uses 64-bit versions of seek etc when
 building a DLL, not on Windows.  I can rebuild it using 64-bit seek, but
 could you please send me (privately) a test case?

 On Sat, 29 Dec 2007, Thomas Lumley wrote:

 Has anyone successfully used R to access netCDF files larger than 2Gb
 under Windows?

 With the version of the ncdf package that Brian Ripley provides for CRAN
 extras I get an assertion failure with a 12Gb file, but not a 1Gb subset
 of it. The same 12Gb file is ok with ncdf on Mac OS X (32bit R) and on
 Linux(64bit R).

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] Error from wilcox.test

2008-01-01 Thread Prof Brian Ripley
Try exact = TRUE: the default switches to a normal approximation that will 
not be adequate in your extreme example.
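
A minimal sketch of that suggestion, using the default method with the two groups passed directly rather than through the formula interface of the original report. The data mirror the reported case: one group of 50 values and one group with a single member.

```r
n <- 51
x <- 2:n   # the large group (50 values)
y <- 1     # the single-member group

# exact = TRUE requests the exact distribution instead of the normal
# approximation that is the default once a group reaches 50 observations.
res <- wilcox.test(x, y, conf.int = TRUE, exact = TRUE)

res$statistic  # W
res$p.value    # exact two-sided p-value
res$conf.int
```

With no ties in the data the exact computation applies, and the confidence interval is taken from the ordered pairwise differences rather than from a root search.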



-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
