[Rd] faster base::sequence

2010-11-28 Thread Romain Francois

Hello,

Based on yesterday's R-help thread (help: program efficiency), and 
following Bill's suggestions, it appeared that sequence:


> sequence
function (nvec)
unlist(lapply(nvec, seq_len))
<environment: namespace:base>

could benefit from being written in C to avoid unnecessary memory 
allocations.


I made this version using inline:

require( inline )
sequence_c <- local( {
    fx <- cfunction( signature( x = "integer"), '
    int n = length(x) ;
    int* px = INTEGER(x) ;
    int x_i, s = 0 ;
    /* error checking */
    for( int i=0; i<n; i++){
        x_i = px[i] ;
        /* this includes the check for NA */
        if( x_i <= 0 ) error( "needs non negative integer" ) ;
        s += x_i ;
    }

    SEXP res = PROTECT( allocVector( INTSXP, s ) ) ;
    int * p_res = INTEGER(res) ;
    for( int i=0; i<n; i++){
        x_i = px[i] ;
        for( int j=0; j<x_i; j++, p_res++)
            *p_res = j+1 ;
    }
    UNPROTECT(1) ;
    return res ;
    ' )
    function( nvec ){
        fx( as.integer(nvec) )
    }
})


And here are some timings:

> x <- 1:1
> system.time( a <- sequence(x ) )
   user  system elapsed
  0.191   0.108   0.298
> system.time( b <- sequence_c(x ) )
   user  system elapsed
  0.060   0.063   0.122
> identical( a, b )
[1] TRUE



> system.time( for( i in 1:1) sequence(1:10) )
   user  system elapsed
  0.119   0.000   0.119

> system.time( for( i in 1:1) sequence_c(1:10) )
   user  system elapsed
  0.019   0.000   0.019
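
For comparison, the per-element allocations can also be avoided without leaving R. The following is only a sketch of a vectorised alternative, not the C code above: the name sequence_vec is hypothetical, and it assumes non-negative, non-NA counts, with zero counts contributing nothing (as seq_len(0) does).

```r
# Hypothetical pure-R alternative: one seq_len() call and one rep.int() call
# instead of one small vector per element of nvec.
sequence_vec <- function(nvec) {
  nvec <- as.integer(nvec)
  if (anyNA(nvec) || any(nvec < 0L))
    stop("'nvec' must contain non-negative integers")
  if (length(nvec) == 0L) return(integer(0))
  # Number of result elements that precede each run.
  starts <- cumsum(c(0L, nvec[-length(nvec)]))
  # Global positions 1..sum(nvec), shifted back so every run restarts at 1.
  seq_len(sum(nvec)) - rep.int(starts, nvec)
}

sequence_vec(c(3, 2))  # 1 2 3 1 2
```

Whether this is faster than the lapply() version would have to be measured; it is shown only to illustrate the idea of computing the result in one pass.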


I would write a proper patch if someone from R-core is willing to push it.

Romain

--
Romain Francois
Professional R Enthusiast
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://bit.ly/9VOd3l : ZAT! 2010
|- http://bit.ly/c6DzuX : Impressionnism with R
`- http://bit.ly/czHPM7 : Rcpp Google tech talk on youtube

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] faster base::sequence

2010-11-28 Thread Prof Brian Ripley

Is sequence used enough to warrant this?  As the help page says

 Note that ‘sequence <- function(nvec) unlist(lapply(nvec,
 seq_len))’ and it mainly exists in reverence to the very early
 history of R.

I regard it as unsafe to assume that NA_INTEGER will always be 
negative, and bear in mind that at some point not so far off R 
integers (or at least lengths) will need to be more than 32-bit.
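
A minimal R sketch of the two concerns raised here: test for NA explicitly rather than through its sign, and guard the total length against 32-bit overflow. The helper name total_length and the 32-bit limit are assumptions made for illustration, not part of any proposed patch.

```r
# Hypothetical helper: validate run lengths before allocating the result.
total_length <- function(nvec) {
  nvec <- as.integer(nvec)
  # Test for NA explicitly instead of relying on how NA happens to be encoded.
  if (anyNA(nvec)) stop("'nvec' must not contain NA")
  if (any(nvec < 0L)) stop("'nvec' must be non-negative")
  # Accumulate in double precision so the sum cannot overflow a 32-bit integer.
  total <- sum(as.numeric(nvec))
  # Assumption for this sketch: the result must fit in a 32-bit length.
  if (total > .Machine$integer.max) stop("result would be too long")
  total
}

total_length(c(3, 2))  # 5
```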


--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [Rd] faster base::sequence

2010-11-28 Thread Romain Francois

On 28/11/10 10:30, Prof Brian Ripley wrote:

> Is sequence used enough to warrant this? As the help page says
>
> Note that ‘sequence <- function(nvec) unlist(lapply(nvec,
> seq_len))’ and it mainly exists in reverence to the very early
> history of R.

I don't know. Would it be used more if it were more efficient?

> I regard it as unsafe to assume that NA_INTEGER will always be negative,
> and bear in mind that at some point not so far off R integers (or at
> least lengths) will need to be more than 32-bit.

Sure. Updated and dressed up as a patch.

I've made it a .Call because I'm not really comfortable with .Internal,
etc.

Do you mean that I should also use something else instead of int and
int*? Is there some future-proof typedef or macro for the type
associated with INTSXP?




Index: src/library/base/R/seq.R
===================================================================
--- src/library/base/R/seq.R    (revision 53680)
+++ src/library/base/R/seq.R    (working copy)
@@ -85,4 +85,6 @@
 }
 
 ## In reverence to the very first versions of R which already had sequence():
-sequence <- function(nvec) unlist(lapply(nvec, seq_len))
+# sequence <- function(nvec) unlist(lapply(nvec, seq_len))
+sequence <- function(nvec) .Call( "sequence", as.integer(nvec), PACKAGE = "base" )
+
Index: src/main/registration.c
===================================================================
--- src/main/registration.c (revision 53680)
+++ src/main/registration.c (working copy)
@@ -245,6 +245,8 @@
     CALLDEF(bitwiseOr, 2),
     CALLDEF(bitwiseXor, 2),
 
+    /* sequence */
+    CALLDEF(sequence,1),
     {NULL, NULL, 0}
 };
 
Index: src/main/seq.c
===================================================================
--- src/main/seq.c  (revision 53680)
+++ src/main/seq.c  (working copy)
@@ -679,3 +679,28 @@
 
     return ans;
 }
+
+SEXP attribute_hidden sequence(SEXP x)
+{
+   R_len_t n = length(x), s = 0 ;
+   int *px = INTEGER(x) ;
+   int x_i ;
+   /* error checking */
+   for( int i=0; i<n; i++){
+       x_i = px[i] ;
+       if( x_i == NA_INTEGER || x_i <= 0 )
+           error( _("argument must be coercible to non-negative integer") ) ;
+       s += x_i ;
+   }
+
+   SEXP res = PROTECT( allocVector( INTSXP, s ) ) ;
+   int *p_res = INTEGER(res) ;
+   for( int i=0; i<n; i++){
+       x_i = px[i] ;
+       for( int j=0; j<x_i; j++, p_res++)
+           *p_res = j+1 ;
+   }
+   UNPROTECT(1) ;
+   return res ;
+}
+
Index: src/main/basedecl.h
===================================================================
--- src/main/basedecl.h (revision 53680)
+++ src/main/basedecl.h (working copy)
@@ -114,3 +114,6 @@
 SEXP bitwiseAnd(SEXP, SEXP);
 SEXP bitwiseOr(SEXP, SEXP);
 SEXP bitwiseXor(SEXP, SEXP);
+
+SEXP sequence(SEXP);
+


[Rd] .Rdata file in data subdirectory won't load

2010-11-28 Thread Ronald Barry
Greetings,
  I wanted to add a dataset to a complete R package I am working on
(the package installs cleanly and passes R CMD check).
The data (a matrix) was saved, and the save() image was dragged to the
/data folder as a .Rdata file.  It can be read directly using
load() (see below), but now R CMD check reports that subdirectory data
contains no datasets, and it won't load using data().  I have read
through the R extensions manual and Leisch's tutorial, but can't find
a good hint as to what is going wrong.  I also tried this with
'polygon1' added to the 'export' in the NAMESPACE, with no effect.

> library(latticeDensity)
> data(polygon1)
Warning message:
In data(polygon1) : data set 'polygon1' not found
> file.choose()
[1] "C:\\Documents and Settings\\Ronald Barry\\My Documents\\latticeDensity\\data\\polygon1.Rdata"
> load("C:\\Documents and Settings\\Ronald Barry\\My Documents\\latticeDensity\\data\\polygon1.Rdata")
> polygon1
           [,1]      [,2]
 [1,] 0.6421053 0.8132050
 [2,] 0.6845247 0.4814305
 [3,] 0.7057345 0.2858322
 [4,] 0.6696779 0.2066025
 [5,] 0.5190888 0.1892710
 [6,] 0.5445405 0.4145805
 [7,] 0.5424195 0.6275103
 [8,] 0.5233307 0.7983494
 [9,] 0.5000000 0.7983494
[10,] 0.5127258 0.5458047
[11,] 0.5042419 0.3303989
[12,] 0.4851532 0.1348006
[13,] 0.2836606 0.1843191
[14,] 0.2582090 0.3675378
[15,] 0.1733700 0.6795048
[16,] 0.3154753 0.8057772
[17,] 0.2391202 1.0162311
[18,] 0.5381775 0.9592847
[19,] 0.7333071 0.9320495



Thank you for any pointers.



Re: [Rd] faster base::sequence

2010-11-28 Thread Prof Brian Ripley

On Sun, 28 Nov 2010, Romain Francois wrote:

> On 28/11/10 10:30, Prof Brian Ripley wrote:
>
>> Is sequence used enough to warrant this? As the help page says
>>
>> Note that ‘sequence <- function(nvec) unlist(lapply(nvec,
>> seq_len))’ and it mainly exists in reverence to the very early
>> history of R.
>
> I don't know. Would it be used more if it were more efficient?

It is for you to make a compelling case for others to do work
(maintain changed code) for your wish.

>> I regard it as unsafe to assume that NA_INTEGER will always be negative,
>> and bear in mind that at some point not so far off R integers (or at
>> least lengths) will need to be more than 32-bit.
>
> Sure. Updated and dressed up as a patch.
>
> I've made it a .Call because I'm not really comfortable with .Internal,
> etc.
>
> Do you mean that I should also use something else instead of int and
> int*? Is there some future-proof typedef or macro for the type
> associated with INTSXP?

Not yet.  I was explaining why NA_INTEGER might change.






Re: [Rd] faster base::sequence

2010-11-28 Thread Romain Francois

On 28/11/10 11:30, Prof Brian Ripley wrote:

> On Sun, 28 Nov 2010, Romain Francois wrote:
>
>> On 28/11/10 10:30, Prof Brian Ripley wrote:
>>
>>> Is sequence used enough to warrant this? As the help page says
>>>
>>> Note that ‘sequence <- function(nvec) unlist(lapply(nvec,
>>> seq_len))’ and it mainly exists in reverence to the very early
>>> history of R.
>>
>> I don't know. Would it be used more if it were more efficient?
>
> It is for you to make a compelling case for others to do work (maintain
> changed code) for your wish.

No trouble. The patch is there; if anyone finds it interesting or
compelling, they will speak up, I suppose.

Otherwise it is fine with me if it ends up in no man's land. I have the
code; if I want to use it, I can squeeze it into a package.

>>> I regard it as unsafe to assume that NA_INTEGER will always be negative,
>>> and bear in mind that at some point not so far off R integers (or at
>>> least lengths) will need to be more than 32-bit.
>>
>> Sure. Updated and dressed up as a patch.
>>
>> I've made it a .Call because I'm not really comfortable with
>> .Internal, etc.
>>
>> Do you mean that I should also use something else instead of int and
>> int*? Is there some future-proof typedef or macro for the type
>> associated with INTSXP?
>
> Not yet. I was explaining why NA_INTEGER might change.

Sure. Thanks for the reminder.




Re: [Rd] .Rdata file in data subdirectory won't load

2010-11-28 Thread Prof Brian Ripley
It needs to be polygon1.RData, not .Rdata.  You've not actually told 
us your OS, but it looks like you imagine that R is case-insensitive 
just because Windows is.
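
The case-sensitive match can be checked directly. This is a sketch under the assumption that "RData" and "rda" are the save()-image extensions listed in ?data; the helper name is_data_image is hypothetical.

```r
# Extensions that ?data lists for save() images; matching is case-sensitive.
image_exts <- c("RData", "rda")

# Hypothetical helper: TRUE if the file name carries one of those extensions.
is_data_image <- function(path) tools::file_ext(path) %in% image_exts

is_data_image("data/polygon1.Rdata")  # FALSE
is_data_image("data/polygon1.RData")  # TRUE
```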


On Sun, 28 Nov 2010, Ronald Barry wrote:

> Greetings,
>  I wanted to add a dataset to a complete R package I am working on
> (the package cleanly installs and passes the R CMD check).
> The data (a matrix) was saved, and the save() image dragged to the
> /data folder, and is a .Rdata file.  It can be read directly using
> load (see below), but now the R CMD check indicates subdirectory data
> contains no datasets and it won't load using 'data()'.  I have read
> through the R extensions manual and Leisch's tutorial, but can't find

Try ?data !

> a good hint as to what is going wrong.  I also tried this with
> 'polygon1' added to the 'export' in the NAMESPACE, with no effect.




[Rd] package matrix dummy.cpp

2010-11-28 Thread Ambrus Kaposi
Hi,

The recommended package matrix contains an empty file src/dummy.cpp
which results in using g++ instead of gcc to link Matrix.so.
What is the reason for that? Is there any difference between using g++
or gcc? (There are no other cpp files in the source)
I asked the maintainers of the package (matrix-auth...@r-project.org)
3 weeks ago but haven't received any answer.
On my system (NixOS Linux distribution, http://nixos.org) I can't
compile package Matrix unless this file is deleted.

Thank you very much,
Ambrus Kaposi



Re: [Rd] package matrix dummy.cpp

2010-11-28 Thread Prof Brian Ripley

It is Matrix, not matrix 

I too have corresponded with them about this.  It seems to be a legacy 
from when the package contained C++ code, and can now be deleted.


On Sun, 28 Nov 2010, Ambrus Kaposi wrote:

> Hi,
>
> The recommended package matrix contains an empty file src/dummy.cpp
> which results in using g++ instead of gcc to link Matrix.so.
> What is the reason for that? Is there any difference between using g++
> or gcc? (There are no other cpp files in the source)
> I asked the maintainers of the package (matrix-auth...@r-project.org)
> 3 weeks ago but haven't received any answer.
> On my system (NixOS Linux distribution, http://nixos.org) I can't
> compile package Matrix unless this file is deleted.

Most likely you have not installed the C++ compiler (which is usually
g++ on Linux) -- but you shouldn't need to in order to install R.

> Thank you very much,
> Ambrus Kaposi





[Rd] switch() disallowing multiple default values Re: Bug in parseNamespaceFile or switch( , ... ) ?

2010-11-28 Thread Duncan Murdoch
I've now committed changes in R-devel and R-patched to detect cases 
where a call to switch() contains multiple unnamed alternatives.  The 
code only complains if the EXPR argument is a character string; unnamed 
alternatives are fine with numeric switching.


Adding this check turned up 3 more typos like this in the base code 
besides the one in parseNamespaceFile.  I expect it will turn up quite a 
few more in CRAN and Bioconductor packages.


Please let me know right away if you've got correct code that generates 
the warnings or errors.


Duncan Murdoch


  In R-devel they're an error, in R-patched they'll just give a warning.
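
For readers unfamiliar with the idiom involved, here is a small sketch of how fall-through and the default work in switch() with a character EXPR. The function name directive_kind and its return values are made up for illustration.

```r
# Hypothetical example of the fall-through idiom with a single default.
directive_kind <- function(op) {
  switch(op,
         "=" = ,                 # empty alternative: falls through to the next value
         "<-" = "assignment",
         "unknown directive")    # the single unnamed alternative is the default
}

directive_kind("=")     # "assignment"
directive_kind("<-")    # "assignment"
directive_kind("blah")  # "unknown directive"

# With a numeric EXPR, unnamed alternatives are simply selected by position.
switch(2, "first", "second")  # "second"
```

Dropping the `=` after `"="` turns that string into a second unnamed alternative, which is the kind of call the new check is meant to catch.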

On 27/11/2010 7:09 PM, Duncan Murdoch wrote:
> On 27/11/2010 6:50 PM, Duncan Murdoch wrote:
>> On 27/11/2010 5:58 PM, Charles C. Berry wrote:
>>>
>>> parseNamespaceFile() doesn't seem to detect misspelled directives. Looking
>>> at its code I see
>>>
>>>     switch(as.character(e[[1L]]),
>>>
>>>            <lots of args omitted here>,
>>>
>>>            stop(gettextf("unknown namespace directive: %s",
>>>                          deparse(e)), call. = FALSE, domain = NA))
>>>
>>> but this doesn't seem to function as I expect, viz. to stop with an error
>>> if I type a wrong directive.
>>
>> You're right, there was a typo in parseNamespaceFile.  The typo was in
>> this line:
>>
>>     "=", "<-" = {
>>
>> This should have been
>>
>>     "=" =, "<-" = {
>>
>> Without the extra = sign, the "=" was taken as the default value of the
>> switch, and the stop() was never reached.
>>
>> Conceivably switch() should complain if it is called with more than one
>> default.
>
> I suspect when I fix this it's going to flush out some typos in packages
> on CRAN...
>
> Duncan Murdoch
>
>> Duncan Murdoch
>>
>>> Details:
>>>
>>> # create dummy NAMESPACE file with two bad / one good directives
>>> > cat("blah( nada )\nblee( nil )\nexport( outDS )\n",file="NAMESPACE")
>>> > readLines("NAMESPACE")
>>> [1] "blah( nada )"    "blee( nil )"     "export( outDS )"
>>> > parseNamespaceFile("",".") # now parse it
>>> $imports
>>> list()
>>>
>>> $exports
>>> [1] "outDS"
>>>
>>> $exportPatterns
>>> character(0)
>>>
>>> $importClasses
>>> list()
>>>
>>> $importMethods
>>> list()
>>>
>>> $exportClasses
>>> character(0)
>>>
>>> $exportMethods
>>> character(0)
>>>
>>> $exportClassPatterns
>>> character(0)
>>>
>>> $dynlibs
>>> character(0)
>>>
>>> $nativeRoutines
>>> list()
>>>
>>> $S3methods
>>>      [,1] [,2] [,3]
>>>
>>> So, it picked up 'export' and ignored the other two lines.
>>>
>>> Chuck
>>>
>>> p.s.
>>>
>>> > sessionInfo()
>>> R version 2.12.0 (2010-10-15)
>>> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>>>
>>> locale:
>>> [1] C
>>>
>>> attached base packages:
>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>
>>> Charles C. Berry                            Dept of Family/Preventive Medicine
>>> cbe...@tajo.ucsd.edu                        UC San Diego
>>> http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [Rd] tar R command

2010-11-28 Thread Henrik Bengtsson
First, if you look carefully, then you see that argument 'files'
should specify *filepaths*, i.e. directories and not specific files.
Thus, if you for instance place your files in directory foo/ and
then call

tar("foo.tar", files="foo/");

you would do the right thing.

HOWEVER, looking at the internals of utils::tar(), it seems to be
designed for a non-Windows platform, i.e. it will not work on Windows
as it stands (more below).  A workaround that also illustrates the
problems is the following patch:

# PATCH for file.info() such that tar() works on Windows
tar <- utils::tar; environment(tar) <- globalenv();
file.info <- function(...) {
  fi <- base::file.info(...);
  fi[setdiff(c("uid", "gid", "uname", "grname"), names(fi))] <- NA;
  fi;
} # file.info()

Example:

dir.create("foo/");
cat(file="foo/foo.txt", rep(letters, times=100));
tar("foo.tar", files="foo/");
str(file.info("foo.tar"));

'data.frame':   1 obs. of  11 variables:
 $ size  : num 7680
 $ isdir : logi FALSE
 $ mode  :Class 'octmode'  int 438
 $ mtime : POSIXct, format: "2010-11-28 20:24:05"
 $ ctime : POSIXct, format: "2010-11-28 20:03:56"
 $ atime : POSIXct, format: "2010-11-28 20:07:40"
 $ exe   : chr "no"
 $ uid   : logi NA
 $ gid   : logi NA
 $ uname : logi NA
 $ grname: logi NA

This seems to generate a valid foo.tar file.


PROBLEMS:
Here are a few problems I have identified with tar().

PROBLEM #1:
The default for argument files=NULL is documented to archive all
files under the current directory.  In reality it gives:

  Error in list.files(files, recursive = TRUE, all.files = TRUE,
full.names = TRUE) : invalid 'directory' argument

because list.files(NULL) is invalid.  The default should instead be
files=".".

PROBLEM #2:
If passing a non-existing path (argument 'files'), then tar()
generates an invalid *.tar file of size 1024 bytes (not empty as the OP
says).  Better would be to assert that each of the requested paths
really exists and is a directory, e.g. using file.info()$isdir.

PROBLEM #3:
tar() assumes that file.info() returns a data.frame with fields 'uid',
'gid' and 'uname'.  That is not the case for file.info() on Windows.
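
The check suggested under PROBLEM #2 could look roughly like the sketch below. The wrapper name safe_tar is hypothetical, it only tests that the paths exist, and it assumes a version of utils::tar() whose internal implementation works on the platform in use.

```r
# Hypothetical wrapper: refuse to build an archive from paths that do not exist.
safe_tar <- function(tarfile, files = ".") {
  info <- file.info(files)
  absent <- is.na(info$isdir)   # file.info() reports NA for non-existent paths
  if (any(absent))
    stop("path(s) not found: ", paste(files[absent], collapse = ", "))
  utils::tar(tarfile, files = files)
  invisible(tarfile)
}
```

With this wrapper, a call such as safe_tar("foo.tar", files = "no-such-dir") stops with an error instead of writing a 1024-byte archive.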


> sessionInfo()
R version 2.12.0 Patched (2010-11-24 r53656)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

My $0.20

/Henrik


On Sun, Nov 28, 2010 at 7:00 PM, Dario Strbenac
d.strbe...@garvan.org.au wrote:
> Hello,
>
> The documentation for the tar command leads me to think there is an internal
> implementation when the command can't be found in the OS.
>
> However, it doesn't seem to be the case, as I get an empty .tar file
> generated on a small example I made :
>
> > dir(pattern = "jpg")
> [1] "MA56237502_635.jpg"
> > file.info("MA56237502_635.jpg")
>                      size isdir mode               mtime               ctime               atime exe
> MA56237502_635.jpg 229831 FALSE  666 2010-11-29 13:05:49 2010-11-29 13:00:36 2010-11-29 13:00:36  no
> > tar("example.tar", files = dir(pattern = "jpg"))
> > file.info("example.tar")
>             size isdir mode               mtime               ctime               atime exe
> example.tar 1024 FALSE  666 2010-11-29 13:43:29 2010-11-29 13:42:30 2010-11-29 13:42:30  no
>
> Is this an unimplemented feature ?
>
> > sessionInfo()
> R version 2.12.0 (2010-10-15)
> Platform: x86_64-pc-mingw32/x64 (64-bit)
> ...                ...               ...
>
> Thanks,
>        Dario.