Re: [Rd] Compression of largish expression array files in the DAAGbio/inst/doc directory?

2011-04-09 Thread Prof Brian Ripley
As far as I can see read.maimages is built on top of R's own 
file-reading facilities, and they all read compressed (but not zipped) 
files as from R 2.10.0.


So simply use

gzip -9 coral55?.spot

and rename the files back to *.spot.

If you need more compression, use xz -9e.  (You can also do this in R: 
readLines() on the file, writeLines() using gzfile or xzfile.)
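
A minimal sketch of the in-R route, assuming a small stand-in file in a temporary directory (the file name and its contents are hypothetical, not the real coral55?.spot data). It rewrites the file in gzip-compressed form under its original name, then reads it back with an ordinary readLines() call:

```r
# Hypothetical stand-in for one of the .spot files.
spot <- file.path(tempdir(), "example.spot")
original <- c("indexs\tgrid_r\tgrid_c", "1\t1\t1", "2\t1\t2")
writeLines(original, spot)

# Read the plain text, then write it back through a gzfile() connection.
# The file keeps its name, so no renaming step is needed.
txt <- readLines(spot)
con <- gzfile(spot, open = "wt", compression = 9)
writeLines(txt, con)
close(con)

# The first two bytes are now the gzip magic number 1f 8b ...
magic <- readBin(spot, what = "raw", n = 2)

# ... and R's file-reading functions decompress transparently (R >= 2.10.0).
roundtrip <- readLines(spot)
identical(roundtrip, original)
```

Swapping gzfile() for xzfile() should give the xz variant in the same way.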


You will need to make the package 'Depends: R (>= 2.10)'.

On Sat, 9 Apr 2011, John Maindonald wrote:


The inst/doc directory of the DAAG package has 6 files coral551.spot, ... that
are around 0.85 MB each.  It would be useful to be able to zip them, but that
as matters stand interferes with the use of the Sweave file that uses them to
demonstrate input of expression array data that is in the spot format.  They
do not automatically get unzipped when required.  I have checked that
read.maimages (in limma) does not, unless I have missed something, have
an option for reading zipped files.  Is there any way to get around this without
substantially complicating the exposition in marray-notes.pdf (also in the
inst/doc subdirectory)?

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



Re: [Rd] Compression of largish expression array files in the DAAGbio/inst/doc directory?

2011-04-09 Thread John Maindonald
Thanks.  That seems to work.

John Maindonald email: john.maindon...@anu.edu.au
phone : +61 2 (6125)3473    fax  : +61 2(6125)5549
Centre for Mathematics & Its Applications, Room 1194,
John Dedman Mathematical Sciences Building (Building 27)
Australian National University, Canberra ACT 0200.
http://www.maths.anu.edu.au/~johnm



[Rd] stats/arima.c memory allocation

2011-04-09 Thread Matteo Bertini
Looking at the arima.c code related to arima fitting I noticed that the code
is mainly a merge of:

- Gardner, G, Harvey, A. C. and Phillips, G. D. A. (1980) Algorithm AS154.
An algorithm for exact maximum likelihood estimation of
autoregressive-moving average models by means of Kalman filtering. Applied
Statistics 29, 311–322.
- Jones, R. H. (1980) Maximum likelihood fitting of ARMA models to time
series with missing observations. Technometrics 22, 389–395.

The first is used to fit the initial P0 matrix, and the second to do the
forecasts.

The AS154 implementation of P0 computation is O(r^4/8) in memory
requirements, where r is roughly the period length.

This is the origin of the ugly:

  src/library/stats/src/arima.c:838:    if(r > 350) error(_("maximum supported lag is 350"));

I noted in the same AS154 paper that the initial P0 satisfies this equation:

  P0 = T P0 T' + R R'

So I modified the arima.c code to solve this equation iteratively
(starting from P0 = I).

The resulting code finds a solution very similar to that of the original
code, using a fraction of the memory, in a time that is similar for
small lags and shorter for bigger lags (and without the r > 350 limit).

Here is the modified code: https://gist.github.com/911292

The question is: are there theoretical guarantees that the iterative
solution is the right solution?

Some hints/directions/books?
Matteo Bertini
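
For orientation: P0 = T P0 T' + R R' has the form of a discrete Lyapunov equation. A standard result is that when every eigenvalue of T lies strictly inside the unit circle (the stationary case), the equation has a unique solution and the iteration P <- T P T' + R R' converges to it from any starting matrix. The sketch below is not the code from the gist. It is a small AR(2) illustration with assumed coefficients and an assumed companion-form layout for T and R, checked against the direct solution vec(P0) = (I - T (x) T)^{-1} vec(R R'):

```r
# Assumed stationary AR(2) coefficients; T and R in companion form, r = 2.
phi <- c(0.5, -0.3)
Tm  <- rbind(c(phi[1], 1),
             c(phi[2], 0))
Rv  <- c(1, 0)
Q   <- tcrossprod(Rv)                 # R R'

# The convergence argument needs spectral radius < 1.
rho <- max(Mod(eigen(Tm)$values))
stopifnot(rho < 1)

# Fixed-point iteration, started from the identity as in the message.
P <- diag(2)
for (iter in 1:10000) {
  Pnew  <- Tm %*% P %*% t(Tm) + Q
  delta <- max(abs(Pnew - P))
  P <- Pnew
  if (delta < 1e-12) break
}

# Direct solution via the Kronecker product; O(r^4) memory, fine for r = 2.
vecP    <- solve(diag(4) - kronecker(Tm, Tm), as.vector(Q))
Pdirect <- matrix(vecP, 2, 2)

all.equal(P, Pdirect)
```

The error shrinks roughly like rho^(2k) after k iterations, so the iteration can become slow when a root of the AR polynomial is close to the unit circle.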



Re: [Rd] duplicates() function

2011-04-09 Thread Petr Savicky
On Fri, Apr 08, 2011 at 10:59:10AM -0400, Duncan Murdoch wrote:
 I need a function which is similar to duplicated(), but instead of 
 returning TRUE/FALSE, returns indices of which element was duplicated.  
 That is,
 
  > x <- c(9,7,9,3,7)
  > duplicated(x)
 [1] FALSE FALSE  TRUE FALSE  TRUE
 
  > duplicates(x)
 [1] NA NA  1 NA  2
 
 (so that I know that element 3 is a duplicate of element 1, and element 
 5 is a duplicate of element 2, whereas the others were not duplicated 
 according to our definition.)
 
 Is there a simple way to write this function?

A possible strategy is to use sorting. In a sorted matrix
or data frame, rows that are duplicates of the same row
form consecutive blocks. The first element of each block
can be identified using !duplicated(). Since the sorting
is stable, when we map these blocks back to the original
order, the first element of each block is mapped to the
first occurrence of the given row in the original order.

An implementation may be done as follows.

  duplicates <- function(dat)
  {
      s <- do.call(order, as.data.frame(dat))
      non.dup <- !duplicated(dat[s, ])
      orig.ind <- s[non.dup]
      first.occ <- orig.ind[cumsum(non.dup)]
      first.occ[non.dup] <- NA
      first.occ[order(s)]
  }
 
  x <- cbind(1, c(9,7,9,3,7))
  duplicates(x)
  [1] NA NA  1 NA  2

The line

  orig.ind <- s[non.dup]

creates a vector, whose length is the number of non-duplicated
rows in the sorted dat. Its components are indices of the
corresponding first occurrences of these rows in the original
order. For this, the stability of the order is needed.

The lines

  first.occ <- orig.ind[cumsum(non.dup)]
  first.occ[non.dup] <- NA

expand orig.ind to a vector that satisfies the following: if the i-th
row of the sorted dat is duplicated, then first.occ[i] is the index of
the first row in the original dat that is equal to this row. So the
values in first.occ are those required for the output of duplicates(),
but they are in the order of the sorted dat. The
last line 

  first.occ[order(s)]

reorders the vector to the original order of the rows.
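
For plain vectors, the result can be cross-checked against a shorter formulation. The sketch below restates duplicates() from above and compares it with a match()-based helper; the name duplicates_vec is made up for this illustration and does not come from the thread, and the helper does not handle matrices or data frames.

```r
# Row-wise version, as described in the message above.
duplicates <- function(dat) {
  s <- do.call(order, as.data.frame(dat))   # stable sort of the rows
  non.dup <- !duplicated(dat[s, ])          # first row of each block
  orig.ind <- s[non.dup]                    # original index of each block start
  first.occ <- orig.ind[cumsum(non.dup)]    # spread it over the whole block
  first.occ[non.dup] <- NA                  # block starts are not duplicates
  first.occ[order(s)]                       # back to the original order
}

# Hypothetical vector-only shortcut: match() returns the first position
# at which each value occurs, so a value is a first occurrence exactly
# when it matches its own position.
duplicates_vec <- function(x) {
  m <- match(x, x)
  m[m == seq_along(x)] <- NA
  m
}

x <- c(9, 7, 9, 3, 7)
res_rows <- duplicates(cbind(1, c(9, 7, 9, 3, 7)))
res_vec  <- duplicates_vec(x)
res_rows   # NA NA  1 NA  2
res_vec    # NA NA  1 NA  2
```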

Petr Savicky.



[Rd] Wish there were a strict mode for R interpreter. What about You?

2011-04-09 Thread Paul Johnson
Years ago, I did lots of Perl programming. Perl will let you be lazy
and write functions that refer to undefined variables (like R does),
but there is also a strict mode so the interpreter will block anything
when a variable is mentioned that has not been defined. I wish there
were a strict mode for checking R functions.

Here's why. We have a lot of students writing R functions around here
and they run into trouble because they use the same name for things
inside and outside of functions. When they call functions that have
mistaken or undefined references to names that they use elsewhere,
then variables that are in the environment are accidentally used. Know
what I mean?

dat <- whatever

someNewFunction <- function(z, w){
   #do something with z and w and create a new dat
   # but forget to name it dat
    lm (y, x, data=dat)
   # lm just used wrong data
}

I wish R had a strict mode to return an error in that case. Users
don't realize they are getting nonsense because R finds things to fill
in for their mistakes.

Is this possible?  Does anybody agree it would be good?

-- 
Paul E. Johnson
Professor, Political Science
1541 Lilac Lane, Room 504
University of Kansas



Re: [Rd] Wish there were a strict mode for R interpreter. What about You?

2011-04-09 Thread Duncan Murdoch


It would be really bad, unless done carefully.

In your function the free (undefined) variables are dat and lm.  You 
want to be warned about dat, but you don't want to be warned about lm. 
What rule should R use to determine that?


(One possible rule would work in a package with a namespace.  In that 
case, all variables must be found in declared dependencies, so the search 
could stop before it got to globalenv().  But it seems unlikely that 
your students are writing packages with namespaces.)


Duncan Murdoch



Re: [Rd] Rtools questions

2011-04-09 Thread Duncan Murdoch

On 11-04-06 2:45 PM, Henrik Bengtsson wrote:
> On Wed, Apr 6, 2011 at 4:54 AM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>> On 11-04-05 7:51 PM, Henrik Bengtsson wrote:
>>> On Tue, Apr 5, 2011 at 3:44 PM, Duncan Murdoch <murdoch.dun...@gmail.com> wrote:
>>>> On 11-04-05 6:22 PM, Spencer Graves wrote:
>>>>> Hello:
>>>>>
>>>>> 1.  How can I tell when the development version of Rtools has
>>>>> changed?
>>>>
>>>> I don't make announcements of the changes, you just need to check the web
>>>> site.  There are online tools that can do this for you automatically, but
>>>> I don't know which one to recommend.  Google suggests lots of them.
>>>
>>> I also asked myself this before and I must admit it took me a while to
>>> interpret the contents of the webpage.  There are multiple sections,
>>> e.g. 'Changes since R 2.12.2', 'Changes since R 2.11.1', 'Changes
>>> since R 2.11.0', and so on.  Then within each section there are some
>>> dates mentioned.  Given my current R version (say R 2.13.0 beta) and
>>> Rtools (Rtools213.exe), it is not fully clear to me which section to look
>>> at, e.g. 'Changes since R 2.12.2'?
>>
>> Well, that depends on when you downloaded it.  I use the R version releases
>> as bookmarks.  If you last downloaded Rtools after the release of R 2.12.2,
>> then you only need to look at the last section.
>>
>> The problem with collecting changes into those that apply to each Rtools
>> version is just that the change lists would be longer:  Rtools212 will get
>> changes through several R releases.  When there are compiler changes,
>> RtoolsXYZ generally comes out during the previous R version, because the
>> compiler may only work with the R-devel version.  For instance, Rtools212
>> was introduced between R 2.11.0 and 2.11.1 and was updated a number of times
>> up to quite recently.  (It is now frozen, so if you download it now and are
>> working with the R versions it supports you never need to worry about
>> updates to it.)
>
> I understand, and I suspected this was the reason too.
>
>> However, if you want to reformat the page, go ahead, and send me the new
>> version.  It's a hand edited HTML page so I'd be happy to incorporate
>> changes that make it more readable, as long as it's still easy to edit by
>> hand.
>>
>> Gabor asked how to know which version was downloaded.  If you have the
>> installer file you can tell:  right click on it, choose Properties, look at
>> the Version tab.  If you didn't keep the installer, I don't know a way to
>> find out, but it might be recorded in the unins000.dat file that the
>> uninstaller uses.  Of course, without downloading the new one you can't find
>> out its version:  so back to my original suggestion to monitor changes to
>> the web page.  I'll see if there's a way to automatically include the
>> revision number in the filename.
>
> This is useful - I didn't know about this version number of InnoSetup.
> I've browsed the online InnoSetup help, but I couldn't locate what
> the version parameter is called.  With it, would it be possible to use
> a [Code] block having InnoSetup write the version number to a VERSION
> file in the Rtools installation directory?  That would make it
> possible to compare what's online and what's installed.
>
> Another alternative for figuring out if Rtools have changed would be
> to compare the timestamp of the installed Rtools directory (because
> you typically install immediately after download) and the
> Rtools213.exe timestamp on the web server.  This could be achieved by
> moving the files to, say,
> http://www.murdoch-sutherland.com/Rtools/download/ and enable indexing
> of files in that directory.
>
> Either way, knowing about the version number is certainly good enough for
> me.  After installing Rtools, I can simply put the installer file in
> the Rtools directory to allow me to compare to it later. (I kind of
> did this before by comparing file sizes.)


I've just uploaded a small change:  now Rtools.txt records the version 
number (and if I remember to update it, you can download only that file 
to see if you are up to date).  There's also a VERSION.txt file that 
contains the version number, which is likely to maintain its format more 
consistently, so if you want an automatic check, you should look at that 
file.  It's also on the web site.


Duncan Murdoch



Re: [Rd] Wish there were a strict mode for R interpreter. What

2011-04-09 Thread Ted Harding
I'm with Duncan on this one! On the other hand, I can understand the
issues that Paul's students might encounter.

I think the right thing to do is to introduce the students to the
basics of scoping, early in the process of learning R.

Thus, when there is a variable (such as 'lm' in the example) which
you *expect* to already be out there (since 'lm' is in 'stats'
which is pre-loaded by default), then you can go ahead and use it.

But when your function uses a variable (e.g. 'dat') which just
*happened* to be out there when you first wrote the function,
then when you re-use the same function definition in a different
context things are likely to go wrong. So teach them that variables
which occur in functions, which might have any meaning in whatever
the context of use may be, should either be named arguments in
the argument list, or should be specifically defined within the
function, and not assumed to already exist unless that is already
guaranteed in every context in which the function would be used.

This is basic good practice which, once routinely adopted, should
ensure that the right thing is done every time!
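
A small sketch of that practice, using made-up data and a hypothetical function name fit_line (neither is from the thread). The data frame is an explicit argument, so a stray 'dat' sitting in the workspace cannot be picked up by accident:

```r
# A stray global object with the "wrong" data (slope -2).
dat <- data.frame(x = 1:5, y = c(10, 8, 6, 4, 2))

# Everything the function needs arrives through its argument list.
fit_line <- function(d) {
  lm(y ~ x, data = d)
}

# The data we actually mean to analyse: y = 2 * x - 1.
mine <- data.frame(x = 1:5, y = c(1, 3, 5, 7, 9))
cf <- coef(fit_line(mine))
cf   # intercept -1, slope 2: taken from 'mine', not from the global 'dat'
```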

Ted.


E-Mail: (Ted Harding) ted.hard...@wlandres.net
Fax-to-email: +44 (0)870 094 0861
Date: 09-Apr-11   Time: 22:08:10
-- XFMail --



Re: [Rd] Wish there were a strict mode for R interpreter. What about You?

2011-04-09 Thread Hadley Wickham
> library(codetools)
> checkUsage(someNewFunction)
<anonymous>: no visible binding for global variable ‘y’
<anonymous>: no visible binding for global variable ‘x’
<anonymous>: no visible binding for global variable ‘dat’

Which also picks up another bug in your function ;)
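
A self-contained sketch of the same check, collecting the messages in a character vector instead of printing them. It assumes that no objects named y, x or dat exist in the workspace when it runs. As far as I can tell, checkUsage() only reports names it cannot find from the function's environment, so a 'dat' that is already defined globally would not be flagged.

```r
library(codetools)

# The function from the original message, with the body reduced to the
# problematic call.
someNewFunction <- function(z, w) {
  lm(y, x, data = dat)
}

# Collect each report in 'msgs' rather than writing it to the console.
msgs <- character()
checkUsage(someNewFunction, name = "someNewFunction",
           report = function(msg) msgs <<- c(msgs, msg))

cat(msgs, sep = "")
```

The 'name' argument only sets the label at the start of each message; without it the label is <anonymous>, as in the output above.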

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [Rd] Wish there were a strict mode for R interpreter. What about You?

2011-04-09 Thread Spencer Graves

Is this run by R CMD check?  I've seen this message.

R CMD check will give this message sometimes when I don't feel
it's appropriate.  For example, I define a data object ETB in a package,
then give that as the default in a function call like
f(data.=ETB){if(missing(data.))data(ETB);  data.}.  When I run R CMD
check, I get "no visible binding for global variable 'ETB'", even
though the function is tested and works during R CMD check.



  Spencer



--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567



Re: [Rd] Wish there were a strict mode for R interpreter. What about You?

2011-04-09 Thread Duncan Murdoch


What is ETB?  Your code is looking for a global variable by that name, 
and that's what codetools is telling you.


Duncan Murdoch



Re: [Rd] Wish there were a strict mode for R interpreter. What about You?

2011-04-09 Thread Spencer Graves


Duncan:  Thanks for the question.


ETB is a data object in my package.  codetools can't find it because 
data(ETB) is needed before ETB becomes available.  codetools is not 
smart enough to check to see if ETB is a data object in the package.



Spencer




--
Spencer Graves, PE, PhD
President and Chief Operating Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567



[Rd] deparse operators in expressions

2011-04-09 Thread Yihui Xie
Hi,

I observed a slight problem in deparse(): it adds spaces around
most operators, but not around /. I wonder if this is easy to fix. I
know this is quite trivial, but I would appreciate it if / were not
treated as an exception. Examples:

> deparse(expression(1/1))
[1] "expression(1/1)"
> deparse(expression(1+1))
[1] "expression(1 + 1)"
> deparse(expression(1%in%1))
[1] "expression(1 %in% 1)"

> sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.utf8       LC_NUMERIC=C
 [3] LC_TIME=en_US.utf8        LC_COLLATE=en_US.utf8
 [5] LC_MONETARY=C             LC_MESSAGES=en_US.utf8
 [7] LC_PAPER=en_US.utf8       LC_NAME=C
 [9] LC_ADDRESS=C              LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base



Regards,
Yihui
--
Yihui Xie xieyi...@gmail.com
Phone: 515-294-2465 Web: http://yihui.name
Department of Statistics, Iowa State University
2215 Snedecor Hall, Ames, IA
