Re: [Rd] Best practices for writing R functions

2011-07-26 Thread Brian G. Peterson
On Tue, 2011-07-26 at 15:19 -0700, Davor Cubranic wrote:
> On 2011-07-23, at 5:57 AM, Alireza Mahani wrote:
> 
> > Another trick to reduce verbosity of code (and focus on algorithm logic
> > rather than boilerplate code) is to maintain a global copy of variables (in
> > the global environment) which makes them visible to all functions (where
> > appropriate, of course). Once the development and testing is finished, one
> > can tidy things up and modify the function prototypes, add lines for
> > unpacking lists inside functions, etc.
> 
> I think you'd be better off to stay away from such tricks. It's asking for 
> trouble later on, because unless you have really good unit tests it is very 
> easy to miss a variable during "tidying up" and end up with code that works 
> fine in your development environment but is full of bugs once you distribute 
> it to others.

Isn't this specifically one of the things that environments are *for*?

Have your package/script/functions create an environment, and store
'loose variables' there.  Use get/assign to manage.  Don't
clutter .GlobalEnv.
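
A minimal sketch of that suggestion, assuming a private environment named
.state and hypothetical helpers set_state()/get_state() (none of these names
come from the thread):

```r
# Private environment for "loose variables"; nothing is written to .GlobalEnv.
# The name `.state` and the helper names are illustrative assumptions.
.state <- new.env(parent = emptyenv())

set_state <- function(name, value) assign(name, value, envir = .state)
get_state <- function(name) get(name, envir = .state, inherits = FALSE)

# Store a shared setting once ...
set_state("tolerance", 1e-8)

# ... and read it from any function without passing it as an argument.
converged <- function(delta) abs(delta) < get_state("tolerance")
```

Because get() is called with inherits = FALSE, a missing variable fails
loudly instead of silently picking up a same-named object from elsewhere.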

-- 
Brian

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Best practices for writing R functions

2011-07-26 Thread Davor Cubranic
On 2011-07-23, at 5:57 AM, Alireza Mahani wrote:

> Another trick to reduce verbosity of code (and focus on algorithm logic
> rather than boilerplate code) is to maintain a global copy of variables (in
> the global environment) which makes them visible to all functions (where
> appropriate, of course). Once the development and testing is finished, one
> can tidy things up and modify the function prototypes, add lines for
> unpacking lists inside functions, etc.

I think you'd be better off to stay away from such tricks. It's asking for 
trouble later on, because unless you have really good unit tests it is very 
easy to miss a variable during "tidying up" and end up with code that works 
fine in your development environment but is full of bugs once you distribute it 
to others.
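
One way to catch a variable missed during "tidying up" is to list the free
variables a function relies on. This is only a sketch: scale_by() and k are
made-up names, and it assumes the recommended codetools package is installed.

```r
# A function that silently depends on a global variable `k` (hypothetical).
scale_by <- function(x) x * k

# findGlobals() reports the names the function looks up outside itself.
# With merge = FALSE, variables and functions are returned separately.
free_vars <- codetools::findGlobals(scale_by, merge = FALSE)$variables
print(free_vars)
```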

Davor


Re: [Rd] CRAN mirror size mushrooming; consider archiving some?

2011-07-26 Thread Jeffrey Ryan
Or one could buy an iPod and host it from there ;-)

160 GB for US$250.  Uwe's plan is probably better though...

Jeff

On Tue, Jul 26, 2011 at 5:08 PM, Hadley Wickham  wrote:

> >> I'm setting up a new CRAN mirror and filled up the disk space the
> >> server allotted me.  I asked for more, then filled that up.  Now the
> >> system administrators want me to buy an $800 fiber channel card and a
> >> storage device.  I'm going to do that, but it does make me want to
> >> suggest to you that this is a problem.
> >
> > Why? Just for the mirror? That's nonsense. A 6 year old outdated desktop
> > machine (say upgraded to 2GB RAM) with a 1T harddisc for 50$ should be
> > fine for your first tries. The bottleneck will probably be your network
> > connection rather than the storage.
>
> Another perspective is that it costs ~$10 / month to store 68 Gb of
> data on amazon's S3.  And then you pay 12c / GB for download.
>
> Hadley
>
> --
> Assistant Professor / Dobelman Family Junior Chair
> Department of Statistics / Rice University
> http://had.co.nz/
>



-- 
Jeffrey Ryan
jeffrey.r...@lemnica.com

www.lemnica.com
www.esotericR.com



Re: [Rd] CRAN mirror size mushrooming; consider archiving some?

2011-07-26 Thread Hadley Wickham
>> I'm setting up a new CRAN mirror and filled up the disk space the
>> server allotted me.  I asked for more, then filled that up.  Now the
>> system administrators want me to buy an $800 fiber channel card and a
>> storage device.  I'm going to do that, but it does make me want to
>> suggest to you that this is a problem.
>
> Why? Just for the mirror? That's nonsense. A 6 year old outdated desktop
> machine (say upgraded to 2GB RAM) with a 1T harddisc for 50$ should be fine
> for your first tries. The bottleneck will probably be your network
> connection rather than the storage.

Another perspective is that it costs ~$10 / month to store 68 GB of
data on Amazon's S3.  And then you pay 12c / GB for download.

Hadley

-- 
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/



Re: [Rd] default par

2011-07-26 Thread Greg Snow
For number 1, one option is to use the setHook function with the plot.new
hook.  This lets you register a function that is called before every new plot
is created; that function can call par with the options you want, which sets
the parameters on all devices.  However, it could cause problems if you ever
want to change those values for a single plot: your call to par would be
overwritten by the hook function.
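
A minimal sketch of the hook approach.  It uses the "before.plot.new" hook
documented in ?plot.new; the function name set_par_defaults and the choice of
las = 1 are illustrative assumptions.

```r
# Hypothetical helper: the par() defaults wanted on every new plot.
set_par_defaults <- function() {
  graphics::par(las = 1)  # axis labels always horizontal
}

# Register the helper so plot.new() calls it before starting each plot.
# The hook function is called with no arguments.
setHook("before.plot.new", set_par_defaults, action = "append")
```

As noted above, the hook runs on every new plot, so a manual par(las = 0)
made beforehand is reset the next time plot.new() is called.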

For number 2, S-PLUS did warn by default when points fell outside the
plotting region.  This was annoying when people intentionally used the limits
to look at only part of the data, so I don't think it would be popular to
bring this behavior back in general.  You can use the zoomplot function in the
TeachingDemos package to expand the range of your current plot to show data
that was outside the limits, or I believe that ggplot2 expands plots
automatically to include all the data (unless you limit the range in the
call).  You could also write your own points or plot function that checks the
range, gives a warning, and then calls the regular points or plot function.
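
A sketch of such a wrapper, under the assumption that both axes are linear
(with log axes, par("usr") is on the log10 scale and the comparison below
would need adjusting).  The name points_checked is hypothetical.

```r
# Wrapper around points() that warns when data fall outside the plot region.
points_checked <- function(x, y, ...) {
  usr <- graphics::par("usr")  # c(x1, x2, y1, y2) in user coordinates
  outside <- x < min(usr[1:2]) | x > max(usr[1:2]) |
             y < min(usr[3:4]) | y > max(usr[3:4])
  n_out <- sum(outside, na.rm = TRUE)
  if (n_out > 0) {
    warning(n_out, " of ", length(outside),
            " point(s) lie outside the plotting region")
  }
  graphics::points(x, y, ...)
}
```

With this in place, plot(1:10); points_checked(11, 3) would still draw
nothing visible but would issue a warning explaining why.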

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


> -----Original Message-----
> From: r-devel-boun...@r-project.org [mailto:r-devel-bounces@r-
> project.org] On Behalf Of Berry Boessenkool
> Sent: Friday, July 22, 2011 7:47 AM
> To: r-devel@r-project.org
> Subject: [Rd] default par
> 
> 
> 
> Hello dear R-developers,
> 
> two questions on an otherwise magnificent program:
> 
> 1)
> Is there a way to set defaults for par differently than R offers
> normally?
> I for example would like to have las default to 1. (or in the same
> style, sometimes type in plot() could be "l" per default).
> 
> The following post describes pretty much exactly that:
> https://stat.ethz.ch/pipermail/r-help/2007-March/126646.html
> It was written four years ago, but it seems like there has been no real
> elegant solution.
> Did I just miss something there? If so, could someone give me an
> update?
> If not, is there a chance that such a feature  would be added to future
> R-versions?
> I could live with the idea to assign the par$element default in
> Rprofile.site.
> 
> 2)
> Would it appear sensible to have R give a warning, when points() is
> used, and some/all values are out of plotting range in the active
> device?
> It has happened some times that I needed quite a bit of time to figure
> out why nothing was plotted.
> Such a warning (or maybe even a beep?) would give users the clue to
> look at the values right away...
> (What I mean is this:    plot(1:10)  ; points(11,3)    just in case
> it's unclear)
> 
> 
> Thanks ahead for pondering, and again: R is the most beautiful thing I
> discovered in the last three years.
> Keep up the good work!
> 
> Berry
> 
> -
> Berry Boessenkool
> University of Potsdam, Germany
> -
> 


Re: [Rd] R result objects as lists

2011-07-26 Thread Duncan Murdoch

On 26/07/2011 11:26 AM, Thorsten wrote:
> Hello List,
> I want to communicate between a minimalistic lisp that has
> only numbers, symbols (also used as strings) and lists as datatypes, and R.
>
> It should be no problem to send command strings from the lisp process
> to the R childprocess.
>
> I know, R is mostly implemented in Scheme,

No, the design of the original R interpreter was based on a Scheme
interpreter, but it is mostly implemented in C.

> and I read recently, that
> these special return objects of R are really lists under the
> hood. Therefore my questions:
>
> 1. When I send a command from a lisp (that is not elisp) to an R
> subprocess, how can I receive the R result object as a list (and not a
> special R object)?
>
> 2. Apart from graphics - are all R result objects lists (or numbers or
> strings)? That is, is it safe to assume that the result of an R call
> will always be either a number, a string or a list (under the hood)?

No, you need to treat the results as C structures under the hood.  Some
are implemented as Lisp-like lists, but most are vectors with additional
information about the type of object that is contained within (in a
C-style array).


Duncan Murdoch
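
The point about typed vectors can be seen from R itself with typeof().  The
second half is only a sketch of one possible way to hand results to a Lisp
reader as text; to_sexp is a hypothetical helper, not an existing R function,
and it handles only NULL, lists and atomic vectors.

```r
# Results are typed objects, not uniformly lists.
fit <- lm(dist ~ speed, data = cars)
typeof(fit)            # "list"
typeof(coef(fit))      # "double"
typeof(1:3)            # "integer"
typeof(quote(x + y))   # "language"

# Hypothetical serializer: nested lists and atomic vectors to an s-expression.
to_sexp <- function(x) {
  if (is.null(x)) return("nil")
  if (is.list(x)) {
    parts <- vapply(x, to_sexp, character(1), USE.NAMES = FALSE)
    return(paste0("(", paste(parts, collapse = " "), ")"))
  }
  if (!is.atomic(x)) stop("unsupported type: ", typeof(x))
  # Quote strings; everything else is printed via as.character().
  atoms <- if (is.character(x)) encodeString(x, quote = "\"") else as.character(x)
  if (length(atoms) == 1L) atoms else paste0("(", paste(atoms, collapse = " "), ")")
}

cat(to_sexp(list(1, "a", c(2, 3))), "\n")
```

Names, attributes and classes are dropped by this sketch, which is exactly
the "additional information" a real bridge would have to carry across.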



Re: [Rd] plot.function documentation/export?

2011-07-26 Thread Ben Bolker
  OK, I see that BDR did this on 2011-06-08 -- I was getting confused by
looking at the code of the development version but running the release
version.

  Thanks.

   Ben


On 07/26/2011 02:33 PM, Uwe Ligges wrote:
> Now I see the difference: I was using R-devel and that worked as you
> expected.
> 
> Best,
> Uwe Ligges
> 
> On 25.07.2011 19:01, Ben Bolker wrote:
>> On 07/25/2011 12:55 PM, Uwe Ligges wrote:
>>>
>>> On 25.07.2011 17:45, Ben Bolker wrote:
>>>> I recently suggested to someone (
>>>> http://stackoverflow.com/questions/6789055/r-inconsistency-why-add-t-sometimes-works-and-sometimes-not-in-the-plot-functi/6789098#6789098
>>>> ) that they should use methods("plot") or methods(class="function") to
>>>> locate the documentation on the plot method for objects of class
>>>> "function", but they pointed out that these don't actually work.
>>>>
>>>> I can't figure out why not: src/library/graphics/man/curve.Rd
>>>> contains the line
>>>>
>>>> \method{plot}{function}(x, y = 0, to = 1, from = y, xlim = NULL, ylab =
>>>> NULL, \dots)
>>>>
>>>> and src/library/graphics/DESCRIPTION contains
>>>
>>> you mean the following line is in NAMESPACE rather than DESCRIPTION.
>>>
>>>> S3method(plot, "function")
>>
>>    Yes, sorry.
>>
>>>> [presumably the extra quotes are in there because function is a
>>>> reserved word?]
>>>>
>>>> I'm not sure where else the information should be.  Searching
>>>> around in the code tree for information on tail.function (which is
>>>> listed in the methods:
>>>>
>>>> methods(class="function")
>>>> [1] as.list.function head.function*   print.function   tail.function*
>>>>
>>>> I find the same S3method syntax, so I guess the quotation marks aren't
>>>> the problem ...
>>>
>>> ?tail.function
>>>
>>> tells us this one is from package "utils" and you can search for this
>>> function in the sources of the utils package
>>>
>>> Or you could ask for
>>>
>>> getAnywhere("tail.function")
>>>
>>> and R tells you
>>>
>>> A single object matching tail.function was found
>>> It was found in the following places
>>>    registered S3 method for tail from namespace utils
>>>    namespace:utils
>>> [.]
>>>
>>> Best wishes,
>>> Uwe
>>
>>    Sorry, I didn't frame my question very clearly.  I can find
>> "tail.function" just fine, or I could if I wanted to.   What I don't
>> know is why methods("plot") and methods(class="function") don't list
>> "plot.function" even though its documentation and setup seem to be
>> similar to "tail.function", which *does* show up in
>> methods(class="function") ...
>>
>>    cheers
>>  Ben Bolker
>>
>> =
>>
>>    No plot.function listing in either of these ...
>>
>> library("graphics")
>> methods("plot")
>>  [1] plot.acf*           plot.data.frame*    plot.decomposed.ts*
>>  [4] plot.default        plot.dendrogram*    plot.density
>>  [7] plot.ecdf           plot.factor*        plot.formula*
>> [10] plot.hclust*        plot.histogram*     plot.HoltWinters*
>> [13] plot.isoreg*        plot.lm             plot.medpolish*
>> [16] plot.mlm            plot.ppr*           plot.prcomp*
>> [19] plot.princomp*      plot.profile.nls*   plot.spec
>> [22] plot.spec.coherency plot.spec.phase     plot.stepfun
>> [25] plot.stl*           plot.table*         plot.ts
>> [28] plot.tskernel*      plot.TukeyHSD
>>
>> Non-visible functions are asterisked
>> methods(class="function")
>> [1] as.list.function head.function*   print.function   tail.function*
>>
>> Non-visible functions are asterisked


Re: [Rd] plot.function documentation/export?

2011-07-26 Thread Uwe Ligges
Now I see the difference: I was using R-devel and that worked as you 
expected.


Best,
Uwe Ligges

On 25.07.2011 19:01, Ben Bolker wrote:

> On 07/25/2011 12:55 PM, Uwe Ligges wrote:
>>
>> On 25.07.2011 17:45, Ben Bolker wrote:
>>> I recently suggested to someone (
>>> http://stackoverflow.com/questions/6789055/r-inconsistency-why-add-t-sometimes-works-and-sometimes-not-in-the-plot-functi/6789098#6789098
>>> ) that they should use methods("plot") or methods(class="function") to
>>> locate the documentation on the plot method for objects of class
>>> "function", but they pointed out that these don't actually work.
>>>
>>> I can't figure out why not: src/library/graphics/man/curve.Rd contains
>>> the line
>>>
>>> \method{plot}{function}(x, y = 0, to = 1, from = y, xlim = NULL, ylab =
>>> NULL, \dots)
>>>
>>> and src/library/graphics/DESCRIPTION contains
>>
>> you mean the following line is in NAMESPACE rather than DESCRIPTION.
>>
>>> S3method(plot, "function")
>
>    Yes, sorry.
>
>>> [presumably the extra quotes are in there because function is a
>>> reserved word?]
>>>
>>> I'm not sure where else the information should be.  Searching around in
>>> the code tree for information on tail.function (which is listed in the
>>> methods:
>>>
>>> methods(class="function")
>>> [1] as.list.function head.function*   print.function   tail.function*
>>>
>>> I find the same S3method syntax, so I guess the quotation marks aren't
>>> the problem ...
>>
>> ?tail.function
>>
>> tells us this one is from package "utils" and you can search for this
>> function in the sources of the utils package
>>
>> Or you could ask for
>>
>> getAnywhere("tail.function")
>>
>> and R tells you
>>
>> A single object matching tail.function was found
>> It was found in the following places
>>    registered S3 method for tail from namespace utils
>>    namespace:utils
>> [.]
>>
>> Best wishes,
>> Uwe
>
>    Sorry, I didn't frame my question very clearly.  I can find
> "tail.function" just fine, or I could if I wanted to.   What I don't
> know is why methods("plot") and methods(class="function") don't list
> "plot.function" even though its documentation and setup seem to be
> similar to "tail.function", which *does* show up in
> methods(class="function") ...
>
>    cheers
>  Ben Bolker
>
> =
>
>    No plot.function listing in either of these ...
>
> library("graphics")
> methods("plot")
>  [1] plot.acf*           plot.data.frame*    plot.decomposed.ts*
>  [4] plot.default        plot.dendrogram*    plot.density
>  [7] plot.ecdf           plot.factor*        plot.formula*
> [10] plot.hclust*        plot.histogram*     plot.HoltWinters*
> [13] plot.isoreg*        plot.lm             plot.medpolish*
> [16] plot.mlm            plot.ppr*           plot.prcomp*
> [19] plot.princomp*      plot.profile.nls*   plot.spec
> [22] plot.spec.coherency plot.spec.phase     plot.stepfun
> [25] plot.stl*           plot.table*         plot.ts
> [28] plot.tskernel*      plot.TukeyHSD
>
> Non-visible functions are asterisked
> methods(class="function")
> [1] as.list.function head.function*   print.function   tail.function*
>
> Non-visible functions are asterisked
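
For readers who want to check their own installation, a short sketch of how
to ask R directly whether an S3 method is registered, rather than relying on
the methods() listing.  It assumes a version of R in which plot.function is
registered, as reported for R-devel in this thread.

```r
# Look up the method by generic and class; optional = TRUE returns NULL
# instead of raising an error when no such method is registered.
m <- getS3method("plot", "function", optional = TRUE)

if (is.null(m)) {
  message("plot.function is not registered in this version of R")
} else {
  message("plot.function found in ", environmentName(environment(m)))
}
```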





Re: [Rd] CRAN mirror size mushrooming; consider archiving some?

2011-07-26 Thread Uwe Ligges



On 25.07.2011 19:47, Paul Johnson wrote:
> Hi, everybody
>
> I'm setting up a new CRAN mirror and filled up the disk space the
> server allotted me.  I asked for more, then filled that up.  Now the
> system administrators want me to buy an $800 fiber channel card and a
> storage device.  I'm going to do that, but it does make me want to
> suggest to you that this is a problem.

Why? Just for the mirror? That's nonsense. A 6 year old outdated desktop
machine (say upgraded to 2GB RAM) with a 1T harddisc for 50$ should be
fine for your first tries. The bottleneck will probably be your network
connection rather than the storage.

> CRAN now is about 68GB, and about 3/4 of that is in the bin folder,
> where one finds copies of compiled packages for macosx and windows.
> If the administrators of CRAN would move the packages for R before,
> say, 2.12, to long term storage, then mirror management would be a bit
> more, well, manageable.
>
> Moving the R for windows packages for, say, R 2.0 through 2.10 would
> save some space, and possibly establish a useful precedent for the
> long term.

That is right, but then users of R < 2.11.0 could no longer use
install.packages() and friends. If we want to move stuff around in
future, we may want to implement that in R first. We thought about
removing old binaries before, but disk capacity has grown roughly as
exponentially as the repository, so we decided to leave things as they
are.

> Here's the bin/windows folder. Note it is expanding exponentially (or nearly so)

And you see that quite a lot of effort was made during the last
release cycles to reduce the amount of space used (e.g. using better
compression).

Best wishes,
Uwe

> $ du --max-depth=1 | sort
> 1012644 ./2.6
> 103504  ./1.7
> 122200  ./1.8
> 1239876 ./2.7
> 1487024 ./2.8
> 15220   ./ATLAS
> 167668  ./1.9
> 17921604        .
> 1866196 ./2.9
> 204392  ./2.0
> 2207708 ./2.10
> 2340120 ./2.13
> 2356272 ./2.12
> 2403176 ./2.11
> 298620  ./2.1
> 364292  ./2.2
> 438044  ./2.3
> 595920  ./2.4
> 698064  ./2.5
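
The listing is sorted as text, so the trend is hard to read.  A small sketch
that re-orders the same figures by R version and looks at the growth; the
sizes are copied from the du output above (du's default unit of 1 KB blocks
is an assumption), and the variable names are illustrative.

```r
# Sizes of bin/windows per R version, as reported by du above.
kb <- c("1.7" = 103504,  "1.8" = 122200,  "1.9" = 167668,
        "2.0" = 204392,  "2.1" = 298620,  "2.2" = 364292,
        "2.3" = 438044,  "2.4" = 595920,  "2.5" = 698064,
        "2.6" = 1012644, "2.7" = 1239876, "2.8" = 1487024,
        "2.9" = 1866196, "2.10" = 2207708, "2.11" = 2403176,
        "2.12" = 2356272, "2.13" = 2340120)

# Ratio of each version's size to the previous one.
growth <- kb[-1] / kb[-length(kb)]
print(round(growth, 2))

# Space that archiving R 2.0 through 2.10 would free, as a share of the
# directory total reported by du.
old_kb <- sum(kb[paste0("2.", 0:10)])
print(round(old_kb / 17921604, 2))
```

The ratios stay above 1 up to 2.11 and dip slightly below 1 for 2.12 and
2.13, which fits the remark about better compression in recent releases.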





[Rd] R result objects as lists

2011-07-26 Thread Thorsten
Hello List,
I want to communicate between a minimalistic lisp that has
only numbers, symbols (also used as strings) and lists as datatypes, and R.

It should be no problem to send command strings from the lisp process
to the R childprocess.  

I know, R is mostly implemented in Scheme, and I read recently, that
these special return objects of R are really lists under the
hood. Therefore my questions:

1. When I send a command from a lisp (that is not elisp) to an R
subprocess, how can I receive the R result object as a list (and not a
special R object)?

2. Apart from graphics - are all R result objects lists (or numbers or
strings)? That is, is it safe to assume that the result of an R call
will always be either a number, a string or a list (under the hood)?

Cheers
Thorsten



Re: [Rd] --max-vsize

2011-07-26 Thread Christophe Rhodes
Prof Brian Ripley  writes:

> Point 1 is as documented: you have exceeded the maximum integer and it
> does say that it gives NA.  So the only 'odd' is reporting that you
> did not read the documentation.

I'm sorry; I thought that my message made it clear that I was aware that
the NA came from exceeding the maximum representable integer.  To
belatedly address the other information I failed to provide, I use R on
Linux, both 32-bit and 64-bit (with 64-bit R).

> Point 2 is R not using the correct units for --max-vsize (it used the
> number of Vcells, as was once documented), and I have fixed.

Thank you; I've read the changes and I think they meet my needs.  (I
will try to explain how/why I want to use larger-than-integer
mem.limits() below.  If there's a better or more supported way to
achieve what I want, that'd be fine too)

> But I do wonder why you are using --max-vsize: the documentation says
> it is very rarely needed, and I suspect that there are better ways to
> do this.

Here's the basic idea: I would like to be able to restrict R to a large
amount of memory (say 4GB, for the sake of argument), but in a way such
that I can increase that limit temporarily if it turns out to be
necessary for some reason.

The desire for a restriction is that I have found it fairly difficult to
predict in advance how much memory a given calculation or analysis is
going to take.  Part of that is my inexperience with R, leading to
hilarious thinkos, but I think that part of that difficulty to predict
is going to remain even as I gain experience.  I use R both on
multi-user systems and on single-user-multiple-use systems, and in both
cases it is usually bad if my R session causes the machine to swap;
usually that swapping is not the result of a desired computation -- most
often, it's from a straightforward mistake -- but it can take
substantial amounts of time for the machine to respond to aborts or kill
requests, and usually if the process grows enough to touch swap it will
continue growing beyond the swap limit too.

So, why not simply slap on an address-space ulimit instead (that being
the kind of ulimit in Linux that actually works...)?  Well, one reason
is that it then becomes necessary to estimate at the start of an R
session how much memory will be needed over the lifetime of that
session; guess too low, and at some point later (maybe days or even
weeks later) I might get a failure to allocate.  My options at that
stage would be to save the workspace and restart the session with a
higher limit, or attempt to delete enough things from the existing
workspace to allow the allocation to succeed.  (Have I missed anything?)
Saving and restarting will take substantial time (from writing ~4GB to
disk) while deleting things from the existing session involves cognitive
overhead that is irrelevant to my current investigation and may in any
case not free enough memory.

So, being able to raise the limit to something generally large for a
short time to perform a computation, get the results, and then lower the
limit again allows me to protect myself in general from overwhelming the
machine with mistaken computations, while also allowing in specific
cases the ability to dedicate more resources to a particular
computation.

> I don't find reporting values of several GB as bytes very useful, but
> then mem.limits() is not useful to me either 

Ah, I'm not particularly interested in the reporting side of
mem.limits() :-); the setting side, on the other hand, very much so.

Thank you again for the fixes.

Best,

Christophe



Re: [Rd] --max-vsize

2011-07-26 Thread Prof Brian Ripley
Point 1 is as documented: you have exceeded the maximum integer and it 
does say that it gives NA.  So the only 'odd' is reporting that you 
did not read the documentation.


Point 2 is R not using the correct units for --max-vsize (it used the 
number of Vcells, as was once documented), which I have now fixed.


But I do wonder why you are using --max-vsize: the documentation says 
it is very rarely needed, and I suspect that there are better ways to 
do this.


Also, you ignored the posting guide and did not tell us the 'at a 
minimum' information requested: what OS was this, and was it a 32- or 
64-bit R if a 64-bit OS?


I don't find reporting values of several GB as bytes very useful, but 
then mem.limits() is not useful to me either 


On Thu, 21 Jul 2011, Christophe Rhodes wrote:


> Hi,
>
> In both R 2.13 and the SVN trunk, I observe odd behaviour with the
> --max-vsize command-line argument:
>
> 1. passing a largeish value (about 260M or greater) makes mem.limits()
>   report NA for the vsize limit; gc() continues to report a value...
>
> 2. ...but that value (and the actual limit) is wrong by a factor of 8.
>
> I attach a patch for issue 2, lightly tested.  I believe that fixing
> issue 1 involves changing the return convention of do_memlimits -- not
> returning a specialized integer vector, but a more general numeric; I
> wasn't confident to do that.
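
The two symptoms fit together arithmetically.  A Vcell is 8 bytes (see
?Memory), so a limit counted in Vcells but reported in bytes passes the
largest 32-bit integer at 2^28 cells, i.e. 256M, which seems consistent with
the "about 260M" threshold and the factor of 8.  The sketch below only
illustrates that arithmetic; it is not the code in R's sources, and the
variable names are made up.

```r
vcell_bytes <- 8  # size of one Vcell in bytes, per ?Memory

# Smallest Vcell count whose size in bytes no longer fits in an integer.
threshold_cells <- ceiling((.Machine$integer.max + 1) / vcell_bytes)
print(threshold_cells / 2^20)  # threshold expressed in units of "M"

# Coercing a byte count beyond the integer range yields NA (with a warning),
# matching the NA reported for the vsize limit.
reported <- suppressWarnings(as.integer(threshold_cells * vcell_bytes))
print(reported)
```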




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595
