Re: [Rd] Simplify and By Convert Factors To Numeric Values

2017-06-16 Thread Charles C. Berry

On Fri, 16 Jun 2017, Dario Strbenac wrote:


Good day,



It's not described anywhere in the help page, but tapply and by 
functions will, by default, convert factors into numeric values. Perhaps 
this needs to be documented or the behaviour changed.



It *is* described in the help page.

This returns a list of objects, each of class "factor":

tapply(rep(1:2,2), rep(1:2,2),
  function(x) factor(LETTERS[x], levels = LETTERS))

and this




tapply(1:3, 1:3, function(x) factor(LETTERS[x], levels = LETTERS))

1 2 3
1 2 3


returns a vector object with no class.





The documentation states "... tapply returns a multi-way array 
containing the values ..." but doesn't mention anything about converting 
factors into integers. I'd expect the values to be of the same type.


and also states

"If FUN returns a single atomic value for each such cell ... and when 
simplify is TRUE ...  if the return value has a class (e.g., an object of 
class "Date") the class is discarded."


which is what just happened in your example.

Maybe you want:

unlist(tapply(1:3, 1:3, function(x) factor(LETTERS[x],
  levels = LETTERS),simplify=FALSE))

Trying to preserve class worked here in a way you might have 
hoped/expected, but might lead to difficulties in other uses.
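
A minimal sketch of the difference, reusing the toy factor from the examples above. The names make_factor, simplified and kept are illustrative only and do not come from the thread.

```r
make_factor <- function(x) factor(LETTERS[x], levels = LETTERS)

# Default simplify = TRUE: one atomic value per cell, so the class is discarded
simplified <- tapply(1:3, 1:3, make_factor)

# simplify = FALSE keeps each cell as a factor; unlist() then combines them
kept <- unlist(tapply(1:3, 1:3, make_factor, simplify = FALSE))

is.factor(simplified)
is.factor(kept)
as.character(kept)
```

As noted above, relying on unlist() to rebuild the class happens to work for factors and may not carry over to other classes.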


HTH,

Chuck

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] A trap for young players with the lapply() function.

2017-03-27 Thread Charles C. Berry

On Mon, 27 Mar 2017, Rolf Turner wrote:



From time to time I get myself into a state of bewilderment when using
apply() by calling it with FUN equal to a function which has an "optional" 
argument named "X".


E.g.

   xxx <- lapply(y,function(x,X){cos(x*X)},X=2*pi)

which produces the error message


Error in get(as.character(FUN), mode = "function", envir = envir) :
  object 'y' of mode 'function' was not found


This of course happens because the name of the first argument of lapply() is 
"X" and so it takes the value of this first argument to be the supplied X 
(2*pi in the foregoing example) and then expects what the user has denoted by 
"y" to be the value of FUN, and (obviously!) it isn't.




The lapply help page addresses this issue in `Details' :

"it is good practice to name the first two arguments X and FUN if ... is 
passed through: this both avoids partial matching to FUN and ensures that 
a sensible error message is given if arguments named X or FUN are passed 
through ..."


So that advice suggests something like:

xxx <- lapply( X=y, FUN=function(X,x){cos(X*x)}, x=2*pi )
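
A small sketch contrasting the two calls. The input vector y is made up for illustration; the failing call is wrapped in tryCatch() so the error message can be inspected rather than stopping the script.

```r
y <- c(0, 0.25, 0.5)

# Positional call: X = 2*pi is matched to lapply's own X argument, so y ends
# up in the FUN slot and the call fails.
bad <- tryCatch(
  lapply(y, function(x, X) cos(x * X), X = 2 * pi),
  error = function(e) conditionMessage(e)
)

# Naming X and FUN lets the extra argument x pass through `...` untouched.
good <- lapply(X = y, FUN = function(X, x) cos(X * x), x = 2 * pi)

bad
unlist(good)
```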

Best,

Chuck



Re: [Rd] How do I reliably and efficiently hash a function?

2015-12-10 Thread Charles C. Berry

On Thu, 10 Dec 2015, Konrad Rudolph wrote:


I’ve got the following scenario: I need to store information about an
R function, and retrieve it at a later point. In other programming
languages I’d implement this using a dictionary with the functions as
keys. In R, I’d usually use `attr(f, 'some-name')`. However, for my
purposes I do not want to use `attr` because the information that I
want to store is an implementation detail that should be hidden from
the user of the function (and, just as importantly, it shouldn’t
clutter the display when the function is printed on the console).

`comment` would be almost perfect since it’s hidden from the output
when printing a function — unfortunately, the information I’m storing
is not a character string (it’s in fact an environment), so I cannot
use `comment`.

How can this be achieved?



See

https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Scope

For example, these commands:

foo <- function() {info <- "abc";function(x) x+1}
func <- foo()
find("func")
func(1)
ls(envir=environment(func))
get("info",environment(func))
func

Yield these printed results:

: [1] ".GlobalEnv"
: [1] 2
: [1] "info"
: [1] "abc"
: function (x)
: x + 1
: <environment: 0x...>

The environment of the function gets printed, but 'info' and other
objects that might exist in that environment do not get printed unless
you explicitly call for them.

HTH,

Chuck

p.s. 'environment(func)$info' also works.
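
Since the original question was about attaching an environment rather than a string, here is a hedged variation on the same idea. The names make_tagged, payload and created_by are hypothetical and only serve the illustration.

```r
make_tagged <- function(info) {
  force(info)            # make sure info is stored in the enclosing frame
  function(x) x + 1
}

payload <- new.env()
assign("created_by", "make_tagged", envir = payload)

func <- make_tagged(payload)
func(1)

# The payload is reachable through the function's enclosing environment
hidden <- environment(func)$info
get("created_by", envir = hidden)
```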

Re: [Rd] R CMD build failure

2015-07-09 Thread Charles C. Berry

On Thu, 9 Jul 2015, Therneau, Terry M., Ph.D. wrote:


I have a local library 'dart' that imports httr.


[snip `R CMD build' can't find dart]



Any ideas?  There is no mention in the Writing R Extensions manual that it
ignores the Rprofile file.


Terry,


From WRE:


1.3 Checking and building packages

...

Note: R CMD check and R CMD build run R processes with --vanilla in which 
none of the user’s startup files are read. If you need R_LIBS set (to find 
packages in a non-standard library) you can set it in the environment: 
also you can use the check and build environment files (as specified by 
the environment variables R_CHECK_ENVIRON and R_BUILD_ENVIRON; if unset, 
files ~/.R/check.Renviron and ~/.R/build.Renviron are used) to set 
environment variables when using these utilities.


And from ?Startup


The command-line option --vanilla implies --no-site-file, --no-init-file, 
--no-environ and (except for R CMD) --no-restore



HTH,

Chuck


Re: [Rd] Different behavior of model.matrix between R 3.2 and R3.1.1

2015-06-16 Thread Charles C. Berry

On Tue, 16 Jun 2015, Frank Harrell wrote:

Terry Therneau has been very helpful on r-help but we can't figure out what 
change in R in the past months made extra columns appear in model.matrix when 
the terms object is subsetted to remove stratification factors in a Cox 
model.  Terry has changed his logic in the survival package to avoid this 
issue but he requires generating a larger design matrix then dropping 
columns.


A simple example is below.


strat <- function(x) x
d <- expand.grid(a=c('a1','a2'), b=c('b1','b2'))
d$y <- c(1,3,2,4)
f <- y ~ a * strat(b)
m <- model.frame(f, data=d)
Terms <- drop.terms(terms(f, data=d), 2)
model.matrix(Terms, m)

  (Intercept) aa2 aa1:strat(b)b2 aa2:strat(b)b2
1           1   0              0              0
2           1   1              0              0
3           1   0              1              0
4           1   1              0              1
. . .

The column corresponding to a='a1' b='b2' should not be there
(aa1:strat(b)b2).

This does seem to be a change in R.  Any help appreciated.


I get the same results with Trick or Treat == R 2.15.2, so the change 
must be before late 2012.


HTH,

Chuck



[Rd] Bug in parseNamespaceFile or switch( , ... ) ?

2010-11-27 Thread Charles C. Berry


parseNamespaceFile() doesn't seem to detect misspelled directives. Looking 
at its code I see


switch(as.character(e[[1L]]),

       lots of args omitted here,

       stop(gettextf("unknown namespace directive: %s",
                     deparse(e)), call. = FALSE, domain = NA))

but this doesn't seem to function as I expect, viz. to stop with an error 
if I type a wrong directive.


Details:

# create dummy NAMESPACE file with two bad / one good directives
cat("blah( nada )\nblee( nil )\nexport( outDS )\n",file="NAMESPACE")
readLines("NAMESPACE")

[1] "blah( nada )"    "blee( nil )"     "export( outDS )"

parseNamespaceFile("",".") # now parse it

$imports
list()

$exports
[1] "outDS"

$exportPatterns
character(0)

$importClasses
list()

$importMethods
list()

$exportClasses
character(0)

$exportMethods
character(0)

$exportClassPatterns
character(0)

$dynlibs
character(0)

$nativeRoutines
list()

$S3methods
 [,1] [,2] [,3]





So, it picked up 'export' and ignored the other two lines.
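
For reference, the behaviour I expected from a switch() with a trailing unnamed default can be sketched as follows. check_directive is a hypothetical stand-in written for this illustration; it is not the parseNamespaceFile source and says nothing about where the real code goes wrong.

```r
# Hypothetical directive checker with an unnamed default branch
check_directive <- function(name) {
  switch(name,
         export = "export seen",
         import = "import seen",
         stop(gettextf("unknown namespace directive: %s", name),
              call. = FALSE, domain = NA))
}

ok  <- check_directive("export")
bad <- tryCatch(check_directive("blah"),
                error = function(e) conditionMessage(e))

ok
bad
```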


Chuck

p.s.


sessionInfo()

R version 2.12.0 (2010-10-15)
Platform: i386-apple-darwin9.8.0/i386 (32-bit)

locale:
[1] C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base








Charles C. Berry                            Dept of Family/Preventive Medicine
cbe...@tajo.ucsd.edu                        UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [Rd] Request: kronecker to get a sep= argument

2010-11-25 Thread Charles C. Berry

On Thu, 25 Nov 2010, Michael Friendly wrote:


kronecker, with make.dimnames=TRUE, uses a hardwired sep=":" in the line

   tmp <- outer(dnx[[i]], dny[[i]], FUN = paste, sep = ":")

For an application in which dimnames arise from an n-way array, where
different dimensions have different roles, I would like to be able to use
kronecker in the form

kronecker(A, B, make.dimnames=TRUE, sep='/')

All this requires is to change the following two lines:

kronecker <- function (X, Y, FUN = "*", make.dimnames = FALSE, sep=":", ...)
{
 ...
   tmp <- outer(dnx[[i]], dny[[i]], FUN = paste, sep = sep)
}


Otherwise, I have to reproduce the logic inside kronecker() in my application 
function.


Or add one line of code:

res <- kronecker(m1,m3,make.dimnames=T)
dimnames(res) <-
lapply( dimnames(res), sub, pattern=":", replacement="/" )
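
A runnable sketch of that post-processing step. The matrices m1 and m3 are not defined in the thread, so the ones below are made-up inputs with dimnames on both margins.

```r
# Hypothetical inputs
m1 <- matrix(1:4, 2, 2, dimnames = list(c("a", "b"), c("p", "q")))
m3 <- matrix(1, 2, 1, dimnames = list(c("c", "d"), "r"))

res <- kronecker(m1, m3, make.dimnames = TRUE)
rownames(res)   # names joined with ":"

# Replace the hardwired separator after the fact
dimnames(res) <- lapply(dimnames(res), sub, pattern = ":", replacement = "/")
rownames(res)
colnames(res)
```

Note that sub() replaces only the first ":" in each name, so original names that themselves contain ":" would need more care.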

HTH,

Chuck

p.s. your suggestion could break code that others may have written like

kronecker( letters[1:3], diag(LETTERS[1:2]), paste, sep='*' )




-Michael

--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



Re: [Rd] Request: kronecker to get a sep= argument

2010-11-25 Thread Charles C. Berry

On Thu, 25 Nov 2010, Charles C. Berry wrote:


On Thu, 25 Nov 2010, Michael Friendly wrote:


 kronecker, with make.dimnames=TRUE, uses a hardwired sep=":" in the line

    tmp <- outer(dnx[[i]], dny[[i]], FUN = paste, sep = ":")

 For an application in which dimnames arise from an n-way array, where
 different dimensions have different roles, I would like to be able to use
 kronecker in the form

 kronecker(A, B, make.dimnames=TRUE, sep='/')

 All this requires is to change the following two lines:

 kronecker <- function (X, Y, FUN = "*", make.dimnames = FALSE, sep=":", ...)
 {
  ...
    tmp <- outer(dnx[[i]], dny[[i]], FUN = paste, sep = sep)
 }


 Otherwise, I have to reproduce the logic inside kronecker() in my
 application function.


Or add one line of code:

res <- kronecker(m1,m3,make.dimnames=T)
dimnames(res) <-
 lapply( dimnames(res), sub, pattern=":", replacement="/" )

HTH,

Chuck

p.s. your suggestion could break code that others may have written like

kronecker( letters[1:3], diag(LETTERS[1:2]), paste, sep='*' )


make that

kronecker( letters[1:3], diag(LETTERS[1:2]), paste )







 -Michael

 --
 Michael Friendly Email: friendly AT yorku DOT ca
 Professor, Psychology Dept.
 York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
 4700 Keele StreetWeb:   http://www.datavis.ca
 Toronto, ONT  M3J 1P3 CANADA



Re: [Rd] Bug in agrep computing edit distance?

2010-11-17 Thread Charles C. Berry

On Wed, 17 Nov 2010, Dickison, Daniel wrote:


I downloaded and compiled the standalone TRE agrep command line program,
and I think I have a slightly better idea of what's going on.  Basically
R's agrep, like the command line tool, is matching all strings that
*contain* the pattern.  So, essentially, insertions before and after the
pattern are free.

As far as I can tell, there isn't an option to require full-string matches
using the TRE library.  It should be possible to not use REG_LITERAL and
surround the pattern with ^ and $, but that would require escaping all
special characters in the original pattern.

Is this something worth pursuing?  (For my immediate needs I'll probably
create a separate function that passes the regex directly to TRE without
REG_LITERAL).



I am joining this thread late, but I wonder if reversing agrep's 'pattern' 
and 'x' args serves the OP's need.


viz.


sapply( c("x","xy","xyz","xyza"),
        function(y) any( agrep( y, "x", max=list(all=1))))

    x    xy   xyz  xyza
 TRUE  TRUE FALSE FALSE
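
The same idea as a self-contained sketch. The second part is an assumption on my side: it uses adist() from the utils package, which is not mentioned in this thread, to get the full-string edit distance directly for the strings from the original report.

```r
candidates <- c("x", "xy", "xyz", "xyza")

# Reversed roles: each candidate is the pattern, the fixed string "x" is the text
reversed <- sapply(candidates,
                   function(y) any(agrep(y, "x", max.distance = list(all = 1))))
reversed

# Assumption: adist() reports the whole-string Levenshtein distance
d <- adist("abcd", "abcxyz")[1, 1]
d
```

Because the whole candidate must now fit inside the short text, extra characters in the candidate count against the distance instead of being free.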


HTH,

Chuck



Daniel

On 11/17/10 11:47 AM, Joris Meys jorism...@gmail.com wrote:


It might have to do something with spaces and the interpretation of
insertions, as far as I understand the following examples :


agrep("x",c("x","xy","xyz","xyza"),max=list(all=1))

[1] 1 2 3 4

agrep("x",c("x","xy  ","xyz ","xyza"),max=list(all=1))

[1] 1

agrep("xx",c("xx","xyx","xyzx","xyzax",max=list(all=1)))

[1] 1 2 3 4

agrep("xx",c("xx","xyx","xyzx","xyzax",max=list(ins=1)))

[1] 1 2 3 4

agrep("xx   ",c("xx   ","xyx  ","xyzx ","xyzax",max=list(all=2)))

[1] 1

agrep("xx   ",c("xx   ","xyx  ","xyzx ","xyzax",max=list(all=3)))

[1] 1

If the sequences are made the same length in spaces, this function
gives the expected result in the second example, but it definitely
doesn't do that any more when you start playing around with
insertions. If not a bug, it definitely behaves pretty weird...

Cheers
Joris

On Wed, Nov 17, 2010 at 4:49 PM, Dickison, Daniel
ddicki...@carnegielearning.com wrote:

I posted this yesterday to r-help and Ben Bolker suggested reposting it
here...

Dickison, Daniel ddickison at carnegielearning.com writes:



The documentation for agrep says it uses the Levenshtein edit distance,
but it seems to get this wrong in certain cases when there is a
combination of deletions and substitutions.  For example:


agrep("abcd", "abcxyz", max.distance=1)

[1] 1

That should've been a no-match.  The edit distance between those strings
is 3 (1 substitution, 2 deletions), but agrep matches with
max.distance = 1.

I didn't find anything in the bug database, so I was wondering if somehow
I'm misinterpreting how agrep works.  If not, should I file this in
Bugzilla?



 Could you re-post this on r-devel?  It definitely sounds like
this is worth following up.  Based on a little bit of playing around,
it's quite clear that I don't understand what's going on.  The examples
show things like

agrep("lasy","lazy",max=list(sub=0))

 which makes sense, but

agrep("lasy","lazybc",max=1)
agrep("lasy","lazybc",max=0.001)
agrep("lasy","layt",max=list(all=1))

and

agrep("x",c("x","xy","xyz","xyza"),max=list(insertions=2))
agrep("x",c("x","xy","xyz","xyza"),max=list(deletions=2))
agrep("x",c("x","xy","xyz","xyza"),max=list(all=2))

 all give 1 2 3 4 ??

 this makes it clear that I really don't understand what's going on
based on the documentation.  I tried to trace into the C code
(which calls functions from the TRE regexp library) but that didn't
help much ...



Daniel  Dickison
Research Programmer
ddicki...@carnegielearning.com
Toll Free: (888) 851-7094 x103
FAX: (412) 690-2444

Revolutionary Math Curricula. Revolutionary Results.

Carnegie Learning, Inc. | 437 Grant St. 20th Floor | Pittsburgh, PA
15219
www.carnegielearning.com






--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php




Re: [Rd] unloading compiled code.

2010-11-16 Thread Charles C. Berry

On Tue, 16 Nov 2010, Andrew Redd wrote:


Just found in the documentation for getHook that packages are not
unloaded on quit.  How should I force a package to unload on quit?


See

?q


HTH,

Chuck



-Andrew

On Tue, Nov 16, 2010 at 10:25 AM, Andrew Redd amr...@gmail.com wrote:

Are packages unloaded on quit so that the .Last.lib or .onUnload are
called for packages?

-Andrew

On Fri, Nov 12, 2010 at 3:52 PM, Andrew Redd amr...@gmail.com wrote:

Perhaps you could help me make some sense of this.  Here is a printout
of my sessions.
---
toys$R -q

library(test2)
gpualloctest()

testing allocation on gpu
C Finished
Collecting Garbage
done.

q()

Save workspace image? [y/n/c]: n

 *** caught segfault ***
address 0x7f12ec1add50, cause 'memory not mapped'

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace
Selection: 1
aborting ...
Segmentation fault
toys$R -q

library(test2)
gpualloctest()

testing allocation on gpu
C Finished
Collecting Garbage
done.

library.dynam.unload('test2',system.file(package='test2'))
q()

Save workspace image? [y/n/c]: n
toys$
---

I have this in the test2/R/zzz.R file
---
.onUnload <- function(libpath)
   library.dynam.unload("test2", libpath)
---

so the code should be unloaded.  But it appears that it is not, judging
from the errors; when I explicitly unload test2.so it does not throw a
segfault.  Why would this be happening, and how do I circumvent it?

thanks,
Andrew


On Fri, Nov 12, 2010 at 3:32 PM, Prof Brian Ripley
rip...@stats.ox.ac.uk wrote:

On Fri, 12 Nov 2010, Andrew Redd wrote:


I have a package that I'm developing for which I need to unload the
library.  Long story short, I figured out that leaving the compiled
code loaded led to a segmentation fault, but unloading the code will
fix it.  I've read the documentation and it appears that there are
several ways to do this.  What is the proper accepted current standard
for unloading compiled code?


Depends how you loaded it: you basically reverse the process.


The options as I understand them are:
1. dyn.unload
2. library.dynam.unload
used with either
A. .Last.lib
B. .onUnload

If it makes a difference my package does use a NAMESPACE so the
package is loaded through useDynLib.


So you need an .onUnload action calling library.dynam.unload.

Slightly longer version: you need the DLL loaded whilst the namespace is in
use, so it has to be in .onUnload, and useDynLib calls library.dynam so you
need library.dynam.unload to do the housekeeping around dyn.unload which
matches what library.dynam does around dyn.load.

There are quite a lot of examples to look at, including in R itself. MASS is
one example I've just checked.

Having said all that, my experience is that unloading the DLL often does not
help if you need to load it again (and that is why e.g. tcltk does not
unload its DLL).


Thanks,
Andrew Redd




--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595









Re: [Rd] unloading compiled code.

2010-11-16 Thread Charles C. Berry

On Tue, 16 Nov 2010, Andrew Redd wrote:


so should I use reg.finalizer or overwrite .Last()?



.Last

Error: object '.Last' not found




You create your own .Last - there is nothing to overwrite.
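
A minimal sketch of such a .Last, assuming the package name test2 used earlier in this thread. The guard makes it a no-op when that namespace is not loaded, and where you define it (workspace or profile) is up to you.

```r
# q() runs .Last() before the session ends when runLast = TRUE (the default).
.Last <- function() {
  if ("test2" %in% loadedNamespaces()) {
    unloadNamespace("test2")   # triggers the package's .onUnload hook
  }
}
```

With the .onUnload shown further down the thread, unloading the namespace this way should in turn call library.dynam.unload for the package's DLL.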

Chuck


If I use reg.finalizer, what should be the environment that I specify?  The
straightforward solution would be to have a hook .onExit that a
package could specify to make sure that the code was unloaded before
the program terminates; that way I don't overwrite .Last if it has
another purpose.

-Andrew



Re: [Rd] Bug in read.table?

2010-11-05 Thread Charles C. Berry


Re: [Rd] What do you call the value that represents a missing argument?

2010-10-08 Thread Charles C. Berry

On Fri, 8 Oct 2010, Hadley Wickham wrote:


Hi all,

What's the official name for the value that represents a missing argument?

e.g.
formals(plot)$x


See ?list

It is a 'dotted pair list'

Are you looking for 'alist'?

alist handles its arguments as if they described function arguments. So 
the values are not evaluated, and tagged arguments with no value are 
allowed whereas list simply ignores them. alist is most often used in 
conjunction with formals.



alist(x=)$x==formals(plot)$x

[1] TRUE
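
A hedged sketch of how this can be used to add a formal with no default. The comparison is done with identical() rather than ==, and the function f is made up for the illustration. The empty value is never assigned to a variable here, because evaluating such a variable signals a missing-argument error.

```r
# The value for "no default" as produced by alist() and as found in formals()
same <- identical(alist(x = )$x, formals(plot)$x)
also <- identical(alist(x = )$x, quote(expr = ))

# Adding an argument without a default to an existing function
f <- function(a) a
formals(f) <- c(formals(f), alist(b = ))

names(formals(f))
f(1)
```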




HTH,

Chuck


str(formals(plot)$x)
deparse(formals(plot)$x)
is.symbol(formals(plot)$x)

What's the correct way to create an object like this?  (for example if
you are manipulating the formals of a function to add an argument with
no default value, as in http://stackoverflow.com/questions/3892580/).
as.symbol() returns an error.  Both substitute() and bquote() return
that object, but it's not obvious if this is on purpose.

Hadley


--
Assistant Professor / Dobelman Family Junior Chair
Department of Statistics / Rice University
http://had.co.nz/




Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu               UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901



Re: [Rd] Does anyone use Sweave (RweaveLatex) option expand=FALSE?

2010-08-19 Thread Charles C. Berry
 calls, rather than embedded code chunks? The reader can then see real
 code, rather than non-code, or meta-code, or whatever. Alternatively,
 represent the code chunks as R expressions, then evaluate the
 expressions at the appropriate points.

 -Matt


  So I vote strongly for retaining expand=FALSE.
 
  Best,

  Kevin
 
  Duncan Murdoch wrote:
 
   On 19/08/2010 4:29 PM, Claudia Beleites wrote:
  
I never used it.
   
I got curious, though. What would be a situation that benefits of 
this option?
   
   
   When I put it in, I thought it would be for people who were writing 
   about Sweave.
  
   Duncan Murdoch
  
  
Maybe a use case could be found by brute force (grep all .Rnw 
files on CRAN for the option?
   
Claudia
   
   
   


Re: [Rd] Should as.complex(NaN) -> NA?

2010-03-31 Thread Charles C. Berry

On Wed, 31 Mar 2010, William Dunlap wrote:


I'm having trouble grokking complex NaN's.
This first set examples using complex(re=NaN,im=NaN)
give what I expect
  Re(complex(re=NaN, im=NaN))
 [1] NaN
  Im(complex(re=NaN, im=NaN))
 [1] NaN
  Arg(complex(re=NaN, im=NaN))
 [1] NaN
  Mod(complex(re=NaN, im=NaN))
 [1] NaN
  abs(complex(re=NaN, im=NaN))
 [1] NaN
and so do the following
  Re(complex(re=1, im=NaN))
 [1] 1
  Im(complex(re=1, im=NaN))
 [1] NaN
  Re(complex(re=NaN, im=1))
 [1] NaN
  Im(complex(re=NaN, im=1))
 [1] 1
but I don't have a good mental model that explains
why the following produce NA instead of NaN.


Just a guess here:


as.complex(sqrt(as.complex(-1)))

[1] 0+1i

as.complex(sqrt(-1))

[1] NA
Warning message:
In sqrt(-1) : NaNs produced

It protects from assuming that the latter truly is not a number.
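
The same contrast as a script, with the warning suppressed so it runs quietly. The variable names are illustrative.

```r
# Complex input: the square root of -1 is well defined
z <- sqrt(as.complex(-1))

# Real input: sqrt(-1) is NaN (with a warning), and that NaN is all that
# as.complex() gets to see
r  <- suppressWarnings(sqrt(-1))
zc <- as.complex(r)

z
is.nan(r)
is.na(zc)
```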

Chuck



  as.complex(NaN)
 [1] NA
  Im(complex(modulus=NaN, argument=NaN))
 [1] NA
  Re(complex(modulus=NaN, argument=NaN))
 [1] NA
  Re(1i * NaN)
 [1] NA
  Im(1i * NaN)
 [1] NA
  Re(NaN + 1i)
 [1] NA
  Im(NaN + 1i)
 [1] NA

It may be that if as.complex(NaN), and its C equivalent,
were changed to return complex(re=NaN,im=NaN) then the
arithmetic examples would return NaN.  Is there a
better way for me to model how NaN's in complex numbers
should work or is this a bug?

While I was looking into this I noticed a bug in str():
  str(NA_complex_)
 Error in FUN(X[[1L]], ...) : subscript out of bounds

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



Re: [Rd] New version weighted mean differs from the old one (PR#14142)

2009-12-14 Thread Charles C. Berry



This was PR#14032. Fixed in R 2.10.1.


On Mon, 14 Dec 2009, huh...@dreamwiz.com wrote:


Full_Name: Myung-Hoe Huh
Version: 2.10
OS: Windows
Submission from: (NULL) (116.120.84.194)


New Version (2.10.0) weighted mean produces unreasonable result: see below.

wt <- c(5,  5,  4,  1)/15
x <- c(3.7,3.3,3.5,2.8)
x[4] <- NA
(xm <- weighted.mean(x,wt,na.rm=T))

Outcome is


 (xm <- weighted.mean(x,wt,na.rm=T))

[1] 3.27

The number is obtained by treating x[4] <- 0

I think the old version (2.8.0)'s weighted mean is more reasonable. The old output
was


 (xm <- weighted.mean(x,wt,na.rm=T))

[1] 3.5

The number is obtained ignoring the x[4], which is NA.
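
The two numbers in the report can be reproduced by hand. This is only an arithmetic sketch of the two conventions, not the weighted.mean source; the names renormalised and as_zero are made up.

```r
wt <- c(5, 5, 4, 1) / 15
x  <- c(3.7, 3.3, 3.5, NA)
ok <- !is.na(x)

# Weights renormalised over the observed values only (the 2.8.0 answer)
renormalised <- sum(x[ok] * wt[ok]) / sum(wt[ok])

# Missing value dropped from the numerator but its weight kept in the
# denominator, which is the same as treating x[4] as 0 (the 2.10.0 answer)
as_zero <- sum(x[ok] * wt[ok]) / sum(wt)

renormalised
as_zero
```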



Re: [Rd] split() is slow on data.frame (PR#14123)

2009-12-09 Thread Charles C. Berry

On Wed, 9 Dec 2009, William Dunlap wrote:


Here are some differences between the current and proposed
split.data.frame.


Adding 'drop=FALSE' fixes this case. See in line correction below.

Chuck




d<-data.frame(Matrix=I(matrix(1:10, ncol=2)),
              Named=c(one=1,two=2,three=3,four=4,five=5),
              row.names=as.character(1001:1005))

group<-c("A","B","A","A","B")
split.data.frame(d,group)

$A
     Matrix.1 Matrix.2 Named
1001        1        6     1
1003        3        8     3
1004        4        9     4

$B
     Matrix.1 Matrix.2 Named
1002        2        7     2
1005        5       10     5


mysplit.data.frame(d,group) # lost row.names and 2nd column of Matrix

[1] "processing data.frame"
$A
     Matrix Named
[1,]      1     1
[2,]      3     3
[3,]      4     4

$B
     Matrix Named
[1,]      2     2
[2,]      5     5


Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


-Original Message-
From: r-devel-boun...@r-project.org
[mailto:r-devel-boun...@r-project.org] On Behalf Of
pengyu...@gmail.com
Sent: Wednesday, December 09, 2009 2:10 PM
To: r-de...@stat.math.ethz.ch
Cc: r-b...@r-project.org
Subject: [Rd] split() is slow on data.frame (PR#14123)

Please see the following code for the runtime comparison between
split() and mysplit.data.frame() (they do the same thing
semantically). mysplit.data.frame() is a fix of split() in terms of
performance. Could somebody include this fix (with possible checking
for corner cases) in future version of R and let me know the inclusion
of the fix?

m=30
n=6
k=3

set.seed(0)
x=replicate(n,rnorm(m))
f=sample(1:k, size=m, replace=T)

mysplit.data.frame<-function(x,f) {
  print('processing data.frame')
  v=lapply(
  1:dim(x)[[2]]
  , function(i) {
split(x[,i],f)


Change to:

 split(x[,i,drop=FALSE],f)



  }
  )

  w=lapply(
  seq(along=v[[1]])
  , function(i) {
result=do.call(
cbind
, lapply(v,
function(vj) {
  vj[[i]]
}
)
)
colnames(result)=colnames(x)
return(result)
  }
  )
  names(w)=names(v[[1]])
  return(w)
}

system.time(split(as.data.frame(x),f))
system.time(mysplit.data.frame(as.data.frame(x),f))
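To see why the 'drop=FALSE' correction matters, here is a small sketch on a toy data frame (d, f, as_vector and as_frame are made-up names, not part of the posted code). Without drop=FALSE a single column collapses to a bare vector, so the pieces lose their row names and any matrix structure.

```r
d <- data.frame(a = 1:3, b = c(10, 20, 30),
                row.names = c("r1", "r2", "r3"))
f <- c("A", "B", "A")

as_vector <- split(d[, 1], f)                # pieces are plain integer vectors
as_frame  <- split(d[, 1, drop = FALSE], f)  # pieces are one-column data frames

str(as_vector$A)
str(as_frame$A)
rownames(as_frame$A)
```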




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R Usage Statistics

2009-11-19 Thread Charles C. Berry

On Thu, 19 Nov 2009, Kevin R. Coombes wrote:


Hi,

I got the following comment from the reviewer of a paper (describing an 
algorithm implemented in R) that I submitted to BMC Bioinformatics:


Finally, while useful for exploratory work and some prototyping, neither R 
nor S-Plus are appropriate environments for deploying user applications that 
would receive much use.


The reviewer needs to get out more...


Intel Capital has placed the number of R users at 1 million, and 
Revolution kicks the estimate all the way up to 2 million.


from the New York Times Business Innovation Technology Blog

http://bits.blogs.nytimes.com/2009/01/08/r-you-ready-for-r/

which follows up on this article


http://www.nytimes.com/2009/01/07/technology/business-computing/07program.html

in their printed paper. Did the reviewer notice where that article said:

Companies as diverse as Google, Pfizer, Merck, Bank of America, the 
InterContinental Hotels Group and Shell use it.


???

<sarcasm>
Yeah, Google and those other companies just don't have much ken 
for muscular computing. ;-)

</sarcasm>

<more sarcasm>
Maybe you should retool in Visual Basic.
</more sarcasm>

HTH,

Chuck




I can certainly respond by pointing out that CRAN contains more than 2000 
packages and Bioconductor contains more than 350. However, does anyone have 
statistics on how often R (and possibly some R packages) are downloaded, or 
on how many people actually use R?


Thanks,
   Kevin





Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] 'is.integer' (PR#13671)

2009-04-22 Thread Charles C. Berry

On Wed, 22 Apr 2009, hzambran.newsgro...@gmail.com wrote:


Full_Name: Mauricio
Version: 2.9.0 (2009-04-17)
OS:  i486-pc-linux-gnu
Submission from: (NULL) (193.205.203.3)


This is a very simple function that seems not to be working, according to the
definition given by '?is.integer'.

I checked in the Bug Tracking page at http://bugs.R-project.org/, but I didn't
find any related message.

The possible problem is:



is.integer(1)

[1] FALSE

and 1 is obviously an integer value.


Obvious?

You must know something I do not, as the protagonist of Zen and the Art of 
Motorcycle Maintenance would have pointed out.


And if you do then it is not really obvious.


is.double(1)

[1] TRUE

1L _is_ an integer.


is.integer(1L)

[1] TRUE




So this is not a bug. Please see the FAQ for advice on not posting 
non-bugs.
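is.integer() reports the storage type, not whether the value happens to be a whole number. If the latter is what is wanted, a tolerance check along the following lines is one option. This is a sketch; the name is_whole and the default tolerance are my own choices.

```r
# TRUE when x is numerically a whole number, whatever its storage type
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

typeof(1)        # storage type of the literal 1
typeof(1L)       # storage type of the literal 1L
is_whole(1)
is_whole(1.5)
```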


Chuck




I would really appreciate if you could clarify if this is really a bug or not.

Thanks in advance,

Mauricio


version

               _
platform       i486-pc-linux-gnu
arch           i486
os             linux-gnu
system         i486, linux-gnu
status
major          2
minor          9.0
year           2009
month          04
day            17
svn rev        48333
language       R
version.string R version 2.9.0 (2009-04-17)




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] further notes on model.frame issue

2009-01-19 Thread Charles C. Berry

On Mon, 19 Jan 2009, Terry Therneau wrote:


This is a follow-up on my note of Saturday.  Let me start with two important
clarifications
   - I think this would be a nice addition, but I've had exactly one use for it
in the 15+ years of developing the survival package.
   - I have a work around for the current case.
Prioritize accordingly.

The ideal would be to change survexp as follows:
   fit <- survexp( ~ gender, data=mydata, ratetable=survexp.us,
ratevar=list(sex=gender, year=enroll.dt, age=age*365.25))

The model statement says that I want separate curves by gender, and is similar
to other model statements.

The ratevar option gives the mapping between my variable names and the dimnames
of the survexp.us rate table.  It wants age in days, enrollment date to be some
sort of date object, and sex to be a factor.  Then the heading of the R code
would be
m <- match.call()
m <- m[c(1, match(names(m), c('data','formula','na.action', 'subset',
  'weights', 'ratevar'), nomatch=0))]
   m[[1]] <- as.name('model.frame')
   m <- eval.parent(m)
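For readers who have not met the idiom, the heading described above follows the usual pattern of trimming the matched call down to the arguments model.frame() understands and then re-evaluating it in the caller's frame. Below is a minimal self-contained sketch of that pattern only. It leaves out the list-valued ratevar argument that is the subject of this thread, and mf_demo is a made-up name.

```r
mf_demo <- function(formula, data, subset, weights, na.action, ...) {
  mc <- match.call(expand.dots = FALSE)
  # positions, within the call, of the arguments model.frame() should see
  keep <- match(c("formula", "data", "subset", "weights", "na.action"),
                names(mc), nomatch = 0L)
  mc <- mc[c(1L, keep)]
  mc[[1L]] <- quote(stats::model.frame)
  eval(mc, parent.frame())
}

mf <- mf_demo(mpg ~ wt, data = mtcars, subset = cyl == 4)
dim(mf)
```

Note that the argument names are looked up in names(mc) here, so the resulting indices are positions in the call.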



Maybe I am missing something. Why not do something like this?

ratetable <- cbind ## for illustration purposes

foo <- function(formula,ratevars,data,...){
  m <- match.call()
  mfun <- m$ratevars
  mfun[[1]] <- as.name('ratetable')
  m$ratevars <- NULL
  frm <- eval(m$formula)
  frm <- update(frm,as.formula( paste("~.+",deparse(substitute(mfun)))))
  m$formula <- frm
  m[[1]] <- as.name('model.frame')
  m <- eval(m,parent.frame())
  m ## to show that model frame obeys
}

dat <- data.frame(diag(4))
foo(~X1+X2,list(sex=X3,year=X4),dat)

HTH,

Chuck


That is, the variables enroll.dt and age are searched for in the data= arg.
This is like the start option in glm, but a more complex result than a vector.

 The model.frame function can't handle this.  (Splus fails too, same spot, less
useful error message).

-

 The current code uses
fit <- survexp(~ gender + ratetable(sex=gender, year=enroll.dt,
age=age*365.25),
  data=mydata, ratetable=survexp.us)

The ratetable function creates a matrix with extra attributes. The matrix
contains as.numeric of the factors with the levels remembered as an extra
attribute, and also looks out for dates.  So the result is like ns() in the eyes
of model.frame, and it works.  But having to write gender twice on the rhs is
confusing to users.

   Thanks in advance for any comments.

Terry Therneau




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] issue with [[-Call

2008-12-25 Thread Charles C. Berry

On Thu, 25 Dec 2008, Terry Therneau wrote:


 The following code works in Splus but not in R

coxph <- function(formula, data, weights, subset, na.action,
init, control, method= c("efron", "breslow", "exact"),
singular.ok =TRUE, robust=FALSE,
model=FALSE, x=FALSE, y=TRUE, ...) {

   method <- match.arg(method)
   Call <- match.call()

   # create a call to model.frame() that contains the formula (required)
   #  and any other of the relevant optional arguments
   # then evaluate it in the proper frame
   temp <- call('model.frame', formula=formula)
   for (i in c("data", "weights", "subset", "na.action")) #add optional args
   if (!is.null(Call[[i]])) temp[[i]] <- Call[[i]]

   if (is.R()) m <- eval(temp, parent.frame())
   else        m <- eval(temp, sys.parent())

-

The problem is that the names ('data', 'weights', etc) do not propagate over
to temp, the new call object.  It looks like an oversight in the replacement
method.



   if (!is.null(Call[[i]])) temp[ i ] <- Call[[i]]

should do it in R at least.
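Another way to build the same call, which avoids replacement methods on call objects altogether, is to work on the argument list and convert back with as.call(). This is a sketch under my own naming (build_mf_call is hypothetical), not a drop-in piece of coxph.

```r
build_mf_call <- function(Call, formula) {
  args <- as.list(Call)[-1]   # arguments of the matched call, names kept
  keep <- intersect(c("data", "weights", "subset", "na.action"), names(args))
  # a list of the function name, the formula and the optional args, as a call
  as.call(c(list(as.name("model.frame"), formula = formula), args[keep]))
}

cl  <- quote(coxph(y ~ x, data = d, weights = w))
out <- build_mf_call(cl, quote(y ~ x))
print(out)
names(out)
```

Because the optional arguments travel as a named list, their names arrive in the new call without relying on how `[[<-` treats call objects.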

HTH,

Chuck




 Priority: low.  I have other code forms that work in both dialects.

 I created this when teaching an internal class on S programming; the goal
was to make what was happening as transparent as possible.  If anyone has
something that they think is even better from a teaching point of view I'd
be delighted to see it.

Terry T.




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cbe...@tajo.ucsd.edu   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] unlist change the ordered type

2008-10-24 Thread Charles C. Berry

On Fri, 24 Oct 2008, Christophe Genolini wrote:


Hi the list,

unlist respects all the atomic types except ordered (it changes ordered 
into factor):


### integer
class(unlist(list(1:5,1:3)))
#[1] "integer"

### numeric
class(unlist(list(1.2,3.5)))
#[1] "numeric"

### character
class(unlist(list("e","e")))
#[1] "character"

### factor
class(unlist(list(factor("e"),factor("e"))))
#[1] "factor"

### ordered
class(unlist(list(ordered("e"),ordered("e"))))
#[1] "factor"


Consider

unlist(list(ordered(1:2),ordered(letters[1:4])))

Since one cannot deduce what ordering should apply, the best that can be 
done is to demote all arguments to factors.


This is the general case. Only in the special case in which all list 
elements are of class 'ordered' and the levels attributes are the same 
would this be sensible.
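For that special case one can rebuild the ordering explicitly. A sketch, with unlist_ordered being a made-up helper rather than anything in base R: it keeps the 'ordered' class only when every element is ordered with identical levels, and otherwise falls back to plain unlist().

```r
unlist_ordered <- function(lst) {
  levs <- levels(lst[[1]])
  same <- all(vapply(lst,
                     function(f) is.ordered(f) && identical(levels(f), levs),
                     logical(1)))
  if (!same) return(unlist(lst))   # general case: no common ordering to keep
  factor(unlist(lapply(lst, as.character)), levels = levs, ordered = TRUE)
}

levs <- c("lo", "mid", "hi")
a <- ordered(c("lo", "hi"), levels = levs)
b <- ordered(c("mid", "lo"), levels = levs)
res <- unlist_ordered(list(a, b))
res

mixed <- unlist_ordered(list(ordered(1:2), ordered(letters[1:4])))
```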


HTH,

Chuck



Christophe





Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Suggestion: 20% speed up of which() with two-character mod

2008-07-11 Thread Charles C. Berry

On Thu, 10 Jul 2008, Henrik Bengtsson wrote:


Hi,

by replacing 'll' with 'wh' in the source code for base::which() one
gets ~20% speed up for *named logical vectors*.



The amount of speedup depends on how sparse the TRUE values are.

When the proportion of TRUEs gets small the speedup is more than twofold 
on my macbook. For high proportions of TRUE, the speedup is more like the 
20% you cite.
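A sketch of how one might check the dependence on sparsity. The two functions are stripped-down versions of the variants quoted below (no arr.ind handling); which_ll, which_wh and the chosen proportions are my own, and the timings will of course differ by machine, so none are shown.

```r
which_ll <- function(x) {
  wh <- seq_along(x)[ll <- x & !is.na(x)]
  names(wh) <- names(x)[ll]   # logical index into names(x)
  wh
}
which_wh <- function(x) {
  wh <- seq_along(x)[x & !is.na(x)]
  names(wh) <- names(x)[wh]   # integer index: only the TRUE positions
  wh
}

set.seed(1)
n <- 1e5
for (p in c(0.01, 0.5)) {
  x <- runif(n) < p           # roughly a proportion p of TRUEs
  names(x) <- seq_along(x)
  stopifnot(identical(which_ll(x), which_wh(x)))
  print(c(p  = p,
          ll = system.time(for (i in 1:20) which_ll(x))[["elapsed"]],
          wh = system.time(for (i in 1:20) which_wh(x))[["elapsed"]]))
}
```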


HTH,

Chuck



CURRENT CODE:

which <- function(x, arr.ind = FALSE)
{
   if(!is.logical(x))
       stop("argument to 'which' is not logical")
   wh <- seq_along(x)[ll <- x & !is.na(x)]
   m <- length(wh)
   dl <- dim(x)
   if (is.null(dl) || !arr.ind) {
       names(wh) <- names(x)[ll]
   }
   ...
   wh;
}

SUGGESTED CODE: (Remove 'll' and use 'wh')

which2 <- function(x, arr.ind = FALSE)
{
   if(!is.logical(x))
       stop("argument to 'which' is not logical")
   wh <- seq_along(x)[x & !is.na(x)]
   m <- length(wh)
   dl <- dim(x)
   if (is.null(dl) || !arr.ind) {
       names(wh) <- names(x)[wh]
   }
   ...
   wh;
}

That's all.

BENCHMARKING:

# To measure both in same environment
which1 <- base::which;
environment(which1) <- globalenv();  # Needed?

N <- 1e6;
set.seed(0xbeef);
x <- sample(c(TRUE, FALSE), size=N, replace=TRUE);
names(x) <- seq_along(x);
B <- 10;
t1 <- system.time({ for (bb in 1:B) idxs1 <- which1(x); });
t2 <- system.time({ for (bb in 1:B) idxs2 <- which2(x); });
stopifnot(identical(idxs1, idxs2));
print(t1/t2);
# Fair benchmarking
t2 <- system.time({ for (bb in 1:B) idxs2 <- which2(x); });
t1 <- system.time({ for (bb in 1:B) idxs1 <- which1(x); });
print(t1/t2);
##     user   system  elapsed
## 1.283186 1.052632 1.250000

You get similar results if you put for loop outside the system.time()
call (and sum up the timings).

Cheers

Henrik




Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Great tool

2008-01-21 Thread Charles C. Berry
On Sun, 20 Jan 2008, Gabor Grothendieck wrote:

 I agree.  It's incredibly useful.

OK gentlemen, you have piqued my curiosity.

Can you give an example or two of situations you encountered in which a 
codetools function was so helpful?

Chuck



 On Jan 20, 2008 11:02 PM, Henrik Bengtsson [EMAIL PROTECTED] wrote:
 Hi,

 I just have to drop a note to say that the 'codetools' (and the part of R
 CMD check that uses it) is a pleasure to use and saves me from hours of
 troubleshooting.  Each time it finds something I am amazed how
 accurate it is.  Thanks to Luke T. and everyone else involved in
 creating it.

 Cheers,

 Henrik



Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] setdiff for data frames

2007-12-10 Thread Charles C. Berry
On Mon, 10 Dec 2007, G. Jay Kerns wrote:

 Hello,

 I have been interested in setdiff() for data frames that operates
 row-wise.  I looked in the documentation, mailing lists, etc., and
 didn't find exactly the right thing.  Given data frames A, B with the
 same columns, the goal is to extract the rows that are in A, but not
 in B.  Of course, one can usually do setdiff(rownames(A), rownames(B))
 but that is cheating.  :-)

 I played around a little bit and came up with

 setdiff.data.frame = function(A, B){
 g <- function( y, B){
 any( apply(B, 1, FUN = function(x)
 identical(all.equal(x, y), TRUE) ) ) }
 unique( A[ !apply(A, 1, FUN = function(t) g(t, B) ), ] )
 }

 I am sure that somebody can do this a better/faster way... any ideas?

setdiff.data.frame <-
function(A,B) A[ !duplicated( rbind(B,A) )[ -seq_len(nrow(B))] , ]

This ignores rownames(A) which may not be what is wanted in every case.
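A worked run on the example from the P.S. below, as a sketch. I have renamed the function setdiff_df here (my choice) so that it does not register as an S3 method while experimenting.

```r
setdiff_df <- function(A, B)
  A[!duplicated(rbind(B, A))[-seq_len(nrow(B))], ]

A <- expand.grid(1:3, 1:3)
B <- A[2:5, ]
res <- setdiff_df(A, B)
res   # the rows of A that do not occur in B
```

Two caveats that follow from how it works: rows duplicated within A are also reduced to one copy (as setdiff() does for vectors), and a B with zero rows needs separate handling because -seq_len(0) selects nothing.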

HTH,

Chuck

 Any chance we could get a data.frame method for setdiff in future R
 versions? (The notion of set is somewhat ambiguous with respect to
 rows, columns, and entries in the data frame case.)


 Jay


 P.S. You can see what I'm looking for with

 A <- expand.grid( 1:3, 1:3 )
 B <- A[ 2:5, ]
 setdiff.data.frame(A,B)





 ***
 G. Jay Kerns, Ph.D.
 Assistant Professor / Statistics Coordinator
 Department of Mathematics & Statistics
 Youngstown State University
 Youngstown, OH 44555-0002 USA
 Office: 1035 Cushwa Hall
 Phone: (330) 941-3310 Office (voice mail)
 -3302 Department
 -3170 FAX
 E-mail: [EMAIL PROTECTED]
 http://www.cc.ysu.edu/~gjkerns/



Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] pt inaccurate when x is close to 0 (PR#9945)

2007-10-10 Thread Charles C. Berry
 graphics  grDevices utils datasets  methods
 base

 version
 platform       i386-pc-mingw32
 arch           i386
 os             mingw32
 system         i386, mingw32
 status
 major          2
 minor          5.1
 year           2007
 month          06
 day            27
 svn rev        42083
 language       R
 version.string R version 2.5.1 (2007-06-27)
 ---

 Is there a reason for this loss of accuracy, or am I missing something here?
 Thanks.



Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] installing packages (PR#9907)

2007-09-12 Thread Charles C. Berry
On Wed, 12 Sep 2007, [EMAIL PROTECTED] wrote:

 Full_Name: Alexander Jerneck
 Version: 2.4.0 (2006-10-03)
 OS: Gentoo (2.6.17-custom kernel)
 Submission from: (NULL) (130.91.92.78)


 I had trouble installing R packages, either from inside R or from the
 commandline, with R complaining about not finding /usr/bin/pwd
 I have pwd in /bin/ so I created a symlink from /usr/bin/pwd to /bin/pwd and 
 now
 I can install packages. I do not know where the original problem is however.

This is NOT A BUG.

Likely, you do not have /bin on your path. Check

Sys.getenv("PATH")
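A slightly more direct check from inside R, as a sketch (path_dirs is a made-up name): split the search path into directories, test for /bin, and ask where pwd is actually found.

```r
path_dirs <- strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]

"/bin" %in% path_dirs   # is /bin on the search path?
Sys.which("pwd")        # full path of pwd as seen by R, "" if not found
```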


install.packages() works fine on all my Gentoo boxes with R installed as 
per the instructions in

$R_HOME/doc/manual/R-admin.html




Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [OT] How many useRs?

2007-08-27 Thread Charles C. Berry


Ahem!

RSiteSearch("user base")


On Mon, 27 Aug 2007, Andy Bunn wrote:

 I figured the devel list would have people on it who might know the
 answer to this



 Is there a reliable (for some definition of reliable) estimate of how
 many people use R or have downloaded it? Say an order of magnitude
 estimate? I would like to mention this in the introduction to a paper
 I'm writing where I encourage R's use.



 Thanks for any help.



 -Andy









Charles C. Berry(858) 534-2098
 Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] C vs. C++ as learning and development tool for R

2007-01-19 Thread Charles C. Berry
On Sat, 20 Jan 2007, Roger Bivand wrote:

 On Fri, 19 Jan 2007, Kimpel, Mark William wrote:

 Thanks to all for your excellent suggestions. I think will I proceed

[snip]


Commenting on writing R packages with portable C/C++ code:


 [F]ollowing the guides to the letter gets you there like
 marked stones across a marsh. Leaving the path usually gets you at best
 neck deep in the mire, alternatively just bubbles.

Fortune!

[snip]


Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0901

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Wish list

2007-01-01 Thread Charles C. Berry
On Mon, 1 Jan 2007, Duncan Murdoch wrote:

 A few comments thrown in, and some general comments at the bottom.

 On 1/1/2007 1:28 AM, Gabor Grothendieck wrote:
 This is my 2007 New Year wishlist for R features:

 1. [deleted thru 12]

 13. Make upper/lower case of simplify/SIMPLIFY consistent on all
 apply commands and add a simplify= arg to by.

 It would have been good not to introduce the inconsistency years ago,
 but it's too late to change now.


Really? The consistency issue only concerns mapply, I think.

How 'bout changing the formals of mapply to

$FUN


$...


$MoreArgs
NULL

$SIMPLIFY
simplify

$USE.NAMES
[1] TRUE

$simplify
[1] TRUE

i.e. add simplify = TRUE and change SIMPLIFY's default to 'simplify'

Then the default behavior is retained, specifying a value for 
either SIMPLIFY or simplify gives the desired behavior and SIMPLIFY takes 
precedence over simplify if both are given values. Not pretty, perhaps, 
but it does the job.

I suppose this could get one into trouble if one of the ... args is named 
'simplify', but I do not imagine that is a big deal.
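To make the proposal concrete, here is a wrapper with the formals described above. It is a sketch of the idea only (mapply2 is a made-up name), not a patch to mapply().

```r
mapply2 <- function(FUN, ..., MoreArgs = NULL, SIMPLIFY = simplify,
                    USE.NAMES = TRUE, simplify = TRUE) {
  # SIMPLIFY defaults to whatever 'simplify' is, so either spelling works
  # and SIMPLIFY wins when both are supplied
  mapply(FUN, ..., MoreArgs = MoreArgs, SIMPLIFY = SIMPLIFY,
         USE.NAMES = USE.NAMES)
}

add <- function(a, b) a + b
r_default <- mapply2(add, 1:3, 4:6)
r_lower   <- mapply2(add, 1:3, 4:6, simplify = FALSE)
r_both    <- mapply2(add, 1:3, 4:6, SIMPLIFY = FALSE, simplify = TRUE)
```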


 14. [rest deleted]


Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Fwd: [R] axis and times() problem

2006-12-28 Thread Charles C. Berry
On Thu, 28 Dec 2006, Gabor Grothendieck wrote:

 The axes do not intersect with this command.  Is it a bug?

   plot(c(.51, .6), bty = "n", xaxs = "i", yaxs = "i")

 If I remove the bty = "n" then they do intersect.

box() is making it look like the axes are different.

Axis()/axis() is behaving the same way in both cases.

 par(mfrow=c(1,2))
 plot(c(.51, .6), bty = "n", xaxs = "i", yaxs = "i")
 box(lty=2)
 plot(c(.51, .6), xaxs = "i", yaxs = "i")
 axis(4,col=2)

[...]


Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as.missing

2006-10-24 Thread Charles C. Berry
On Tue, 24 Oct 2006, Duncan Murdoch wrote:

 On 10/24/2006 12:58 PM, Paul Gilbert wrote:
 (I'm not sure if this is a request for a feature, or another instance
 where a feature has eluded me for many years.)

 Often I have a function which calls other functions, and may often use
 the default arguments to those functions, but needs the capability to
 pass along non-default choices. I usually do this with some variation on

 foo <- function(x, foo2Args=NULL or a list(foo2defaults),
 foo3Args=NULL or a list(foo3defaults))

 and then have logic to check for NULL, or use the list in combination
 with do.call.  It is also possible to do this with ..., but it always
 seems a bit dangerous passing all the unnamed arguments along to all the
 functions being called, especially when I always seem to be calling
 functions that have similar arguments (maxit, eps, start, frequency, etc).

 It is a situation I have learned to live with, but one of my
 co-maintainers just pointed out to me that there should be a good way to
 do this in R.  Perhaps there is something else I have missed all these
 years?  Is there a way to do this cleanly? It would be nice to have
 something like

 foo <- function(x, foo2Args=as.missing(),  foo3Args=as.missing())

 then the call to foo2 and foo3 could specify  foo2Args and foo3Args, but
 these would get treated as if they were missing, unless they are given
 other values.

 I was going to say I couldn't see the difference between this and just
 declaring

  foo <- function(x, foo2Args, foo3Args)

 with no defaults.  However, this little demo illustrates the point, I think:

 > g <- function(gnodef, gdef=1) {
 +    if (missing(gnodef)) cat('gnodef is missing\n')
 +    if (missing(gdef)) cat('gdef is missing\n')
 +    cat('gdef is ',gdef,'\n')
 +  }
 >
 > f <- function(fnodef, fdef) {
 +    g(fnodef, fdef)
 +  }
 >
 > g()
 gnodef is missing
 gdef is missing
 gdef is  1
 > f()
 gnodef is missing
 gdef is missing
 Error in cat("gdef is ", gdef, "\n") : argument "fdef" is missing, with
 no default


 What would be nice to be able to do is to have a simple way for f() to
 act just like g() does.


Is this what you want?

> f <- function(fnodef, fdef=NULL) {
+ g()}
> f()
gnodef is missing
gdef is missing
gdef is  1
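If f() also has to pass values along when they are supplied, one plain (if clumsy) way is to branch on missing(). A sketch, in which g returns a list rather than printing so the result can be inspected:

```r
g <- function(gnodef, gdef = 1) {
  list(gnodef_missing = missing(gnodef), gdef = gdef)
}

f <- function(fnodef, fdef) {
  # leave gdef out of the call when fdef was not supplied, so g's default applies
  if (missing(fdef)) g(fnodef) else g(fnodef, fdef)
}

str(f())           # g's default for gdef is used
str(f(fdef = 5))   # the supplied value is passed along
```

With several optional arguments the branching gets unwieldy, and building an argument list for do.call() is probably the tidier route.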



 Duncan Murdoch



Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Extreme slowdown with named vectors. A bug?

2006-10-06 Thread Charles C. Berry

Another example:

> avec <- 1:55000
> names(avec) <- as.character(avec)
> system.time(avec[names(avec)[1:39045]])
[1] 0.06 0.00 0.07   NA   NA
> system.time(avec[names(avec)[1:39046]])
[1] 23.89  0.00 23.94    NA    NA
> version
               _
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          4.0
year           2006
month          10
day            03
svn rev        39566
language       R
version.string R version 2.4.0 (2006-10-03)


FWIW, this example shows similar behavior on R-2.2.0 Linux.
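A possible way to sidestep character subscripting is to look the names up with match() and subset by position. The sketch below only shows that the two forms give the same result; I have not verified that it avoids the slowdown in the affected versions, so treat that part as an assumption.

```r
avec <- 1:55000
names(avec) <- as.character(avec)
nms <- names(avec)[1:40000]

by_name  <- avec[nms]                        # character subscripting
by_match <- avec[match(nms, names(avec))]    # integer subscripting via match()

identical(by_name, by_match)
```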



On Fri, 6 Oct 2006, Henrik Bengtsson wrote:

 Tried the following with R --vanilla on the Rv2.4.0 release (see
 details at the end).  I think the script and its comments speaks for
 itself, but the outcome is certainly not wanted.

[snip]


Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Undocumented features of 'browser' (and possible changes)

2006-03-25 Thread Charles C. Berry
On Fri, 24 Mar 2006, Kevin Wright wrote:

 I often use browser() when debugging a function.  After entering
 browser, I would find it very useful to be able to cut-and-paste a
 chunk of R code to the browser (or use ess-eval-region in Emacs).  An
 inconvenience, however, is that both blank lines and comment lines
 will exit the browser.

Kevin,

This trick may help:

Browse[1]> { ### blank lines will be ignored
+
+
+
+ x+1
+ }
[1] 2
Browse[1]>

Maybe you want to write 'ess-eval-region-in-braces'.

[rest deleted]

Chuck

Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] unexpected '[<-.data.frame' result

2005-10-26 Thread Charles C. Berry

Is this a bug?

If not, I am curious to know why '[<-.data.frame' was designed to yield 
a.frame$y != a.frame$z rather than refusing to carry out the operation at 
all.

 a.frame <- data.frame( x=letters[1:5] )
 a.frame[ 2:5, "y" ] <- letters[2:5]
 a.frame[[ "z" ]][ 2:5 ] <- letters[2:5]
 a.frame
  x    y    z
1 a    b <NA>
2 b    c    b
3 c    d    c
4 d    e    d
5 e <NA>    e

Chuck


Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] unexpected '[<-.data.frame' result

2005-10-26 Thread Charles C. Berry
On Wed, 26 Oct 2005, Peter Dalgaard wrote:

 Charles C. Berry [EMAIL PROTECTED] writes:

 Is this a bug?

 If not, I am curious to know why '[<-.data.frame' was designed to yield
 a.frame$y != a.frame$z rather than refusing to carry out the operation at
 all.

 a.frame <- data.frame( x=letters[1:5] )
 a.frame[ 2:5, "y" ] <- letters[2:5]
 a.frame[[ "z" ]][ 2:5 ] <- letters[2:5]
 a.frame
   x    y    z
 1 a    b <NA>
 2 b    c    b
 3 c    d    c
 4 d    e    d
 5 e <NA>    e

 It sure looks like a bug, and we're not even prototype-compatible:


[stuff deleted]


 Why would you expect the operation to be refused?


I was having trouble deciding if the use of "whole" in the 
Extract.data.frame help page was a warning against creating columns with 
only some entries present:

The replacement methods can be used to add whole column(s)...
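Read that way, the safe course is to add the whole column first and fill part of it afterwards. A minimal sketch (stringsAsFactors = FALSE and the NA_character_ initialisation are my choices for the illustration):

```r
a.frame <- data.frame(x = letters[1:5], stringsAsFactors = FALSE)

# create complete columns, then replace a subset of their entries
a.frame$y <- NA_character_
a.frame[2:5, "y"] <- letters[2:5]

a.frame[["z"]] <- NA_character_
a.frame[["z"]][2:5] <- letters[2:5]

a.frame
identical(a.frame$y, a.frame$z)
```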

[rest deleted]

Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] calling fortran from C

2005-10-20 Thread Charles C. Berry


Lapack.c is loaded with examples.

Try

$ cd <R source dir>
$ grep F77_CALL ./src/modules/lapack/Lapack.c

Did you see

5.6 Calling C from FORTRAN and vice versa

in 'Writing R Extensions' ??



On Thu, 20 Oct 2005, James Bullard wrote:


 Hello, I had a question about calling some of R's fortran routines from C.
 Specifically, I would like to call: dqrfit from some C code which will be
 bundled as an R package. I was hoping someone knew of an example in some
 of R's code which does something like this (any fortran call from R's C
 would probably be a sufficient guide). So far I can only find locations
 where R calls the Fortran directly, but this is not an option for me.

 Also, I am trying to gauge the overhead of making this call, does anyone
 have any knowledge of whether there might be some non-trivial constant
 time penalty on making such a call.

 Thanks in advance, Jim



Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] md5sum for R-2.2.0-win32.exe ??

2005-10-20 Thread Charles C. Berry

I get

c1279b77fcccf40379f59a83523a440e *R-2.2.0-win32.exe

but I see

e8bdf765fe8013129045314c8e2605fd *rw2011.exe

on several USA  mirrors.

I hope the latter is merely in need of a replacement and not an 
indication of a problem with the web sites.
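For anyone wanting to repeat the check from within R, tools::md5sum() computes the digest of a local file. A sketch follows; md5_matches is a made-up helper, and the demonstration uses an empty temporary file (whose MD5 digest is well known) rather than the installer itself.

```r
# TRUE when the MD5 digest of the file at 'path' equals 'expected'
md5_matches <- function(path, expected) {
  identical(unname(tools::md5sum(path)), tolower(expected))
}

tmp <- tempfile()
file.create(tmp)   # empty file
md5_matches(tmp, "d41d8cd98f00b204e9800998ecf8427e")
```

For the installer one would pass the path of the downloaded R-2.2.0-win32.exe and the published digest instead.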

Chuck

Charles C. Berry(858) 534-2098
  Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://biostat.ucsd.edu/~cberry/ La Jolla, San Diego 92093-0717

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel