[R] ggplot2: scale_y_log10() with geom_histogram

2011-03-29 Thread Markus Loecher
Dear ggplot2 users,
is there an easy/elegant way to suppress zero count bars in histograms with
logarithmic y axis ?
One (made up) example would be

qplot(exp(rnorm(1000))) + geom_histogram(colour = "cornsilk", fill = "darkblue") +
  scale_x_sqrt() + scale_y_log10()



Thanks!

Markus
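
One way to approach this outside ggplot2, sketched in base graphics under the assumption that simply not drawing empty bins is acceptable: compute the counts with hist(plot = FALSE), keep the bins with a positive count, and draw them as rectangles on a log axis. The names bins, nonzero and baseline, and the 0.5 baseline value, are illustrative choices and not ggplot2 behaviour.

```r
set.seed(42)
x <- exp(rnorm(1000))

# Bin the data without plotting, then tabulate bin edges and counts
h <- hist(x, breaks = 30, plot = FALSE)
bins <- data.frame(left  = head(h$breaks, -1),
                   right = tail(h$breaks, -1),
                   count = h$counts)

# Zero counts have no finite position on a log axis, so drop them
nonzero <- bins[bins$count > 0, ]

# Bars start at an arbitrary positive baseline below a count of 1
baseline <- 0.5
plot(range(h$breaks), c(baseline, max(nonzero$count)), type = "n",
     log = "y", xlab = "x", ylab = "count")
rect(nonzero$left, baseline, nonzero$right, nonzero$count,
     col = "darkblue", border = "cornsilk")
```

The filtered data frame could equally be handed to another plotting system; only the filtering step matters here.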

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] barplot, different color for shading lines and bar

2011-02-19 Thread Markus Loecher
Dear all,
might there be a modified barplot function out there which allows the user
to specify a fill color for the bars and independent parameters for the
overlaid shading lines ?
Currently, when I specify density and col, the fill color for the bars is
white.

Thanks!

Markus
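
A possible workaround that needs no modified function, assuming it is acceptable to draw the bars twice: first with a solid fill, then again with add = TRUE to overlay the shading lines in a different colour. The data and colours below are made up.

```r
heights <- c(a = 3, b = 5, c = 2)

# First pass: solid fill colour for the bars
mids <- barplot(heights, col = "lightblue", border = "black")

# Second pass: shading lines only, drawn on top of the existing bars
barplot(heights, density = 15, angle = 45, col = "darkred",
        border = "black", add = TRUE, axes = FALSE, axisnames = FALSE)
```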



[R] RWeka, java.lang.NullPointerException

2010-10-28 Thread Markus Loecher
Dear all,
I have trained a J48 classifier in RWeka but when I try to predict on new
data I get the following exceptions:

fit <- J48(...)
yNew <- predict(fit, x, type = "probability");

Error in .jcall("RWekaInterfaces", "[D", "distributionForInstances",
.jcast(classifier,  :
  java.lang.NullPointerException

What could be the cause of this ?
I am using R version 2.10.0 on a Linux server.

Thanks,
Markus



[R] package gbm, predict.gbm with offset

2010-09-21 Thread Markus Loecher
Dear all,
the help file for predict.gbm states that "The predictions from gbm do not
include the offset term. The user may add the value of the offset to the
predicted value if desired." I am just not sure how exactly, especially for
a Poisson model, where I believe the offset is multiplicative ?

For example:

library(MASS)

fit1 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
            data = Insurance, family = poisson)
head(predict(fit1, data = Insurance, type = "response"))

#glm.predict includes the offset:
head(predict(fit1, newdata = Insurance, type = "response"))
#        1         2         3         4         5         6
# 31.86358  35.27587  28.18080 158.87829  53.97772  84.16012


library(gbm)

fit2 <- gbm(Claims ~ District + Group + Age + offset(log(Holders)),
            data = Insurance, distribution = "poisson", n.trees = 600)
head(predict(fit2, newdata = Insurance, type = "response", n.trees = 600))

#[1] 0.1378249 0.1378249 0.1314991 0.1284441 0.1389563 0.1389563
#Warning message:
#In predict.gbm(fit2, newdata = Insurance, type = "response", n.trees = 600) :
#  predict.gbm does not add the offset to the predicted values.

Would the answer be simple multiplication such as:
head(predict(fit2, newdata = Insurance, type = "response",
             n.trees = 600) * Insurance[, "Holders"])
[1]  27.15151  36.38577  32.34878 215.78607  39.46359  74.48058

Any help would be immensely useful.

Thx,
Markus
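
The multiplicative reading can at least be checked on the glm side. A minimal sketch with simulated data (the variables exposure, x and y are made up): with a log link, adding log(exposure) to the linear predictor is the same as multiplying the predicted rate by exposure. Whether gbm's response-scale prediction is exactly that rate is an assumption this sketch does not verify.

```r
set.seed(1)
n <- 200
exposure <- runif(n, 10, 100)   # plays the role of Holders
x <- rnorm(n)
y <- rpois(n, lambda = exposure * exp(-2 + 0.5 * x))

fit <- glm(y ~ x + offset(log(exposure)), family = poisson)

# Linear predictor WITHOUT the offset, rebuilt from the coefficients
eta_no_offset <- drop(model.matrix(~ x) %*% coef(fit))

# exp(eta + log(exposure)) == exp(eta) * exposure
mu_manual <- exp(eta_no_offset) * exposure

# Fitted means from glm, which do include the offset
mu_glm <- predict(fit, type = "response")

all.equal(unname(mu_manual), unname(mu_glm))
```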



[R] package gbm C++ code as separate module

2010-09-10 Thread Markus Loecher
Dear all,
I would like to separate the gbm C++ code from any R dependencies, so that
it could be compiled into a standalone module.
I am wondering if anyone has already done this and could provide me with
some pointers/help ?

Thanks!

Markus



[R] modifying axis labels in lattice panels

2010-09-10 Thread Markus Loecher
Dear all,
I am struggling to modify the axis labels/ticks in a panel provided to
xyplot.
To begin with, I do not know the equivalent of the xaxt="n" directive for
panels that would set the stage for no default x axis being drawn.
My goal is to draw ticks and custom formatted labels at certain hours of the
week.
When I execute the code below, I get an error message in the plot window
that suggests a problem with some args parameter.

A second problem concerns the shaded rectangles I try to draw. Clearly, the
range returned by par('usr') does not correspond to the true y ranges.

Any help would be greatly appreciated,

Thanks,

Markus

PS I am using R version 2.10.0 on MACOS and the lattice package version
0.18-3 (latest)

library(lattice);

#multivariate time series, one row for each hour of the week:
Xwide = cbind.data.frame(time=as.POSIXct("2010-09-06 00:00:00 EDT") +
(0:167)*3600, Comp.1= sin(2*pi*7*(0:167)/168), Comp.2 =
cos(2*pi*7*(0:167)/168));
#to pass this to xyplot, first need to reshape:
Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time", direction="long",
timevar = "PC");
#get descriptive shingle labels
Xlong[,"PC"] <- factor(Xlong[,"PC"], labels = paste("PC",1:2));

xyplot(Comp ~ time | PC ,data = Xlong, pane1l = WeekhourPanel, scales =
list(x=list(at = Hours24-4*3600,
labels=as.character(format(Hours24-4*3600,"%H")))));

WeekhourPanel <- function(x,y,...){
  r <- range(x);
  #print(r)
  Hours8 <- seq(r[1], r[2], by=8*3600);
  Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)
  #axis.POSIXct(1, at= Hours8, format="%H");
  panel.xyplot(x,y, type="l", ...);
  panel.grid(0,3);
  panel.abline(v= Hours24-4*3600, lty=2, col = rgb(0,0,1,0.5));
  panel.abline(v=Hours24+6*3600, lty=2, col = rgb(0,1,0,0.5));
  bb <- par('usr')
  y0 <- bb[3];
  for (i in seq(r[1], r[2], by=48*3600)) panel.rect(xleft=i, ybottom=y0,
    xright=i+24*3600-1, ytop=bb[4], col = rgb(0.75,0.75,0.75,0.3), border = NA);
  panel.axis(1, at= Hours24-4*3600,
    labels=as.character(format(Hours24-4*3600,"%H")));
  #panel.axis(1, at= Hours24+6*3600, labels=format(x,"%H"));
  #panel.axis(3, at= Hours24, labels=format(x,"%a"), line = -1, tick = FALSE);

}
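
A simplified sketch of the two sticking points, assuming a plain numeric hour-of-week axis instead of POSIXct times: tick positions and labels go through the scales argument (scales = list(x = list(draw = FALSE)) would suppress the axis altogether), and the panel's true limits come from lattice's current.panel.limits() rather than par('usr'), which belongs to base graphics. The names long, shadedPanel and ticks are illustrative.

```r
library(lattice)

hours <- 0:167
long <- data.frame(
  hour = rep(hours, 2),
  PC   = factor(rep(c("PC 1", "PC 2"), each = length(hours))),
  Comp = c(sin(2 * pi * 7 * hours / 168), cos(2 * pi * 7 * hours / 168))
)

shadedPanel <- function(x, y, ...) {
  # Panel limits in native units; par("usr") does not apply to grid panels
  lim <- current.panel.limits()
  # Shade every other day
  for (left in seq(0, 167, by = 48)) {
    panel.rect(xleft = left, ybottom = lim$ylim[1],
               xright = left + 24, ytop = lim$ylim[2],
               col = rgb(0.75, 0.75, 0.75, 0.3), border = NA)
  }
  panel.xyplot(x, y, type = "l", ...)
}

# Ticks at 8am of each day, labelled with the hour of day
ticks <- seq(8, 167, by = 24)
p <- xyplot(Comp ~ hour | PC, data = long, panel = shadedPanel,
            scales = list(x = list(at = ticks,
                                   labels = sprintf("%02d", ticks %% 24))))
print(p)
```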



Re: [R] modifying axis labels in lattice panels

2010-09-10 Thread Markus Loecher
Thanks a lot for this incredibly helpful and thorough reply.
I had actually meant to cut out the scales part before sending the email,
very sorry about the confusion, so I was actually executing just

xyplot(Comp ~ time | PC ,data = Xlong, pane1l = WeekhourPanel)

The scales part was a later attempt to control the axis directly which I
eventually abandoned. (partly because I actually wanted the HOUR variables
to be local to the panel function)
and yes, in this simplified version I asked for labels only at 8am which
formats to 08. My intention was to add more hours and weekly labels once I
figure out this simple axis first.

I had hoped to define a panel function that draws only one PC at a time
since I envision that grouping variable to have many levels (two were just
an example).

Might you know how to disable the axis drawing in panel.xyplot ?

Thanks !

Markus


On Fri, Sep 10, 2010 at 12:45 PM, Dennis Murphy djmu...@gmail.com wrote:

 Hi:

 On Fri, Sep 10, 2010 at 7:16 AM, Markus Loecher markus.loec...@gmail.com wrote:

 Dear all,
 I am struggling to modify the axis labels/ticks in a panel provided to
 xyplot.
 To begin with, I do not know the equivalent of the xaxt="n" directive for
 panels that would set the stage for no default x axis being drawn.
 My goal is to draw ticks and custom formatted labels at certain hours of
 the
 week.
 When I execute the code below, I get an error message in the plot window
 that suggests a problem with some args parameter.

 A second problem concerns the shaded rectangles I try to draw. Clearly,
 the
 range returned by par('usr') does not correspond to the true y ranges.

 Any help would be greatly appreciated,

 Thanks,

 Markus

 PS I am using R version 2.10.0 on MACOS and the lattice package version
 0.18-3 (latest)

 
 library(lattice);

 #multivariate time series, one row for each hour of the week:
 Xwide = cbind.data.frame(time=as.POSIXct("2010-09-06 00:00:00 EDT") +
 (0:167)*3600, Comp.1= sin(2*pi*7*(0:167)/168), Comp.2 =
 cos(2*pi*7*(0:167)/168));
 #to pass this to xyplot, first need to reshape:
 Xlong <- reshape(Xwide, varying = c(2:3), idvar = "time",
 direction="long",
 timevar = "PC");
 #get descriptive shingle labels
 Xlong[,"PC"] <- factor(Xlong[,"PC"], labels = paste("PC",1:2));


 A less mentally taxing approach :)

 library(reshape)
 xlong <- melt(Xwide, id = 'time')
 names(xlong)[2:3] <- c('PC', 'Comp')


 xyplot(Comp ~ time | PC ,data = Xlong, pane1l = WeekhourPanel, scales =
 list(x=list(at = Hours24-4*3600,
 labels=as.character(format(Hours24-4*3600,"%H")))));


 When attempting to run this, I got
 Error in xyplot.formula(Comp ~ time | PC, data = Xlong, pane1l =
 WeekhourPanel,  :
   object 'Hours24' not found

 Attempting to pull Hours24 out of the function didn't work...

 > Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)
 Error in seq(r[1] + 12 * 3600, r[2], by = 24 * 3600) :
   object 'r' not found

 One problem is that to use Hours24 in scales, it has to be defined in the
 calling environment of xyplot() - in other words, it has to be defined
 outside the panel function and outside of xyplot() if your present code is
 to have any hope of working.

 I think I got that part figured out: in the console, type
 r <- range(Xwide$time)

 Hours24 <- seq(r[1]+12*3600, r[2], by=24*3600)

 I at least get a plot now by running your xyplot() function with the panel
 function, but all the labels are 08 on the x-axis. Here's why:


 > format(Hours24-4*3600,"%H")
 [1] "08" "08" "08" "08" "08" "08" "08"

 This comes from the labels = part of your panel function. I got the same
 plot with this code (apart from adding the lines):
 xyplot(Comp ~ time | PC ,data = Xlong, type = 'l',
 scales =list(x = list(at = Hours24-4*3600,
 labels=as.character(format(Hours24-4*3600,"%H")))))

 which indicates that something in your panel function is awry.

 I'd suggest starting out simply. Put both plots on the same panel  using PC
 as a grouping variable in the long data frame. It will automatically use
 different colors for groups, but you can control the line color with  the
 col.lines = argument; e.g., col.lines = c('red', 'blue'). Next, I'd work on
 getting the axis ticks and labels the way you want with scales. It also
 appears that you want to set a custom grid - my suggestion would be to do
 that last, after you've controlled the axis ticks and labels. Once you have
 that figured out, you have the kernel of your panel function. In most
 applications I've seen in lattice, the idea is to keep the panel function as
 simple as possible and pass the 'global' stuff to the function call. There's
 something broken in your panel function, but it's a run-time error rather
 than a compile-time error, so you can either debug it or try simplifying the
 problem (and the panel function) as much as possible.

 HTH,
 Dennis

 WeekhourPanel <- function(x,y,...){
  r <- range(x);
  #print(r)
  Hours8 <- seq(r[1], r[2], by=8*3600);
  Hours24 <- seq

[R] no predict function in lme4 ?

2010-03-23 Thread Markus Loecher
Dear mixed effects modelers,
I seem unable to find a predict method for mer objects in the package lme4.
Am I not seeing the forest for the trees ?

Any pointer would be very helpful.
Thanks,
Markus



[R] expression(), mixed symbols and evaluated objects

2010-03-10 Thread Markus Loecher
Is it possible to mix symbols and evaluated objects inside the expression()
function ?
The following example shows what I am trying to achieve:

for (m in 1:3) {
plot(1:10); #just a place holder for the real plots
title(expression(y = m * lambda));
}

I want to actually evaluate the variable m but keep lambda as a symbol in
the title.
I tried to wrap an eval() around various subparts of the expression but to
no avail.

Going further, I ideally would like to mix text into the expression as well.

Any help would be appreciated.

Thanks,

Markus
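
One common idiom for this is bquote(), which builds the expression and substitutes the value of anything wrapped in .( ). A minimal sketch (make_title and labelled are illustrative names); note that plotmath uses == for an equals sign, and ~ to join a text string to the formula with a space.

```r
# Substitute the current value of m, keep lambda as a symbol
make_title <- function(m) bquote(y == .(m) * lambda)

for (m in 1:3) {
  plot(1:10)            # just a place holder for the real plots
  title(make_title(m))
}

# Mixing literal text into the same expression
labelled <- bquote("Slope:" ~ y == .(2) * lambda)
plot(1:10)
title(labelled)
```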



Re: [R] MASS package not on CRAN ?

2010-03-09 Thread Markus Loecher
In fact, I must have a broken installation of R then (though I have not
noticed any other problems so far).
The library MASS is neither pre-installed nor can I explicitly install it
(though the internet connection is up and functional).

Thanks for all the help !

Markus

2010/3/9 Uwe Ligges lig...@statistik.tu-dortmund.de



 On 09.03.2010 15:39, Ista Zahn wrote:

 MASS is a recommended package, so is probably already installed on
 your machine. Try



 And if installation fails, it is either your internet connection that does
 not download the file in its original form or you have a broken installation
 of R (which would also be indicated if MASS is not already installed given
 you installed a released version of R).

 Best,
 Uwe Ligges





  library(MASS)

 -Ista

 On Tue, Mar 9, 2010 at 9:32 AM, Markus Loecher markus.loec...@gmail.com wrote:

 The MASS package is listed on the CRAN web site (
 http://cran.r-project.org/web/packages/MASS/index.html) but I am unable
 to
 install it via install.packages(). The error is that the package is
 unavailable. When I manually download the source tar ball and try to
 install it on a Linux machine, installation fails because it is not a
 valid
 package.

 Do I need to search different repositories ?
 Thanks,
 Markus









[R] R package pdf files

2010-03-09 Thread Markus Loecher
Dear all,
the examples in the pdf files that are automatically built from the examples
in package help files are poorly formatted; they frequently do not wrap to
the next line and are cut off. While there is an easy work around by looking
at the examples in the corresponding help files, I do wonder if there is a
way to ensure proper line wrapping when creating a package.

Thanks,

Markus



[R] MASS package not on CRAN ?

2010-03-09 Thread Markus Loecher
The MASS package is listed on the CRAN web site (
http://cran.r-project.org/web/packages/MASS/index.html) but I am unable to
install it via install.packages(). The error is that the package is
unavailable. When I manually download the source tar ball and try to
install it on a Linux machine, installation fails because it is not a valid
package.

Do I need to search different repositories ?
Thanks,
Markus



[R] grid.image(), pckg grid

2010-01-28 Thread Markus Loecher
While I am very happy with and awed by the grid package and its basic
plotting primitives such as grid.points, grid.lines, etc, I was wondering
whether the equivalent of a grid.image() function exists ?

Any pointer would be helpful.

Thanks !

Markus
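
In R versions whose grid package provides grid.raster(), something close to an image primitive can be sketched as follows. The matrix m is made up; as.raster() here assumes values scaled to [0, 1] and maps them to grey levels.

```r
library(grid)

set.seed(1)
m <- matrix(runif(100), nrow = 10)   # values in [0, 1]

# Convert the numeric matrix to a raster of grey levels
img <- as.raster(m)

grid.newpage()
# interpolate = FALSE keeps crisp cell boundaries, as image() would
grid.raster(img, interpolate = FALSE)
```

Where grid.raster() is not available, drawing one grid.rect() per cell with a fill taken from the matrix would be a slower fallback.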



[R] possible memory leak in predict.gbm(), package gbm ?

2009-10-30 Thread Markus Loecher
Dear gbm users,
When running predict.gbm() on a large dataset (150,000 rows, 300 columns,
500 trees), I notice that the memory used by R grows beyond reasonable
limits. My 14GB of RAM are often not sufficient. I am interpreting this as a
memory leak since there should be no reason to expand memory needs once the
data are loaded and passed to predict.gbm() ?


Running R version 2.9.2 on Linux, gbm package 1.6-3.

Thanks,

Markus



[R] package:snow, timeOut for makeSOCKcluster()

2009-10-01 Thread Markus Loecher
Dear snow users,
is there any way to specify a max time after which makeSOCKcluster() stops
trying to create socket connections and gives up/returns ?
In my current setup (MAC OSX 10.5.8, R version 2.9) I have to force quit R
if the host specified in makeSOCKcluster() either does not exist or does not
respond.
On Linux, I can at least manually interrupt the function via Ctrl-C

Any help would be greatly appreciated,

Thanks,

Markus



[R] as.POSIXct(as.Date()) independent of timezone

2009-09-18 Thread Markus Loecher
Dear R users,
I am struggling a bit with converting dates to full POSIX timestamps; in
particular, I would like to somehow force the timezone to be local, i.e. the
output of
as.POSIXct(as.Date("2008-07-01")) should always be equal to "2008-07-01
00:00:00", is that achievable ? I tried to set the origin and the timezone,
neither of which seems to make a difference.
On my Mac Book Pro (R version 2.9.1) which is set to Eastern US time zone, I
obtain the shifted result:
> as.POSIXct(as.Date("2008-07-01"))
[1] "2008-06-30 20:00:00 EDT"

And e.g.
> as.POSIXct(Sys.Date())
[1] "2009-09-17 20:00:00 EDT"
> Sys.time()
[1] "2009-09-18 10:10:48 EDT"

Any help would make life simpler for me.

Thanks,
Markus
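
A Date is converted as midnight UTC, which is why the printed local time is shifted. One possible workaround, sketched here, is to go through the character representation so that the date is parsed as a wall-clock time in the local zone (tz = "" means the current time zone).

```r
d <- as.Date("2008-07-01")

# Direct conversion: midnight UTC, displayed shifted in other zones
utc_midnight <- as.POSIXct(d)

# Via character: parsed as local midnight
local_midnight <- as.POSIXct(format(d), tz = "")

format(local_midnight, "%Y-%m-%d %H:%M:%S")
```

A specific zone name can be passed as tz instead, if the result should not depend on the machine's settings.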



[R] read/write connections

2009-07-29 Thread Markus Loecher
Dear fellow R users,
I would very much like to see an example of read/write connection (open =
"r+") for e.g. pipe() or any other R connection.
I have a standalone program which accepts input from stdin, performs some
processing and returns the results on stdout. Is it possible at all to open
a connection to that program, write to it (i.e. to stdin of that process)
and read back the results ?

As a silly example, imagine the following use of the Unix function head:

zz <- pipe("head", open = "r+");
cat(rnorm(10), file = zz);

Error in cat(rnorm(10), file = zz) : cannot write to this connection

While I am not surprised that this does not work, I would love to know a
solution to this general problem.

Thanks

Markus
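
This is not a true bidirectional connection, but for a program that reads all of stdin and then writes its result, a one-shot round trip can be sketched with system2(): its input argument supplies the lines for stdin and stdout = TRUE captures the output. The sketch assumes a Unix-like system with head on the PATH.

```r
set.seed(1)
input_lines <- sprintf("%.3f", rnorm(10))

# Feed the lines to the child's stdin and capture what it prints
first3 <- system2("head", args = c("-n", "3"),
                  input = input_lines, stdout = TRUE)

first3
```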



[R] read.table, row.names arg

2009-06-05 Thread Markus Loecher
Dear R users,
I had somehow expected that read.table() would treat the column specified by
the row.names argument as of class character. That seems to be the only
sensible class allowed for a column containing row names. However, that does
not seem to be the case, as the following example shows:

x <- cbind.data.frame(ID = c("010007787048271871", "1007109516820319",
"10094843652996959", "010145176274075487"), X1 = 1:4, X2 = 4:1);
write.table(x, "tmp.txt", quote = FALSE, row.names = FALSE);
y <- read.table("tmp.txt", header = TRUE, row.names = 1)

> y
                  X1 X2
10007787048271872  1  4
1007109516820319   2  3
10094843652996960  3  2
10145176274075488  4  1
> x
                  ID X1 X2
1 010007787048271871  1  4
2   1007109516820319  2  3
3  10094843652996959  3  2
4 010145176274075487  4  1

The first column was not read in as a string, which mangled the IDs.
I could use colClasses explicitly, but then I would need to know the number
and classes of the remaining columns in advance.
Is this a bug or expected behavior ?
Any advice would be most helpful.

Thanks,

Markus
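
One possible way around knowing all column classes in advance, sketched below: colClasses accepts a named vector, so only the ID column needs to be pinned to character while the rest are still guessed. The row names are then assigned explicitly. The temporary file and variable names are illustrative.

```r
ids <- c("010007787048271871", "1007109516820319",
         "10094843652996959", "010145176274075487")
x <- data.frame(ID = ids, X1 = 1:4, X2 = 4:1, stringsAsFactors = FALSE)

tmp <- tempfile(fileext = ".txt")
write.table(x, tmp, quote = FALSE, row.names = FALSE)

# Only the ID column is forced to character; other columns are type-converted
y <- read.table(tmp, header = TRUE, colClasses = c(ID = "character"))

# Move the untouched IDs into the row names
rownames(y) <- y$ID
y$ID <- NULL
y
```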



[R] quick square root axes

2009-05-05 Thread Markus Loecher
Dear R users,
while I enjoy the built-in log argument to the plot() function, I wished it
would be as easy to create more general custom transformed axes such as
sqrt(), logit, etc...

for example, instead of
 plot(x=exp(rnorm(10)), y=(1:10)^4, log = "xy"), sth. along the lines of
 plot(x=exp(rnorm(10)), y=(1:10)^4, trans = list(x = log, y = sqrt))
to encode the desired transformation.

This involves just transforming the xy values and creating nice tick marks
at the appropriate positions.
Before trying to write my own function, I wanted to see if that
functionality already exists in another package ?

Thanks!

Markus
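
A minimal sketch of what such a function could look like in base graphics (plot_transformed and its fx/fy arguments are hypothetical, not an existing API): plot the transformed values without axes, then place pretty ticks of the original scale at their transformed positions.

```r
plot_transformed <- function(x, y, fx = log, fy = sqrt, ...) {
  plot(fx(x), fy(y), axes = FALSE, ...)

  draw_axis <- function(side, values, f) {
    ticks <- pretty(values)            # nice values on the original scale
    at <- suppressWarnings(f(ticks))   # where they sit after transforming
    ok <- is.finite(at)                # drop e.g. log(0) or sqrt(-1)
    axis(side, at = at[ok], labels = ticks[ok])
    ticks[ok]
  }

  xt <- draw_axis(1, x, fx)
  yt <- draw_axis(2, y, fy)
  box()
  invisible(list(x = xt, y = yt))
}

set.seed(1)
ticks <- plot_transformed(exp(rnorm(10)), (1:10)^4,
                          xlab = "x (log scale)", ylab = "y (sqrt scale)")
```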


[R] can install.packages() copy utility files to the public_html directory ?

2009-05-04 Thread Markus Loecher
Dear fellow R-users,
I am about to publish an HTML utility package to CRAN that expands on the
R2HTML package and includes a few goodies such as sorted tables, easy
automation of framed HTML reporting, etc.
However, some of the resulting dynamic HTML pages need to access JavaScript
code that should sit in a specific subdirectory of public_html.
My more general question is hence, (i) how do I include the directory
containing the JavaScript code in my R package and (ii) is it possible to
copy this directory to the user's public_html path during installation ?

Thanks!
Markus
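
For (i), files placed under the package's inst/ directory are installed along with the package and can be located with system.file(). For (ii), a sketch under the assumption that copying at run time through a helper is acceptable instead of copying during installation; copy_assets, the js directory and the file name are all hypothetical.

```r
# Copy every file in src_dir into dest_dir, creating dest_dir if needed
copy_assets <- function(src_dir, dest_dir) {
  dir.create(dest_dir, recursive = TRUE, showWarnings = FALSE)
  files <- list.files(src_dir, full.names = TRUE)
  ok <- file.copy(files, dest_dir, overwrite = TRUE)
  invisible(basename(files[ok]))
}

# Demonstration with temporary directories standing in for the real paths
src <- file.path(tempdir(), "js_src")
dir.create(src, showWarnings = FALSE)
writeLines("// stub script", file.path(src, "sorttable.js"))

dest <- file.path(tempdir(), "public_html", "js")
copied <- copy_assets(src, dest)
copied
```

In a package, the source would be something like system.file("js", package = "myHTMLpkg"), with myHTMLpkg standing in for the real package name.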



[R] source code for prompt()

2009-04-17 Thread Markus Loecher
Dear R community,
pardon my ignorance but how would you get the source code for non-visible
functions ?

For example, I would like to see and modify the source code for the prompt()
function.

Thanks!

Markus
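
A short sketch of the usual tools: prompt() is a generic, so the code of interest sits in its methods, which can be listed with methods() and retrieved with getAnywhere() or the ::: operator. The name my_prompt is just an illustrative copy to edit.

```r
# List the methods behind the generic
methods("prompt")

# Find a method even if it is not exported, and see where it lives
src <- getAnywhere("prompt.default")
src$where

# Fetch it directly from the namespace and keep a copy to modify
f <- utils:::prompt.default
my_prompt <- f
```

Printing f shows the deparsed code; the original, commented source is in the R source distribution.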



[R] package knnFinder, kd-trees

2009-02-19 Thread Markus Loecher
Dear R users,
thanks to Samuel for making the package knnFinder available to the public. I was
wondering if there is an easy way to only build and store the kd tree
in a first step and perform NN queries from then on ?
It seems that nn() does both simultaneously.

Thanks!

Markus



[R] compressing data without writing output to file

2009-02-07 Thread Markus Loecher
This might seem like a strange question but is there any way to compress an
R object (such as a matrix) and know its resulting size in bytes ?
Clearly, I could implement this in the following way (if x is my matrix):
  zz <- gzfile(fname, "w");
  write.table(x, zz);
  close(zz);
  file.info(fname)[, "size"];
However, I need to do this for hundreds of thousands of objects and the
overhead in terms of disk access due to the actual file creation is
prohibitive.

I guess, I would like a modified object.size() function that returns the
size of the compressed (e.g. gzip) version of the object.

Thanks!

Markus
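
In R versions that provide memCompress(), the compression can be done entirely in memory. A minimal sketch (compressed_size is a hypothetical helper): serialize the object to a raw vector, compress it, and take the length. Note that this compresses R's serialized form, not the text that write.table() would produce, so the byte counts will differ from the file-based approach.

```r
# Size in bytes of the gzip-compressed serialization of an object
compressed_size <- function(obj, type = "gzip") {
  length(memCompress(serialize(obj, NULL), type = type))
}

x <- matrix(rep(1:10, 1000), ncol = 10)   # highly repetitive example data

raw_size <- length(serialize(x, NULL))
gz_size  <- compressed_size(x)
c(raw = raw_size, gzip = gz_size)
```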



[R] updating contents of a package

2009-02-07 Thread Markus Loecher
Dear fellow R users,
I read through the Writing R Extensions document and am able to now create
my own packages/libraries which so far are just well documented collections
of my own R functions. I use package.skeleton() and the tools package to
build these packages.
However, it is not clear to me how to modify and update a package after its
initial creation. How do you elegantly update e.g. the old help file when
one added an argument to a function ? How do you keep most of the existing
package structure when implementing incremental changes ?

Any help would be very useful,

Thanks,
Markus



[R] beanplot, Error in shapiro.test(x)

2009-02-06 Thread Markus Loecher
Dear all,
I am trying to create beanplots from a dataset for which boxplot works fine.
(MACOS, R 2.8.1 GUI 1.27 Tiger build 32-bit (5301))
I am getting the following error message:

Error in shapiro.test(x) : sample size must be between 3 and 5000

I am not even sure why the shapiro.test is being used, but is there any
workaround ?

Thanks !

Markus



[R] interrupting R

2009-01-02 Thread Markus Loecher
Dear fellow R users,
is there a generic way to gracefully interrupt an R function without
terminating the entire session ? I am mainly interested in this answer for
Linux and MacOS.
I found neither Esc nor Ctrl-C to work; it seems that R does not check for
signals periodically?

Also, an entirely unrelated question: I have been looking unsuccessfully for
the R sources for the examples given in Simon Wood's book on Generalized
Additive Models. I had hoped they would be part of the mgcv package but they
are not.
Has anyone had any luck with this ?

Thanks,
Markus



Re: [R] interrupting R

2009-01-02 Thread Markus Loecher
Thank you for the quick reply.
It seems that Ctrl-C interrupts pure R functions (i.e. R scripts that do
not call external compiled libraries) but when I run functions that in turn
call external C code (such as gam() in the package mgcv), the Ctrl-C does
not appear to propagate that deeply, if I may use such loose language.
The Stop icon in the R.app on MAC OS is similarly unresponsive when e.g.
gam() is performing some extensive computations.
(I am just using gam() as an example, nothing special about it, I think)

You are right, I should ask the author about the source code, I just did not
want to add more requests to his InBox.

Thanks again,

Markus

On Fri, Jan 2, 2009 at 8:56 AM, Prof Brian Ripley rip...@stats.ox.ac.uk wrote:

 On Fri, 2 Jan 2009, Markus Loecher wrote:

  Dear fellow R users,
 is there a generic way to gracefully interrupt an R function without
 terminating the entire session ? I am mainly interested in this answer for
 Linux and MacOS.
 I found neither Esc nor Ctrl-C to work; it seems that R does not check for
 signals periodically?


 Well, Ctrl-C works for me.  Rather than check for signals, R installs a
 signal handler and gets the OS to do the work.

 On Mac OS it is unclear if you mean R or R.app.  R.app has a Stop sign
 icon, amongst other ways.

  Also, an entirely unrelated question: I have been looking unsuccessfully
 for
 the R sources for the examples given in Simon Wood's book on Generalized
 Additive Models. I had hoped they would be part of the mgcv package but
 they
 are not.
 Has anyone had any luck with this ?


 Why not ask him directly?

  Thanks,
 Markus



 --
 Brian D. Ripley,                  rip...@stats.ox.ac.uk
 Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
 University of Oxford,             Tel:  +44 1865 272861 (self)
 1 South Parks Road,                     +44 1865 272866 (PA)
 Oxford OX1 3TG, UK                Fax:  +44 1865 272595




[R] apply() just loops ?

2008-11-12 Thread Markus Loecher
Dear R users,
I have been diligently using the apply() family in order to avoid explicit
for loops and speed up computation.
However, when I finally inspected the source code for apply, it appears that
the core computation is a simple loop as well.
What am I missing ? Why the often found advice to use apply() instead of
loops and the actually observed empirical  speedups on many tasks ?

Thanks in advance for demystifying,


Markus
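
A small sketch of the comparison behind this question, with made-up data: an explicit loop with a preallocated result, apply(), and a vectorised function implemented in compiled code all give the same answer. apply() itself loops at the R level; larger speedups typically come from functions such as rowSums(), or from avoiding growing a result inside a loop.

```r
set.seed(1)
m <- matrix(rnorm(2000 * 50), nrow = 2000)

# Explicit loop with a preallocated result vector
loop_sums <- numeric(nrow(m))
for (i in seq_len(nrow(m))) loop_sums[i] <- sum(m[i, ])

# apply(): the same loop, hidden inside the function
apply_sums <- apply(m, 1, sum)

# Vectorised alternative
vec_sums <- rowSums(m)

# Timings vary by machine, so no expected values are given here
system.time(for (k in 1:20) apply(m, 1, sum))
system.time(for (k in 1:20) rowSums(m))
```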



[R] findInterval(), binary search, log(N) complexity

2008-09-22 Thread Markus Loecher
Dear R users,
the help for findInterval(x,vec) suggests a logarithmic dependence on N
(=length(vec)), which would imply a binary search type algorithm.
However, when I test this hypothesis, in the following manner:

set.seed(-3645);
l <- vector();
N.seq <- c(5000, 50, 100, 1000, 5000); k <- 1
for (N in N.seq){
  tmp <- sort(round(stats::rt(N, df=2), 2));
  l[k] <- system.time(it3 <- findInterval(-1, tmp))[2]; k <- k + 1;
}
plot(N.seq, l, type="b", xlab="length(vec)", ylab="CPU time");

the resulting plot suggests a linear relationship.
Am I missing something here?

Thanks !

Markus
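
A hedged sketch related to the question above. bsearch_steps() is a
hypothetical helper written for illustration only; it is not how
findInterval() is implemented. It performs a plain binary search and counts
its iterations, so the logarithmic growth of the search itself can be checked
independently of wall-clock timing. One possible explanation for a
linear-looking timing plot, offered as an assumption rather than a confirmed
diagnosis, is that findInterval() first verifies that vec is sorted, which is
a linear pass and would dominate a single lookup.

```r
# Hypothetical helper: binary search that returns the findInterval()-style
# index (largest i with vec[i] <= x, or 0 if there is none) and the number
# of iterations the search needed
bsearch_steps <- function(x, vec) {
  lo <- 0L                  # conceptually vec[0] = -Inf
  hi <- length(vec) + 1L    # conceptually vec[n + 1] = +Inf
  steps <- 0L
  while (hi - lo > 1L) {
    mid <- (lo + hi) %/% 2L
    if (vec[mid] <= x) lo <- mid else hi <- mid
    steps <- steps + 1L
  }
  c(index = lo, steps = steps)
}

set.seed(1)
N.seq <- c(50, 500, 5000, 50000)   # made-up sizes for illustration
res <- t(sapply(N.seq, function(N) {
  vec <- sort(round(rt(N, df = 2), 2))
  c(N = N, bsearch_steps(-1, vec), builtin = findInterval(-1, vec))
}))
print(res)
```

The steps column is bounded by ceiling(log2(N + 1)), so any linear growth in
measured time has to come from work other than the search itself.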



[R] gam negative.binomial

2008-05-16 Thread Markus Loecher
Dear list members,
while I appreciate being able to handle overdispersion in count data by
setting the family argument to quasipoisson() or negative.binomial(), either
choice estimates just one overdispersion parameter for the entire data set.
In my applications I would often like the estimate of overdispersion to
depend on the covariates in the same manner as the mean.

For example,
# either library(mgcv) or library(gam):

x <- seq(0, 1, length = 100) * 2 * pi
mu <- 4 + 2 * sin(x)
size <- 4 + 2 * cos(x)
data <- cbind.data.frame(x = rep(x, 10), y =
rnbinom(10 * 100, mu = rep(mu, 10), size = rep(size, 10)))

x.gam <- gam(y ~ s(x), data = data, family = quasipoisson())
plot(x.gam)
summary(x.gam)

How would I get a smooth estimate of the overdispersion ?

Thanks,

Markus
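
One rough heuristic, sketched here under stated assumptions and not as an
established mgcv feature: fit the mean first, then smooth the squared Pearson
residuals against x. Under a quasi-Poisson mean model those squared residuals
have expectation close to the local dispersion, so a second smooth gives a
covariate-dependent estimate. The two-stage approach, the names dat,
fit_mean, fit_disp and phi_hat, and the choice of a Gamma family with log
link for the second stage are all assumptions made for illustration.

```r
library(mgcv)

set.seed(1)
x    <- seq(0, 1, length = 100) * 2 * pi
mu   <- 4 + 2 * sin(x)
size <- 4 + 2 * cos(x)
dat  <- data.frame(x = rep(x, 10),
                   y = rnbinom(10 * 100, mu = rep(mu, 10), size = rep(size, 10)))

# Stage 1: smooth model for the mean
fit_mean <- gam(y ~ s(x), data = dat, family = quasipoisson())

# Stage 2: smooth the squared Pearson residuals against x;
# the log link keeps the fitted dispersion positive
dat$r2   <- residuals(fit_mean, type = "pearson")^2
fit_disp <- gam(r2 ~ s(x), data = dat, family = Gamma(link = "log"))

phi_hat  <- predict(fit_disp, newdata = data.frame(x = x), type = "response")
phi_true <- 1 + mu / size   # Var(y) / E(y) for the simulated negative binomial
```

Plotting phi_hat and phi_true against x shows how closely the smooth tracks
the simulated dispersion. The second stage ignores the uncertainty of the
first, so the result is better read as exploratory than as a formal estimate.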



[R] mboost partial contribution plots

2008-05-05 Thread Markus Loecher
Having just read the nice review article on boosting in the latest
Statistical Science, I would love to reproduce some of the plots in
that article, but it is not clear to me how to create the partial
contribution plots for the Poisson regression.
Does anyone have example code for this?
(The vignette does not offer it, I think.)


Thanks !
Markus



[R] estimate of overdispersion with glm.nb

2008-04-21 Thread Markus Loecher
Dear R users,
I am trying to fully understand the difference between estimating
overdispersion with glm.nb() from MASS compared to glm(..., family =
quasipoisson).
It seems that (i) the coefficient estimates are different and also (ii) the
summary() method for glm.nb suggests that overdispersion is taken to be one:
Dispersion parameter for Negative Binomial(0.9695) family taken to be 1,
which is not what I expected.
The code I used is pasted below:

  x <- rep(seq(0,23,by=1),50);
  s <- rep(seq(1,2,length=50*24),1);

  tmp <-
cbind.data.frame(y=rnbinom(length(x),mu=8*(sin(2*pi*x/24)+2),size =
1),x=x,s=s);

  tmp.glm.qp <- glm(y~factor(x)-1,data = tmp, family=quasipoisson,
offset=log(s));
  tmp.glm.nb <- glm.nb(y~factor(x)-1 +offset(log(s)),data = tmp);

On a more advanced topic, I was furthermore hoping to compare a model with a
global estimate of overdispersion against one that allows overdispersion to be
estimated separately for each level of the factor x. Can I achieve that in
glm, or do I need to employ a mixed effects model?

Thanks!

Markus
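
A minimal sketch of how the two notions of overdispersion can be set side by
side, using the same simulated design as above. The two fits parameterise the
variance differently: quasi-Poisson assumes Var = phi * mu, whereas the
negative binomial assumes Var = mu + mu^2 / theta. In glm.nb() the extra
variation is carried by theta (the number shown in the family name), so the
GLM dispersion parameter is held at 1 given theta. The per-level summary at
the end is a crude illustration, not a formal model comparison, and the names
dat, fit_qp, fit_nb and phi_by_level are assumptions.

```r
library(MASS)

set.seed(42)
x   <- rep(0:23, 50)
s   <- seq(1, 2, length = 50 * 24)
dat <- data.frame(y = rnbinom(length(x), mu = 8 * (sin(2 * pi * x / 24) + 2), size = 1),
                  x = x, s = s)

fit_qp <- glm(y ~ factor(x) - 1 + offset(log(s)), data = dat, family = quasipoisson)
fit_nb <- glm.nb(y ~ factor(x) - 1 + offset(log(s)), data = dat)

phi_qp   <- summary(fit_qp)$dispersion  # quasi-Poisson: Var = phi * mu
theta_nb <- fit_nb$theta                # negative binomial: Var = mu + mu^2 / theta

# Crude per-level dispersion: mean squared Pearson residual within each level of x
r2 <- residuals(fit_qp, type = "pearson")^2
phi_by_level <- tapply(r2, dat$x, mean)
```

Because the two models weight observations differently whenever the fitted
means vary within a level (here through the offset), their coefficient
estimates need not coincide.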



[R] pnbinom.c qnorm.c

2008-04-17 Thread Markus Loecher
Dear R users,
I was wondering where I could get the C source code that computes
pnbinom() and qnorm().
(I would use R in batch mode, but I find the startup time prohibitive, unless
there is a way to speed it up.)
I searched the Web and the code clearly is part of the R distribution; I just
don't know how to extract it.

Thanking you !

Markus Loecher
Princeton, NJ



[R] Confidence intervals for PCA scores/eigenvalues

2008-01-31 Thread Markus Loecher
Dear all,
I have read various descriptions of employing resampling techniques, such as
the bootstrap, to estimate the uncertainties of the eigenvectors computed by
PCA. When I try



[R] Too many open files

2007-12-27 Thread Markus Loecher
Dear all,
Did the problem posted in 2006 (see below) ever get fully resolved?
I am encountering the exact same issue: I have executed get.hist.quote() in a
loop, and now R not only refuses to establish any further connections to Yahoo
but, worse, will not open any files either. For example, I cannot even save my
current workspace for that reason.
I tried
  closeAllConnections()
as well as
  showConnections()
but I do not see any open connections.
Also, does get.hist.quote() not close its connection internally once it is
done?

Any help would be immensely useful as I am quite stuck.
Thanks!

Markus



Re: [R] Too many open files

From: Seth Falcon sfalcon_at_fhcrc.org
Date: Sat 27 May 2006 - 09:21:36 EST

Omar Lakkis [EMAIL PROTECTED] writes:

 This may be more of an OS question ...
 I have this call

 r = get.hist.quote(symbol, start= format(start, "%Y-%m-%d"), end=
 format(end, "%Y-%m-%d"))
 which does a url request

 in a loop and my program runs out of file handlers after few hundred
 rotations. The error message is: 'Too many open files'. Other than
 increasing the file handlers assigned to my process, is there a way to
 cleanly release and reuse these connections?

Inside your loop you need to close the connection object created by url().

for (i in 1:500) {
  con <- url(urls[i])
  ## ... stuff here ...
  close(con)
}

R only allows you to have a fixed number of open connections at one time,
and they do not get closed automatically when they go out of scope. These
commands may help make clear how things work...

 > showConnections()
      description class mode text isopen can read can write
 > f = url("http://www.r-project.org", open="r")
 > showConnections()

From: Gabor Grothendieck ggrothendieck_at_gmail.com
Date: Sat 27 May 2006 - 09:47:20 EST

Try closeAllConnections()
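
A minimal sketch of the close-inside-the-loop advice, written so that the
connection is released even if the work in between fails. Local temporary
files stand in for the URLs so that the example runs without network access.
read_first_line() is a hypothetical helper, and nothing here says anything
about what get.hist.quote() does internally.

```r
# Made-up input: five small temporary files standing in for remote resources
paths <- replicate(5, tempfile(fileext = ".txt"))
for (p in paths) writeLines(c("a", "b"), p)

# Hypothetical helper: open, read, and always close the connection
read_first_line <- function(path) {
  con <- file(path, open = "r")
  on.exit(close(con), add = TRUE)   # runs on normal exit and on error
  readLines(con, n = 1)
}

before <- nrow(showConnections())
first_lines <- vapply(paths, read_first_line, character(1))
after <- nrow(showConnections())

unlink(paths)
```

Comparing before and after gives a quick check that the loop left no
user-opened connections behind.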
