[R] Combining two or more KML files in R

2012-02-11 Thread z2.0
Wanted to post an exchange with Roger Bivand on combining KML files. 

I had hoped to combine multiple KML files into a single
SpatialPolygonsDataFrame. I used readOGR to bring files into R; had I worked
with .shp files, I might've used readShapePoly with a unique IDvar and then
combined the objects with spRbind.

Turns out I needed to review spChFIDs-methods in the sp package. Roger said, 
'they let you assign row.names to the geometries and the data slot
data.frame consistently.'

Zack
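
A minimal sketch of the approach Roger describes, assuming the sp package is installed. The two one-polygon objects stand in for layers read from separate KML files; the helper make_spdf and the ID prefixes kml1_ and kml2_ are illustrative names, not part of the original exchange.

```r
library(sp)

# Hypothetical helper: build a one-polygon SpatialPolygonsDataFrame.
# Both objects deliberately get the same feature ID "0", as often happens
# when separate files are read one at a time.
make_spdf <- function(xoff, label) {
  coords <- cbind(c(0, 0, 1, 1, 0) + xoff, c(0, 1, 1, 0, 0))  # clockwise ring
  polys <- SpatialPolygons(list(Polygons(list(Polygon(coords)), ID = "0")))
  SpatialPolygonsDataFrame(polys, data.frame(name = label, row.names = "0"))
}

a <- make_spdf(0, "first")
b <- make_spdf(2, "second")

# spChFIDs() renames the geometry IDs and the data row.names together,
# so the IDs become unique across objects before binding.
a <- spChFIDs(a, "kml1_0")
b <- spChFIDs(b, "kml2_0")

both <- rbind(a, b)
row.names(both)
```

With real files, each object read from disk would take the place of a and b, and the new IDs could be built with paste() from a per-file prefix.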

--
View this message in context: 
http://r.789695.n4.nabble.com/Combining-two-or-more-KML-files-in-R-tp4378333p4378333.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Constraint on one of parameters.

2012-02-11 Thread FU-WEN LIANG
Thanks for your suggestion.
I did read the manual but it seems those examples set boundaries for every
parameter. I have no idea how to set a bound for only one parameter (in my
case, only for theta[21]). I tried adding
method='L-BFGS-B',lower=c(rep(inf,20),-1),upper=c(rep(inf,20),1), but got
this error: object 'inf' not found. I thought for those unconstrained
parameters I can set them as (-inf, inf), but the manual says using
L-BFGS-B method, the estimated parameters should be finite numbers.
My constraint on theta[21] is  -1 < theta[21] < 1
For those unconstrained parameters, how do I free them?

Thank you very much.

On Fri, Feb 10, 2012 at 1:26 AM, Rubén Roa r...@azti.es wrote:

 Read optimx's help.
 There are 'method', 'upper', 'lower' arguments that'll let you put bounds
 on pars.
 HTH
 Rubén

 -----Original Message-----
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of FU-WEN LIANG
 Sent: Thursday, 09 February 2012 23:56
 To: r-help@r-project.org
 Subject: [R] Constraint on one of parameters.

 Dear all,

 I have a function to optimize for a set of parameters and want to set a
 constraint on only one parameter. Here is my function. What I want to do is
 estimate the parameters of a bivariate normal distribution where the
 correlation has to be between -1 and 1. Would you please advise how to
 revise it?

 ex=function(s,prob,theta1,theta,xa,xb,xc,xd,t,delta) {

 expo1= exp(s[3]*xa+s[4]*xb+s[5]*xc+s[6]*xd)
 expo2= exp(s[9]*xa+s[10]*xb+s[11]*xc+s[12]*xd)
 expo3= exp(s[15]*xa+s[16]*xb+s[17]*xc+s[18]*xd)
 expo4= exp(s[21]*xa+s[22]*xb+s[23]*xc+s[24]*xd)
 expo5= exp(s[27]*xa+s[28]*xb+s[29]*xc+s[30]*xd)


 nume1=prob[1]*(s[2]^-s[1]*s[1]*t^(s[1]-1)*expo1)^delta*exp(-s[2]^-s[1]*t^s[1]*expo1)*

 theta1[1]^xa*(1-theta1[1])^(1-xa)*theta1[2]^xb*(1-theta1[2])^(1-xb)*(1+theta1[11]*(xa-theta1[1])*(xb-theta1[2])/sqrt(theta1[1]*(1-theta1[1]))/sqrt(theta1[2]*(1-theta1[2])))/

 (2*pi*theta[2]*theta[4]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[1])^2/theta[2]^2+(xd-theta[3])^2/theta[4]^2-2*theta[21]^2*(xc-theta[1])*(xd-theta[3])/(theta[2]*theta[4]))


 nume2=prob[2]*(s[8]^-s[7]*s[7]*t^(s[7]-1)*expo2)^delta*exp(-s[8]^-s[7]*t^s[7]*expo2)*

 theta1[3]^xa*(1-theta1[3])^(1-xa)*theta1[4]^xb*(1-theta1[4])^(1-xb)*(1+theta1[11]*(xa-theta1[3])*(xb-theta1[4])/sqrt(theta1[3]*(1-theta1[3]))/sqrt(theta1[4]*(1-theta1[4])))/

 (2*pi*theta[6]*theta[8]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[5])^2/theta[6]^2+(xd-theta[7])^2/theta[8]^2-2*theta[21]^2*(xc-theta[5])*(xd-theta[7])/(theta[6]*theta[8]))


 nume3=prob[3]*(s[14]^-s[13]*s[13]*t^(s[13]-1)*expo3)^delta*exp(-s[14]^-s[13]*t^s[13]*expo3)*

 theta1[5]^xa*(1-theta1[5])^(1-xa)*theta1[6]^xb*(1-theta1[6])^(1-xb)*(1+theta1[11]*(xa-theta1[5])*(xb-theta1[6])/sqrt(theta1[5]*(1-theta1[5]))/sqrt(theta1[6]*(1-theta1[6])))/

 (2*pi*theta[10]*theta[12]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[9])^2/theta[10]^2+(xd-theta[11])^2/theta[12]^2-2*theta[21]^2*(xc-theta[9])*(xd-theta[11])/(theta[10]*theta[12]))


 nume4=prob[4]*(s[20]^-s[19]*s[19]*t^(s[19]-1)*expo4)^delta*exp(-s[20]^-s[19]*t^s[19]*expo4)*

 theta1[7]^xa*(1-theta1[7])^(1-xa)*theta1[8]^xb*(1-theta1[8])^(1-xb)*(1+theta1[11]*(xa-theta1[7])*(xb-theta1[8])/sqrt(theta1[7]*(1-theta1[7]))/sqrt(theta1[8]*(1-theta1[8])))/

 (2*pi*theta[14]*theta[16]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[13])^2/theta[14]^2+(xd-theta[15])^2/theta[16]^2-2*theta[21]^2*(xc-theta[13])*(xd-theta[15])/(theta[14]*theta[16]))


 nume5=prob[5]*(s[26]^-s[25]*s[25]*t^(s[25]-1)*expo5)^delta*exp(-s[26]^-s[25]*t^s[25]*expo5)*

 theta1[9]^xa*(1-theta1[9])^(1-xa)*theta1[10]^xb*(1-theta1[10])^(1-xb)*(1+theta1[11]*(xa-theta1[9])*(xb-theta1[10])/sqrt(theta1[9]*(1-theta1[9]))/sqrt(theta1[10]*(1-theta1[10])))/

 (2*pi*theta[18]*theta[20]*sqrt(1-theta[21]^2))*exp(-2*(1-theta[21]^2))^(-1)*((xc-theta[17])^2/theta[18]^2+(xd-theta[19])^2/theta[20]^2-2*theta[21]^2*(xc-theta[17])*(xd-theta[19])/(theta[18]*theta[20]))


 denom=nume1+nume2+nume3+nume4+nume5

 Ep1=nume1/denom
 Ep2=nume2/denom
 Ep3=nume3/denom
 Ep4=nume4/denom
 Ep5=nume5/denom


 elogld=

 sum(Ep1*(-log(2*pi*theta[2]*theta[4]*sqrt(1-theta[21]^2))-(2*(1-theta[21]^2))^(-1)*((xc-theta[1])^2/theta[2]^2+(xd-theta[3])^2/theta[4]^2-2*theta[21]^2*(xc-theta[1])*(xd-theta[3])/(theta[2]*theta[4]
 +

 sum(Ep2*(-log(2*pi*theta[6]*theta[8]*sqrt(1-theta[21]^2))-(2*(1-theta[21]^2))^(-1)*((xc-theta[5])^2/theta[6]^2+(xd-theta[7])^2/theta[8]^2-2*theta[21]^2*(xc-theta[5])*(xd-theta[7])/(theta[6]*theta[8]
 +

 sum(Ep3*(-log(2*pi*theta[10]*theta[12]*sqrt(1-theta[21]^2))-(2*(1-theta[21]^2))^(-1)*((xc-theta[9])^2/theta[10]^2+(xd-theta[11])^2/theta[12]^2-2*theta[21]^2*(xc-theta[9])*(xd-theta[11])/(theta[10]*theta[12]
 +

 

Re: [R] Schwefel Function Optimization

2012-02-11 Thread Hans W Borchers
Vartanian, Ara <aravart at indiana.edu> writes:

> All,
>
> I am looking for an optimization library that does well on something as
> chaotic as the Schwefel function:
>
> schwefel <- function(x) sum(-x * sin(sqrt(abs(x))))
>
> With these guys, not much luck:
>
> > optim(c(1,1), schwefel)$value
> [1] -7.890603
> > optim(c(1,1), schwefel, method="SANN", control=list(maxit=1))$value
> [1] -28.02825
> > optim(c(1,1), schwefel, lower=c(-500,-500), upper=c(500,500),
> method="L-BFGS-B")$value
> [1] -7.890603
> > optim(c(1,1), schwefel, method="BFGS")$value
> [1] -7.890603
> > optim(c(1,1), schwefel, method="CG")$value
> [1] -7.890603

Why is it necessary over and over again to point to the Optimization Task
View? This is a question about a global optimization problem, and the task
view tells you to look at packages like 'NLoptim' with specialized routines,
or use one of the packages with evolutionary algorithms, such as 'DEoptim'
or 'pso'.

library(DEoptim)
schwefel <- function(x) sum(-x * sin(sqrt(abs(x))))
de <- DEoptim(schwefel, lower = c(-500,-500), upper = c(500,500),
              control = list(trace = FALSE))
de$optim$bestmem
#     par1     par2
# 420.9687 420.9687
de$optim$bestval
# [1] -837.9658

> All trapped in local minima. I get the right answer when I pick a starting
> point that's close:
>
> > optim(c(400,400), schwefel, lower=c(-500,-500), upper=c(500,500),
> method="L-BFGS-B")$value
> [1] -837.9658
>
> Of course I can always roll my own:
>
> r <- vector()
> for(i in 1:1000) {
>   x <- runif(2, -500,500)
>   m <- optim(x, schwefel, lower=c(-500,-500), upper=c(500,500),
>              method="L-BFGS-B")
>   r <- rbind(r, c(m$par, m$value))
> }
>
> And this does fine. I'm just wondering if this is the right approach,
> or if there is some other package that wraps this kind of multi-start
> up so that the user doesn't have to think about it.
>
> Best,
>
> Ara
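
The multi-start idea in the question can be wrapped in a small base-R helper. This is only a sketch: multistart_optim and its n_starts argument are hypothetical names, and random restarts give no guarantee of reaching the global minimum; they only make it likely when one of the starting points falls in the right basin.

```r
schwefel <- function(x) sum(-x * sin(sqrt(abs(x))))

# Hypothetical helper: run optim() from n_starts random points drawn
# uniformly inside the box [lower, upper] and keep the best local solution.
multistart_optim <- function(fn, lower, upper, n_starts = 500, ...) {
  best <- NULL
  for (i in seq_len(n_starts)) {
    start <- runif(length(lower), lower, upper)
    fit <- optim(start, fn, method = "L-BFGS-B",
                 lower = lower, upper = upper, ...)
    if (is.null(best) || fit$value < best$value) best <- fit
  }
  best
}

set.seed(42)
best <- multistart_optim(schwefel, lower = c(-500, -500), upper = c(500, 500))
best$par
best$value
```

Unlike the loop in the question, this keeps only the running best fit instead of growing a matrix with rbind() on every iteration.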



Re: [R] Find interval between numbers in list or vector

2012-02-11 Thread alend
Thanks, it is exactly the function that I needed.

--
View this message in context: 
http://r.789695.n4.nabble.com/Find-interval-between-numbers-in-list-or-vector-tp4376115p4378473.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Constraint on one of parameters.

2012-02-11 Thread Petr Savicky
On Fri, Feb 10, 2012 at 11:40:57PM -0600, FU-WEN LIANG wrote:
> Thanks for your suggestion.
> I did read the manual but it seems those examples set boundaries for every
> parameter. I have no idea how to set a bound for only one parameter (in my
> case, only for theta[21]). I tried adding
> method='L-BFGS-B',lower=c(rep(inf,20),-1),upper=c(rep(inf,20),1), but got
> this error: object 'inf' not found. I thought for those unconstrained

Hi.

Names in R are case sensitive. The infinity is Inf, not inf.

> parameters I can set them as (-inf, inf), but the manual says using
> L-BFGS-B method, the estimated parameters should be finite numbers.
> My constraint on theta[21] is  -1 < theta[21] < 1
> For those unconstrained parameters, how do I free them?

If the bound cannot be -Inf and Inf, try -1e308 and 1e308.
This is close to the bounds of the range of numeric values.

Hope this helps.

Petr Savicky.
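
A small sketch of bounding only one parameter, using base R's optim() rather than optimx (whose handling of infinite bounds is not checked here). The quadratic objective and the three-parameter setup are stand-ins for the real likelihood; note that the unbounded entries of lower are -Inf, not Inf.

```r
# Toy objective standing in for the real likelihood: the unconstrained
# minimum is at (2, -3, 5), so the bound on the last parameter is active.
target <- c(2, -3, 5)
obj <- function(p) sum((p - target)^2)

k <- length(target)
lower <- c(rep(-Inf, k - 1), -1)   # only the last parameter is bounded
upper <- c(rep( Inf, k - 1),  1)

fit <- optim(rep(0, k), obj, method = "L-BFGS-B",
             lower = lower, upper = upper)
fit$par
```

An alternative that avoids bounds altogether is to optimise over an unconstrained value r and use tanh(r) as the correlation, which always lies strictly between -1 and 1.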



[R] how to plot a nice legend?

2012-02-11 Thread Jonas Stein
I'd like to plot a legend in my diagram. The diagram will be included
in a TikZ LaTeX document later.

I tried the legend() function, but
- it cannot find a good place by itself where the legend fits,
  and playing around with coordinates and scaling consumes a lot of time

- standard settings for the text need adjustment
  (line spacing is quite large and so on)

Is there an alternative to legend()?

Is it possible to place the legend() outside of the plot area?

Kind regards,

-- 
Jonas Stein n...@jonasstein.de



Re: [R] fixed effects with clustered standard errors

2012-02-11 Thread Francesco
Dear Giovanni,

I recalled the procedure and here is the output :

Error: cannot allocate vector of size 39.2 Mb
In addition: Warning messages:
1: In structure(list(message = as.character(message), call = call),  :
  Reached total allocation of 12279Mb: see help(memory.size)
2: In structure(list(message = as.character(message), call = call),  :
  Reached total allocation of 12279Mb: see help(memory.size)
3: In structure(list(message = as.character(message), call = call),  :
  Reached total allocation of 12279Mb: see help(memory.size)
4: In structure(list(message = as.character(message), call = call),  :
  Reached total allocation of 12279Mb: see help(memory.size)

> traceback()
No traceback available

I had a similar problem with another stat software but in a different
context : when I tried to fit a fixed effect logit model.
The software did not converge because, as you rightly guessed at the
beginning of this thread, the number of points for some individuals is
too high.

This might be the source of error here, probably?
I recall that the median number of points in my database is quite low
at 10, but I have individuals with more than 2000, 5000 or even 50 000
points!

What do you think ?

Many thanks
Best,

On 9 February 2012 10:31, John L caribou...@gmx.fr wrote:
 Dear Giovanni,

 Many thanks for your interesting suggestions.
 Your guess is indeed right, I only use the 'within' fixed effects 
 specification.

 I will soon send to this list all the additional information you
 requested in order to understand what might cause this problem, but I
 would say as a first guess that the inefficiency is (probably?) due to
 individuals with too many datapoints : the median number of points is
 10, but I have some individuals with more than 1000, 5000 or even 80
 000 points overall!
 So basically my dataset is probably too strange, as you suggested,
 compared to the standard panel dataset in social sciences...

 To be continued... ;-)
 Many thanks again

 Best,

 On 8 February 2012 18:55, Millo Giovanni [via R]
 ml-node+s789695n4370302...@n4.nabble.com wrote:
 Dear John,

 interesting. There must be a bottleneck somewhere, which possibly went
 unnoticed because econometricians seldom use so many data points. In
 fact 'plm' wasn't designed to handle only 700 Megs of data at a time;
 but we're happy to investigate in this direction too. E.g., I was aware
 of some efficiency problems if effect="twoways" but I seem to understand
 that you are using effect="individual"? -- which takes me to the main
 point.

 I understand that enclosing the data for a reproducible report, as
 requested by the posting guide, is awkward for such a big dataset. Yet
 it would be of great help if you at least produced:

 - an output of your procedure, in order to see what goes wrong and where
 - the output of traceback() called immediately after you got the error
 (idem)

 and possibly gave it a try with lm() applied to the very same formula
 and data, maybe into a system.time( ... ) statement.

 Else, the information you provide is way too scant to even make an
 educated guess. For example, it isn't clear whether the problem is
 related to plm() or to vcovHC.plm etc.

 As far as simple demeaning is concerned, you might try the following
 code, which really does only that. Be aware that **standard errors are
 biased** etc. etc., this is not meant to be a proper function but just a
 computational test for your data and a quick demonstration of demeaning.
 'plm()' is far more structured, for a number of reasons. Please execute
 it inside system.time() again.

 ## test function for within model, BIASED SEs !! ##
 ##
 ## ## example:
 ## data(Produc, package="plm")
 ## mod <- FEmod(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp,
 ##              index=Produc$state, data=Produc)
 ## summary(mod)
 ## ## compare with:
 ## library(plm)
 ## example(plm)

 demean <- function(x, index, lambda=1, na.rm=FALSE) {
   as.vector(x - lambda*tapply(x, index, mean,
                               na.rm=na.rm)[unclass(as.factor(index))])
 }
 FEmod <- function(formula, index, data=ls()) {

   ## fit a model without intercept in any case
   formula <- as.formula(paste(deparse(formula(formula)), "-1", sep=""))
   X <- model.matrix(formula, data=data)
   y <- model.response(model.frame(formula, data=data))
   ## reduce index accordingly
   names(index) <- row.names(data)
   ind <- index[which(names(index) %in% row.names(X))]

   ## within transf.
   MX <- matrix(NA, ncol=dim(X)[[2]], nrow=dim(X)[[1]])
   for(i in 1:dim(X)[[2]]) {
     MX[,i] <- demean(X[,i], index=ind, lambda=1)
   }
   My <- demean(y, index=ind, lambda=1)

   ## estimate within model
   femod <- lm(My~MX-1)

   return(femod)
 }
 ### end test function


 Best,
 Giovanni
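
The demeaning step discussed above can be checked on simulated data with base R only. This is an illustrative sketch (the variables id, x and y are made up): the slope from regressing the group-demeaned response on the group-demeaned regressor equals the slope from a regression with one dummy per individual. As noted in the message, the standard errors of the demeaned regression are not corrected for the lost degrees of freedom.

```r
set.seed(1)
id <- rep(1:5, each = 10)             # 5 individuals, 10 observations each
x  <- rnorm(50)
y  <- 2 * x + id + rnorm(50)          # individual effect equals id

# Within transformation: subtract each individual's mean
x_dm <- x - ave(x, id)
y_dm <- y - ave(y, id)

b_within <- coef(lm(y_dm ~ x_dm - 1))           # demeaned regression
b_lsdv   <- coef(lm(y ~ x + factor(id)))["x"]   # dummy-variable regression

c(within = unname(b_within), lsdv = unname(b_lsdv))
```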

 ### original message #

 --

 Message: 28
 Date: Tue, 07 Feb 2012 15:35:07 +0100
 From: [hidden email]
 To: [hidden email]
 Subject: [R] fixed effects with clustered standard errors
 Message-ID: [hidden email]
 

Re: [R] how to plot a nice legend?

2012-02-11 Thread Duncan Murdoch

On 12-02-10 3:45 PM, Jonas Stein wrote:

i'd like to plot a legend in my diagram. The diagram will be included
in a TikZ LaTeX document later.

I tried the legend() function, but
- it cannot find a good place by itself where the legend fits,
   and playing around with coordinates and scaling consumes a lot of time

- standard settings for the text need adjustment
   (line spacing is quite large and so on)

Is there an alternative to legend()?

Is it possible to place the legend() outside of the plot area?

Kind regards,



There are various alternatives available; you can also write your own, 
by modifying the standard one.


Generally there are lots of possibilities for customizing within the 
standard one; e.g. y.intersp will affect the line spacing, using a 
negative value for inset (together with xpd=NA) will allow the legend to 
be moved outside the plot.


Duncan Murdoch
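
A sketch of the customisation described above, using base graphics only. The margin width, the inset value and the plotted data are arbitrary choices for illustration; pdf(NULL) is used here only as a throwaway device.

```r
pdf(NULL)                                  # throwaway device; use your own
op <- par(mar = c(5, 4, 4, 9), xpd = NA)   # wider right margin, no clipping
plot(1:10, type = "l", col = "blue")
lines(10:1, col = "red")
usr <- par("usr")                          # plot-region limits in user coordinates

# A negative horizontal inset pushes the legend to the right of the plot
# region; y.intersp < 1 tightens the line spacing.
lg <- legend("topright", inset = c(-0.35, 0),
             legend = c("rising", "falling"),
             col = c("blue", "red"), lty = 1,
             y.intersp = 0.8, bty = "n")
par(op)
invisible(dev.off())
```

The value returned by legend() contains the box position, so lg$rect can be inspected to confirm where the legend ended up relative to usr.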



[R] updating one's own package

2012-02-11 Thread Wet Bell Diver

Win7 x64, R2.14.1

Dear list,

I would like to get into the habit of creating a package for each 
project. That way I can package my project-specific functions and data 
together and load or unload that at will. Also, it will allow easy 
navigation through all the functions I will have available for each 
project, since I can then use ?myfunction and immediately see the 
details of that function (esp. useful when revisiting a project after 
several months).
Anyway, I have been successful in constructing a package, using the 
package.skeleton. My question is how to easily update the package when I 
write an additional function or rewrite an existing function. Do I then 
need to again build the package fully, or is there an incremental 
update option somewhere? This I can not find anywhere, maybe I am 
overlooking something?


Thanks,
Peter



Re: [R] best option for big 3D arrays?

2012-02-11 Thread Duncan Murdoch

On 12-02-10 9:12 AM, Djordje Bajic wrote:

Hi all,

I am trying to fill a 904x904x904 array, but at some point of the loop R
states that the 5.5Gb sized vector is too big to allocate. I have looked at
packages such as bigmemory, but I need help to decide which is the best
way to store such an object. It would be perfect to store it in this cube
form (for indexing and computation purposes). If not possible, maybe the
best is to store the 904 matrices separately and read them individually
when needed?

I have never dealt with such a big dataset, so any help will be appreciated

(R+ESS, Debian 64bit, 4Gb RAM, 4core)


I'd really recommend getting more RAM, so you can have the whole thing 
loaded in memory.  16 Gb would be nice, but even 8Gb should make a 
substantial difference.  It's going to be too big to store as an array 
since arrays have a limit of 2^31-1 entries, but you could store it as a 
list of matrices, e.g.


x <- vector("list", 904)
for (i in 1:904)
  x[[i]] <- matrix(0, 904, 904)

and then refer to entry i,j,k as x[[i]][j,k].

Duncan Murdoch
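
The list-of-matrices layout can be tried at a small scale first. In this sketch n = 4 replaces 904 so that it runs anywhere, and get_entry is a hypothetical convenience wrapper. The size arithmetic is included as a side check: 904^3 doubles come to roughly 5.5 GiB, matching the figure in the question, and that element count is itself below 2^31 - 1, so on a 4 GB machine memory looks like the binding constraint.

```r
# Size of the full cube of doubles (8 bytes each)
n_full   <- 904
n_elem   <- n_full^3              # 738,763,264 elements
size_gib <- n_elem * 8 / 2^30     # about 5.5 GiB

# Small-scale version of the list-of-matrices layout
n <- 4
x <- vector("list", n)
for (i in seq_len(n)) x[[i]] <- matrix(0, n, n)

# Hypothetical accessor for entry (i, j, k)
get_entry <- function(x, i, j, k) x[[i]][j, k]

x[[2]][3, 4] <- 7
get_entry(x, 2, 3, 4)
```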



Re: [R] Lattice 3d coordinate transformation

2012-02-11 Thread Deepayan Sarkar
On Fri, Feb 10, 2012 at 12:43 AM, ilai ke...@math.montana.edu wrote:
 Hello List!
 I asked this before (with no solution), but maybe this time... I'm
 trying to project a surface to the XY under a 3d cloud using lattice.
 I can project contour lines following the code for fig 13.7 in
 Deepayan Sarkar's Lattice, Multivariate Data Visualization with R,
 but it fails when I try to color them in using panel.levelplot.
 ?utilities.3d says there may be some bugs, and I think
 ltransform3dto3d() is not precise (where did I hear that?), but is
 this really the source of my problem? Is there a (simple?) workaround,
 maybe using 3d.wire but projecting it to XY? How? Please, any insight
 may be useful.

I don't think this will be that simple. panel.levelplot() essentially
draws a bunch of colored rectangles. For a 3D projection, each of
these will become (four-sided) polygons. You need to compute the
coordinates of those polygons, figure out their fill colors (possibly
using ?level.colors) and then draw them.

-Deepayan


 Thanks in advance,
 Elai.

 A working example:

  ## data d and predicted surf:
 set.seed(1113)
 d <- data.frame(x=runif(30),y=runif(30),g=gl(2,15))
 d$z <- with(d,rnorm(30,3*asin(x^2)-5*y^as.integer(g),.1))
 d$z <- d$z+min(d$z)^2
 surf <- by(d,d$g,function(D){
   fit <- lm(z~poly(x,2)*poly(y,2),data=D)
   outer(seq(0,1,l=10),seq(0,1,l=10),function(x,y,...)
         predict(fit,data.frame(x=x,y=y)))
 })
 ##
 # This works to get contours:
 require(lattice)
 cloud(z~x+y|g,data=d,layout=c(2,1), type='h', lwd=3, par.box=list(lty=0),
       scales=list(z=list(arrows=FALSE,tck=0)),
       panel.3d.cloud = function(x, y, z, rot.mat, distance,
                                 zlim.scaled, nlevels=20, ...){
         add.line <- trellis.par.get("add.line")
         clines <- contourLines(surf[[packet.number()]], nlevels = nlevels)
         for (ll in clines) {
           m <- ltransform3dto3d(rbind(ll$x-.5, ll$y-.5, zlim.scaled[1]),
                                 rot.mat, distance)
           panel.lines(m[1,], m[2,], col = add.line$col, lty = add.line$lty,
                       lwd = add.line$lwd)
         }
         panel.3dscatter(x, y, z, rot.mat, distance,
                         zlim.scaled = zlim.scaled, ...)
       }
       )
 # But using levelplot:
 panel.3d.levels <- function(x, y, z, rot.mat, distance, zlim.scaled, ...)
 {
    zz <- surf[[packet.number()]]
    n <- nrow(zz)
    s <- seq(-.5,.5,l=n)
    m <- ltransform3dto3d(rbind(rep(s,n),rep(s,each=n),zlim.scaled[1]),
                          rot.mat, distance)
    panel.levelplot(m[1,],m[2,],zz,1:n^2,col.regions=heat.colors(20))
    panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled, ...)
  }
 cloud(z~x+y|g,data=d,layout=c(2,1), type='h',
       panel.3d.cloud = panel.3d.levels,
       scales=list(z=list(arrows=FALSE,tck=0)),par.box=list(lty=0),lwd=3)
 # I also tried to fill between contours but can't figure out what to
 # do with the edges and how to incorporate the x,y limits to 1st and nth
 # levels.
 panel.3d.contour <- function(x, y, z, rot.mat, distance, xlim, ylim,
                              zlim.scaled, nlevels=20, ...)
 {
    add.line <- trellis.par.get("add.line")
    zz <- surf[[packet.number()]]
    clines <- contourLines(zz,nlevels = nlevels)
    colreg <- heat.colors(max(unlist(lapply(clines,function(ll) ll$level))))
    for (i in 2:length(clines)) {
      ll <- clines[[i]]
      ll0 <- clines[[i-1]]
      m <- ltransform3dto3d(rbind(ll$x-.5, ll$y-.5, zlim.scaled[1]),
                            rot.mat, distance)
      m0 <- ltransform3dto3d(rbind(ll0$x-.5, ll0$y-.5, zlim.scaled[1]),
                             rot.mat, distance)
      xvec <- c(m0[1,],m[1,ncol(m):1])
      yvec <- c(m0[2,],m[2,ncol(m):1])
      panel.polygon(xvec,yvec,col=colreg[ll$level],border='transparent')
      panel.lines(m[1,], m[2,], col = add.line$col, lty = add.line$lty,
                  lwd = add.line$lwd)
    }
    panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled, ...)
  }
 cloud(z~x+y|g,data=d,layout=c(2,1), type='h',
       panel.3d.cloud = panel.3d.contour,
       scales=list(z=list(arrows=FALSE,tck=0)),par.box=list(lty=0),lwd=3)

 #



[R] How to solve long tick labels (axis.text.x)

2012-02-11 Thread vd3000
Hi, all, 

I am a newbie to R.

I am currently trying to learn from this example:
http://learnr.wordpress.com/2009/03/17/ggplot2-barplots/

After I made a drawing

c <- b + facet_grid(Region ~ .) + opts(legend.position = "none")

If I want to make the axis.text.x (I don't want to mix it up with the axis
labels, so I say axis.text.x, or simply tick labels) horizontal, I think I
could do it.
However, how could I split the text over two rows, like "1820 -" on the first
row and "30" on the second row?

I am also trying to make another graph; however, the axis.text.x labels do not
follow any pattern...
let's say, "american handsome guys", "italian ladies", "smart japanese", etc...
How could I wrap those tick labels in ggplot?

I have tried to follow the wrapper from
http://stackoverflow.com/questions/5574157/r-ggplot2-can-i-make-the-facet-strip-text-wrap-around/
https://stat.ethz.ch/pipermail/r-help/2005-April/069496.html
But I just failed again and again...X_X.

Hope some genius could help. 

Thanks.

vd
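
One way to approach the wrapping part of the question is to insert line breaks into the label strings before plotting, using base R's strwrap(). This is a sketch only: wrap_label and the width of 10 are illustrative choices, and it is an assumption here that the plotting layer draws "\n" in a label as a line break.

```r
labels <- c("american handsome guys", "italian ladies", "smart japanese")

# Hypothetical helper: break a label into lines shorter than `width`
# characters and join the pieces with newlines.
wrap_label <- function(s, width = 10) {
  paste(strwrap(s, width = width), collapse = "\n")
}

wrapped <- vapply(labels, wrap_label, character(1), USE.NAMES = FALSE)
cat(wrapped, sep = "\n---\n")
```

The wrapped strings could then be used as the factor levels or supplied as the axis labels of the plot.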

--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-solve-long-tick-labels-axis-text-x-tp4378760p4378760.html
Sent from the R help mailing list archive at Nabble.com.



[R] Detect numerical series

2012-02-11 Thread syrvn
Hello,

I am struggling with detecting successive digits in a numerical series
vector.

Here is an example:

vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)

I want to be able to detect 29, 30, 31 and 40, 41.

Then, I would like to delete the successive digits from the vector.

1, 15, 26, 29, 37, 40


Cheers

--
View this message in context: 
http://r.789695.n4.nabble.com/Detect-numerical-series-tp4379088p4379088.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Detect numerical series

2012-02-11 Thread David Winsemius


On Feb 11, 2012, at 10:01 AM, syrvn wrote:


Hello,

I am struggling with detecting successive digits in a numerical series
vector.

Here is an example:

vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)

I want to be able to detect 29, 30, 31 and 40, 41.

Then, I would like to delete the successive digits from the vector.

1, 15, 26, 29, 37, 40


 vec[ c(TRUE, !diff(vec) == 1) ]
#[1]  1 15 26 29 37 40
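
The one-liner above handles the deletion. For the detection step (listing the runs themselves), a possible base-R sketch is to label each run of consecutive values with a running counter; run_id, runs, consecutive and kept are illustrative names.

```r
vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)

# A new run starts wherever the step from the previous value is not 1
run_id <- cumsum(c(TRUE, diff(vec) != 1))
runs   <- split(vec, run_id)

# Runs with more than one element are the successive series
consecutive <- Filter(function(r) length(r) > 1, runs)
consecutive

# Keeping only the first element of every run reproduces the deletion
kept <- unname(vapply(runs, function(r) r[1], numeric(1)))
kept
```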




Cheers



David Winsemius, MD
West Hartford, CT



Re: [R] Detect numerical series

2012-02-11 Thread syrvn
That's great code. Thanks a lot!

--
View this message in context: 
http://r.789695.n4.nabble.com/Detect-numerical-series-tp4379088p4379133.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] naiveBayes: slow predict, weird results

2012-02-11 Thread Uwe Ligges
We don't have the data, but my guess is that you want to have some 
factors in your data that were integers when you tried the code below.


Uwe Ligges
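
A sketch of the conversion hinted at here, in base R. The small users data frame is made up for illustration. The background assumption, stated with some caution, is that naiveBayes() in e1071 treats numeric predictors as Gaussian and reports a mean and a standard deviation per class, which would explain the two columns of "conditional probabilities" in the question; converting the integer columns to factors makes them categorical instead.

```r
# Made-up stand-in for the real predictor table
users <- data.frame(clicks = c(0L, 2L, 0L, 5L),
                    visits = c(1L, 1L, 3L, 0L))

# Convert every integer column to a factor, leaving other columns untouched
is_int <- vapply(users, is.integer, logical(1))
users[is_int] <- lapply(users[is_int], factor)

str(users)
```

With 109 columns and many distinct values per column this may produce factors with many levels, so whether it is appropriate depends on what the integers represent.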


On 10.02.2012 03:43, Sam Steingold wrote:

I did this:
nb <- naiveBayes(users, platform)
pl <- predict(nb, users)
nrow(users) == 314781
ncol(users) == 109

1. naiveBayes() was quite fast (~20 seconds), while predict() was slow
(tens of minutes).  why?

2. the predict results were completely off the mark (quite the opposite
of the expected overfitting).  suffice it to show the tables:

pl:

   android blackberry       ipad     iphone         lg      linux        mac
         3          5         11         14     312723          5         11
    mobile      nokia    samsung    symbian    unknown    windows
      1864         17         16        112          0          0

platform:
   android blackberry       ipad     iphone         lg      linux        mac
     18013       1221       2647       1328          4       2936      34336
    mobile      nokia    samsung    symbian    unknown    windows
        18         88         39        103       2660     251388

i.e., nb classified nearly everything as lg while in the actual data
lg is virtually nonexistent.

3. when I print nb, I see A-priori probabilities (which are what I
expected) and Conditional probabilities which are confusing because
there are only two of them, e.g.:

  android    0.048464998 0.43946764
  blackberry 0.001638002 0.04045564
  ipad       0.322251606 1.84940588
  iphone     0.030873494 0.23250250
  lg         0.0 0.
  linux      0.023501362 0.34698919
  mac        0.082653774 1.22535027
  mobile     0.0 0.
  nokia      0.0 0.
  samsung    0.0 0.
  symbian    0.0 0.
  unknown    0.003759398 0.08219078
  windows    0.021158528 0.32916970

the predictors are integers.
is the first column for the 0 predictors and the second for all non-0?
Is there a way to ask naiveBayes to differentiate between non-0 values?

thanks!





Re: [R] Need to aggregate large dataset by week...

2012-02-11 Thread John Kane
If I understand what you want here are two possible ways to approach the 
problem.  One uses aggregate and one uses the reshape package to melt and cast 
the data into the form you want.

To use reshape you need to install the reshape package.

Assuming your dataset is named xx

aggregate(xx, by=list(xx$week), mean)

library(reshape)
mm <- melt(xx, id=c("week"))
cast(mm, week ~ variable, mean)




John Kane
Kingston ON Canada
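
A base-R sketch covering both parts of the question below: weekly means for several variables at once, and monthly means derived from a date column. The four-row data frame and its date column are made up for illustration; only the column names week, rainfall and temp come from the original post.

```r
# Made-up miniature of the data set, with dates as month/day/year strings
xx <- data.frame(date     = c("01/02/2011", "01/05/2011", "01/10/2011", "02/01/2011"),
                 week     = c(1, 1, 2, 5),
                 rainfall = c(0.2, 0.0, 0.2, 0.4),
                 temp     = c(1.15, 1.19, 1.36, 2.00))

# Weekly means: one row per week, one column per variable
weekly <- aggregate(cbind(rainfall, temp) ~ week, data = xx, FUN = mean)
weekly

# Monthly means: derive a year-month key from the dates first
xx$month <- format(as.Date(xx$date, format = "%m/%d/%Y"), "%Y-%m")
monthly <- aggregate(cbind(rainfall, temp) ~ month, data = xx, FUN = mean)
monthly
```

Note that the formula interface drops rows with missing values in any of the listed variables by default.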


 -Original Message-
 From: revda...@gmail.com
 Sent: Fri, 10 Feb 2012 04:55:44 -0800 (PST)
 To: r-help@r-project.org
 Subject: [R] Need to aggregate large dataset by week...
 
 Hi all,
 
 I have a large dataset with ~8600 observations that I want to compress to
 weekly means. There are 9 variables (columns), and I have already added a
 week column with 51 weeks. I have been looking at the functions:
 aggregate, tapply, apply, etc. and I am just not savvy enough with R to
 figure this out on my own, though I'm sure it's fairly easy. I also have
 the
 Dates (month/day/year) for all of the observations, but I figured just
 having a week column may be easier. If someone wanted to show me how to
 organize this data using a date function and aggregating by month that
 would
 be useful too!
 
 Here's an example of the data set, with only 5 of the variables and 10 of
 8600 obs.:
 
     week rainfall windspeed   winddir  temp oakdepth
 1      1   0.2000   0.89000  245.9200  1.15     4.40
 2      1   0.       0.84000  292.8800  1.19     5.30
 3      1   0.2000   0.74000  258.5400  1.36     6.00
 4      1   0.       0.93000    3.7000  1.43     4.40
 5      1   0.2000   0.69000   37.8200  1.56     5.20
 6      1   0.       0.8       17.2900  1.69     4.40
 7      1   0.2000   0.7       28.7300  1.88     5.00
 8      1   0.2000   1.12000  294.3700  1.93     6.00
 9      1   0.       1.21000  274.9700  1.80     4.40
 10     1   0.       1.31000  279.2400  1.86     5.80
 
 ...so after about 170 observations it changes to week 2, and so on.
 
 I've tried something like this, but its only one variable's mean, and I
 would rather have the rows=weeks and columns= the different variables.
 
  tapply(metdata$rainfall,metdata$week,FUN=mean)
   1   2   3   4   5   6
 0.080952381 0.101190476 0.379761905 0.179761905 0.0 0.295238095
   7   8   9  10  11  12
 0.146428571 0.015476190 0.16389 0.098809524 0.065476190 0.215476190
 
 Hope this is enough information and that I'm not just re-asking an old
 question. Thanks so much in advance for any help.
 
 
 




[R] object not found - Can not figure out why I get this error: Error in NROW(yCoordinatesOfLines) : object 'low' not found

2012-02-11 Thread Samo Pahor
Hi,

I have been using R for over a year now. I am a very happy user. Thank you
for making this happen.

This is my first question to this list.

I am trying to add some functions to quantmod that would enable me to draw
arbitrary lines and text and make sure they are redrawn. I have created the
following function:

require(quantmod)

# Add horizontal line to graph produced by quantmod::chart_Series()
add_HorizontalLine <- function(yCoordinatesOfLines, on=1, ...) {
  lenv <- new.env()
  lenv$add_horizontalline <- function(x, yCoordinatesOfLines, ...) {
    xdata <- x$Env$xdata
    xsubset <- x$Env$xsubset

    x0coords <- rep(1, NROW(yCoordinatesOfLines))
    x1coords <- rep(NROW(xdata[xsubset]), NROW(yCoordinatesOfLines))

    if ((NROW(x0coords) > 0) && (NROW(x1coords) > 0)) {
      segments(x0coords,
               yCoordinatesOfLines,
               x1coords,
               yCoordinatesOfLines, ...)
      #abline(h=yCoordinatesOfLines, ...)
    }
  }
  mapply(function(name, value) {assign(name,value,envir=lenv)},
         names(list(yCoordinatesOfLines=yCoordinatesOfLines,...)),
         list(yCoordinatesOfLines=yCoordinatesOfLines,...))
  exp <- parse(text=gsub("list","add_horizontalline",
               as.expression(substitute(list(x=current.chob(),
               yCoordinatesOfLines=yCoordinatesOfLines, ...)))), srcfile=NULL)
  plot_object <- current.chob()
  lenv$xdata <- plot_object$Env$xdata
  #plot_object$set_frame(sign(on)*abs(on)+1L)
  plot_object$set_frame(2*on)
  plot_object$add(exp,env=c(lenv, plot_object$Env),expr=TRUE)
  plot_object
}

# Short test function that uses add_HorizontalLine
test <- function(series, low=20, high=80) {
  chart_Series(SPX, subset="2012")
  add_TA(RSI(Cl(SPX)))
  plot(add_HorizontalLine(c(low, high), on=2, col=c('green', 'red'),
                          lwd=2))
}

# Actual test
SPX <- getSymbols("^GSPC", from="2000-01-01", auto.assign=FALSE)
dev.new()
test(SPX)

This gives me the following error:
 test(SPX)
Error in NROW(yCoordinatesOfLines) : object 'low' not found

What am I doing wrong here? Any hints highly appreciated.

The funniest thing is that this was working and I somehow broke it...

Best,
Samo



Re: [R] passing an extra argument to an S3 generic

2012-02-11 Thread ilai
You are setting a new class ("inflmlm") at the end of mlm.influence.
Remove that second-to-last line and enjoy your new S3 method.

I'm not sure, but I think the new class "inflmlm" applied
to infl in the formals of hatvalues.mlm confused the dispatch
mechanism. You would think the error message would name the offending
class, not "numeric"/"double", but that's above my pay grade...

You could probably put back the inflmlm class assignment with an
explicit call to UseMethod in hatvalues.mlm ?
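
Another thing that may be worth ruling out, offered as a sketch and not as a confirmed diagnosis: the generic is defined as function(model, ...), and R allows partial matching of argument names for formals that come before the dots. A call argument named m could therefore be matched to model, so that dispatch happens on the number 2 instead of on the fitted model, which would be consistent with the 'double', 'numeric' in the error message. The toy generic hv and the class foo below are made up for illustration.

```r
# Toy generic with the same signature as stats::hatvalues (hypothetical names)
hv <- function(model, ...) UseMethod("hv")
hv.default <- function(model, ...) "default"
hv.foo <- function(model, m = 1, ...) paste("foo method, m =", m)

obj <- structure(list(), class = "foo")

# m = 2 partially matches the formal 'model', so dispatch is on the number 2
partial <- hv(obj, m = 2)
print(partial)

# Naming the first argument exactly leaves m = 2 to fall through the dots
explicit <- hv(model = obj, m = 2)
print(explicit)
```

If that is what happens here, naming the first argument explicitly, as in hatvalues(model = Rohwer.mod, m = 2), should sidestep it.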

Cheers

On Fri, Feb 10, 2012 at 2:35 PM, Michael Friendly frien...@yorku.ca wrote:
 On 2/10/2012 4:09 PM, Henrik Bengtsson wrote:

 So people may prefer to do the following:

 hatvalues.mlm <- function(model, m=1, infl, ...)
 {
    if (missing(infl)) {
      infl <- mlm.influence(model, m=m, do.coef=FALSE);
    }

    hat <- infl$H
    m <- infl$m
    names(hat) <- if(m==1) infl$subsets else apply(infl$subsets,1,
 paste, collapse=',')
    hat
 }

 Thanks;  I tried exactly that, but I still can't pass m=2 to the mlm method
 through the generic

 hatvalues(Rohwer.mod)
         1          2          3          4          5          6          7
          8
 0.16700926 0.21845327 0.14173469 0.07314341 0.56821462 0.15432157 0.04530969
 0.17661104
         9         10         11         12         13         14         15
         16
 0.05131298 0.45161152 0.14542776 0.17050399 0.10374592 0.12649927 0.33246744
 0.33183461
        17         18         19         20         21         22         23
         24
 0.17320579 0.26353864 0.29835817 0.07880597 0.14023750 0.19380286 0.04455330
 0.20641708
        25         26         27         28         29         30         31
         32
 0.15712604 0.15333879 0.36726467 0.11189754 0.30426999 0.08655434 0.08921878
 0.07320950

 hatvalues(Rohwer.mod, m=2)
 Error in UseMethod("hatvalues") :
  no applicable method for 'hatvalues' applied to an object of class
 c('double', 'numeric')

 ## This works:
 hatvalues.mlm(Rohwer.mod, m=2)
   ... output snipped

 hatvalues

 function (model, ...)
 UseMethod("hatvalues")
 <bytecode: 0x021339e4>
 <environment: namespace:stats>



 -Michael


 --
 Michael Friendly     Email: friendly AT yorku DOT ca
 Professor, Psychology Dept.
 York University      Voice: 416 736-5115 x66249 Fax: 416 736-5814
 4700 Keele Street    Web:   http://www.datavis.ca
 Toronto, ONT  M3J 1P3 CANADA





[R] Derive pattern from vector

2012-02-11 Thread syrvn
Hello,

consider the following vector 'chars':


chars <- c("A", "B", "C", "C", "D", "E", "E", "E", "F", "F", "F")


I need to convert 'chars' into the following pattern:


1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 8

As soon as there are duplicates they get the same number otherwise it's
increasing numbers.

However, for the char 'F' it should be always increasing numbers. Is that
possible in R?


I used the following code:


chars <- c('A', 'B', 'C', 'C', 'D', 'E', 'E', 'E', 'F', 'F', 'F')

chars_dup <- duplicated(chars)

cumsum(!chars_dup)

 [1] 1 2 3 3 4 5 5 5 6 6 6


But I do not know how to treat 'F' in the way described above.


Regards



--
View this message in context: 
http://r.789695.n4.nabble.com/Derive-pattern-from-vector-tp4379312p4379312.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Derive pattern from vector

2012-02-11 Thread Petr Savicky
On Sat, Feb 11, 2012 at 09:11:12AM -0800, syrvn wrote:
 Hello,
 
 consider the following vector 'chars':
 
 
 chars <- c("A", "B", "C", "C", "D", "E", "E", "E", "F", "F", "F")
 
 
 I need to convert 'chars' into the following pattern:
 
 
 1, 2, 3, 3, 4, 5, 5, 5, 6, 7, 8
 
 As soon as there are duplicates they get the same number otherwise it's
 increasing numbers.
 
 However, for the char 'F' it should be always increasing numbers. Is that
 possible in R?
 
 
 I used the following code:
 
 
 chars <- c('A', 'B', 'C', 'C', 'D', 'E', 'E', 'E', 'F', 'F', 'F')

 chars_dup <- duplicated(chars)
   
 cumsum(!chars_dup)
 
  [1] 1 2 3 3 4 5 5 5 6 6 6
 
 
 But I do not know how to treat 'F' in the way described above.

Try this

  non_dup <- !duplicated(chars) | chars == 'F'
  cumsum(non_dup)
   [1] 1 2 3 3 4 5 5 5 6 7 8
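
If more than one value needs the "always increasing" treatment, the same idea can be wrapped in a small helper. This is just a sketch; the function name running_id and its always_new argument are made up here.

```r
# Assign increasing ids; values listed in always_new get a fresh id every time
running_id <- function(x, always_new = character(0)) {
  starts_new <- !duplicated(x) | x %in% always_new
  cumsum(starts_new)
}

chars <- c("A", "B", "C", "C", "D", "E", "E", "E", "F", "F", "F")
ids <- running_id(chars, always_new = "F")
print(ids)
```

Like the original duplicated() approach, this treats any repeat of a value as a duplicate, whether or not the repeats are adjacent.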

HTH.

Petr Savicky.



Re: [R] Derive pattern from vector

2012-02-11 Thread syrvn
fantastic. Thanks for that chunk of code. Works great! :)

--
View this message in context: 
http://r.789695.n4.nabble.com/Derive-pattern-from-vector-tp4379312p4379402.html
Sent from the R help mailing list archive at Nabble.com.



[R] Embed R code in online database

2012-02-11 Thread Jokel Meyer
Dear R-List!

I would like to embed R code in an online database such as i.e. a google
spreadsheet in way that users can add data to the database and that R's
calculations are updated automatically and i.e. given out in the
spreadsheet. Maybe even graphs could be updated online? Is there a way to
implement this?

Many thanks!
J.



[R] Counting occurences of variables in a dataframe

2012-02-11 Thread Kai Mx
Hi everybody,
I have a large dataframe similar to this one:
knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af')
kdate <- as.Date( c('20111001', '20111102', '20101001', '20100315',
'20101201', '20110105', '20101001', '20110504', '20110603', '20110201'),
format="%Y%m%d")
kdata <- data.frame (knames, kdate)
I would like to add a new variable to the dataframe counting the
occurrences of different values in knames in their order of appearance
(according to the date as indicated in kdate). The solution should be a
variable with the values 2,2,1,1,1,2,1,2,1,1. I could do it with a loop,
but there must be a more elegant way to do this.

Thanks!

Best,

Kai



Re: [R] Rpart and splitting criteria

2012-02-11 Thread Uwe Ligges



On 10.02.2012 10:37, MYRIAM TABASSO wrote:

Dear All,
I have questions about the function rpart  to construct a regression tree
in
R code.
My problem is how to change the splitting criteria.

In the rpart we have : parms=list(split=..) , I ask you if in this
command is it possible to use an another splitting criterion to substitute
the
default criteria( gini or information)?



No.

Uwe Ligges



Does someone can help me ?
Thank you,
Myriam Tabasso



Re: [R] updating one's own package

2012-02-11 Thread Uwe Ligges



On 11.02.2012 13:49, Wet Bell Diver wrote:

Win7 x64, R2.14.1

Dear list,

I would like to get into the habit of creating a package for each
project. That way I can package my project-specific functions and data
together and load or unload that at will. Also, it will allow easy
navigation through all the functions I will have available for each
project, since I can then use ?myfunction and immediately see the
details of that function (esp. useful when revisiting a project after
several months).
Anyway, I have been successful in constructing a package, using the
package.skeleton. My question is how to easily update the package when I
write an additional function or rewrite an existing function. Do I then
need to again build the package fully, or is there an incremental
update option somewhere? This I can not find anywhere, maybe I am
overlooking something?



You can use package.skeleton with force = FALSE in order not to 
overwrite existing files. Anyway, I typically edit the files of the 
package directly, without using any helper functions.
And if you add a new function, you can use prompt() to prepare a 
corresponding Rd file.
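
A minimal sketch of that last step, assuming a newly written function called myfunction (a made-up name). tempdir() only stands in for the man/ directory of the package source.

```r
# Hypothetical new function added to a package under development
myfunction <- function(x, na.rm = TRUE) mean(x, na.rm = na.rm)

# Write an Rd skeleton for it; in a real package the target would be man/
rd_file <- file.path(tempdir(), "myfunction.Rd")
prompt(myfunction, filename = rd_file)

# The skeleton contains \name, \usage, \arguments etc. to fill in by hand
print(head(readLines(rd_file)))
```

After editing the R and Rd files, the package still has to be reinstalled (for example with R CMD INSTALL on the source directory) before the changes show up in a session.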


Uwe Ligges






Thanks,
Peter



Re: [R] How to properly build model matrices

2012-02-11 Thread Uwe Ligges



On 09.02.2012 22:39, Yang Zhang wrote:

I always bump into a few (very minor) problems when building model
matrices with e.g.:

train = model.matrix(label~., read.csv('train.csv'))
target = model.matrix(label~., read.csv('target.csv'))

(1) The two may have different factor levels, yielding different
matrices.  I usually first rbind the data frames together to meld
the factors, and then split them apart and matrixify them.



You can preprocess the data and explicitly define the levels for factor 
variables in your data.frames.
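
A small sketch of what that preprocessing could look like. The data frames below are made-up stand-ins for train.csv and target.csv, and the one-sided formula is an additional suggestion rather than part of the reply above.

```r
# Made-up stand-ins for the two CSV files; target has no label column
train  <- data.frame(label = c(1, 0, 1), colour = c("red", "blue", "green"))
target <- data.frame(colour = c("blue", "blue", "red"))

# Define the full set of levels once and apply it to both data frames
colour_levels <- c("blue", "green", "red")
train$colour  <- factor(train$colour,  levels = colour_levels)
target$colour <- factor(target$colour, levels = colour_levels)

# A one-sided formula does not need a response, so no dummy labels are required
X_train  <- model.matrix(~ colour, data = train)
X_target <- model.matrix(~ colour, data = target)

print(colnames(X_train))
print(identical(colnames(X_train), colnames(X_target)))
```

Because the level "green" is declared for target even though it never occurs there, both matrices end up with the same columns.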




(2) The target set that I'm predicting on typically doesn't have
labels.  I usually manually append dummy labels to the target data
frame.


R cannot know labels if you do not provide any.


(3) I almost always remove the Intercept from the model matrices,
since it seems to always be redundant (I usually use caret).


Then change your model formula to: label ~ . - 1. But note the 
interpretation changes and it is *not* redundant in general.
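
To illustrate why the interpretation changes, here is a toy example (the data frame d is made up): dropping the intercept switches the first factor from treatment contrasts to a full set of indicator columns.

```r
d <- data.frame(f = factor(c("a", "b", "c")), x = c(1, 2, 3))

# With intercept: level "a" is absorbed into the intercept
with_int <- colnames(model.matrix(~ ., data = d))
print(with_int)

# Without intercept: every level of f gets its own column
without_int <- colnames(model.matrix(~ . - 1, data = d))
print(without_int)
```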


Uwe Ligges



None of these is a big deal at all, but I'm just curious if I'm
missing something simple in how I'm doing things.  Thanks.





Re: [R] Bug with memory allocation when loading Rdata files iteratively?

2012-02-11 Thread Uwe Ligges



On 10.02.2012 01:56, Janko Thyson wrote:

Dear list,

when iterating over a set of Rdata files that are loaded, analyzed and
then removed from memory again, I experience a *significant* increase in
an R process' memory consumption (killing the process eventually).

It just seems like removing the object via rm() and calling gc() do
not have any effect, so the memory consumption of each loaded R object
accumulates until there's no more memory left :-/

Possibly, this is also related to XML package functionality (mainly
htmlTreeParse and getNodeSet), but I also experience the described
behavior when simply iteratively loading and removing Rdata files.



Please provide a reproducible example. If you manage to produce one with 
XML only, report to its maintainer. If you manage to provide one without 
XML, report to R-devel.


But please try with recent versions of XML and R (both unstated in your 
message).


Uwe Ligges



I've put together a little example that illustrates the memory
ballooning mentioned above which you can find here:
http://stackoverflow.com/questions/9220849/significant-memory-issue-in-r-when-iteratively-loading-rdata-files-killing-the

Is this a bug? Any chance of working around this?

Thanks a lot and best regards,
Janko





Re: [R] Counting occurences of variables in a dataframe

2012-02-11 Thread Tal Galili
Hello Kai

This looks like a fun question.

Here is my solution, I'd be curious to see solutions by other people here.
It can also be tweaked in various ways, and easily put into a function
(actually, if you do it - please put it back online :) )
The only thing that might require some work is the rearranging of the
columns.

Cheers,
Tal



##
# Loading the functions
##
# Making sure we can source code from github
source("http://www.r-statistics.com/wp-content/uploads/2012/01/source_https.r.txt")
# This is based on code first discussed here:
## http://www.r-statistics.com/2012/01/printing-nested-tables-in-r-bridging-between-the-reshape-and-tables-packages/

# Reading in the function for using merge that preserves order
source_https("https://raw.github.com/talgalili/R-code-snippets/master/merge.data.frame.r")




##
# Make Data
knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af')
kdate <- as.Date( c('20111001', '20111102', '20101001', '20100315',
'20101201', '20110105', '20101001', '20110504', '20110603', '20110201'),
format="%Y%m%d")
kdata <- data.frame (knames, kdate)
kdata$kdate <- as.character(kdata$kdate)

##
# Calculate counts
tmp <- data.frame(table(kdata$kdate))
colnames(tmp)[1] <- "kdate"
tmp[,1] <- as.character(tmp[,1])

# Based on this:
# http://www.r-statistics.com/2012/01/merging-two-data-frame-objects-while-preserving-the-rows-order/
merge.data.frame(kdata, tmp, keep_order = "x")

### Solution:
        kdate knames Freq
9  2011-10-01     ab    1
10 2011-11-02     aa    1
2  2010-10-01     ac    2
1  2010-03-15     ad    1
4  2010-12-01     ab    1
5  2011-01-05     ac    1
3  2010-10-01     aa    2
7  2011-05-04     ad    1
8  2011-06-03     ae    1
6  2011-02-01     af    1






Contact
Details:---
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
--




On Sat, Feb 11, 2012 at 8:17 PM, Kai Mx govo...@gmail.com wrote:

 Hi everybody,
 I have a large dataframe similar to this one:
 knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af')
 kdate <- as.Date( c('20111001', '20111102', '20101001', '20100315',
 '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'),
 format="%Y%m%d")
 kdata <- data.frame (knames, kdate)
 I would like to add a new variable to the dataframe counting the
 occurrences of different values in knames in their order of appearance
 (according to the date as in indicated in kdate). The solution should be a
 variable with the values 2,2,1,1,1,2,1,2,1,1. I could do it with a loop,
 but there must be a more elegant way to this.

 Thanks!

 Best,

 Kai



Re: [R] debug in a loop

2012-02-11 Thread Duncan Murdoch

On 12-02-10 12:48 PM, Justin Haynes wrote:

You can add

if(is.na(tab[i])) browser()

or

if(is.na(tab[i])) break

see inline


You can also do this temporarily.  Supposing that you used 
source("foo.R") to enter a function with that code in it, and you want 
the check on line 10, you'd enter


setBreakpoint("foo.R#10", tracer=quote(if(is.na(tab[i])) browser()))

Duncan Murdoch



On Fri, Feb 10, 2012 at 7:22 AM, ikuzarraz...@hotmail.fr  wrote:


Hi,

I'd like to debug in a loop (using debug() and browser() etc. but not
print()). I am looking for the first occurrence of NA.
For instance:

tab = c(1:300)
tab[250] = NA
len = length(tab)
for (i in 1:len){
   if(i != len){


if(is.na(tab[i])) browser()


 tab[i] = tab[i]+tab[i+1]
   }
}

I do not want to do Browse[2]> n for each step ... I'd like to declare a
browser() in the loop with a condition. But how do I write "stop running
when you encounter NA"?

Thanks for your help



Re: [R] Counting occurences of variables in a dataframe

2012-02-11 Thread Petr Savicky
On Sat, Feb 11, 2012 at 07:17:54PM +0100, Kai Mx wrote:
 Hi everybody,
 I have a large dataframe similar to this one:
 knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af')
 kdate <- as.Date( c('20111001', '20111102', '20101001', '20100315',
 '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'),
 format="%Y%m%d")
 kdata <- data.frame (knames, kdate)
 I would like to add a new variable to the dataframe counting the
 occurrences of different values in knames in their order of appearance
 (according to the date as in indicated in kdate). The solution should be a
 variable with the values 2,2,1,1,1,2,1,2,1,1. I could do it with a loop,
 but there must be a more elegant way to this.

Hi.

Is the first 2 in the new variable due to the fact that
the name is ab and ab at row 5 has older date? If so,
then try the following

  ind <- order(kdata$kdate)
  f <- function(x) seq.int(along.with=x)
  kdata$x <- ave(1:nrow(kdata), kdata$knames[ind], FUN=f)[order(ind)]

 knames  kdate x
  1  ab 2011-10-01 2
  2  aa 2011-11-02 2
  3  ac 2010-10-01 1
  4  ad 2010-03-15 1
  5  ab 2010-12-01 1
  6  ac 2011-01-05 2
  7  aa 2010-10-01 1
  8  ad 2011-05-04 2
  9  ae 2011-06-03 1
  10 af 2011-02-01 1

kdata$knames[ind] orders the names by increasing date.
ave(...)[order(ind)] reorders the output of ave() to the original order.

Hope this helps.

Petr Savicky.



Re: [R] colnames documentation

2012-02-11 Thread Uwe Ligges



On 10.02.2012 04:53, R. Michael Weylandt wrote:

Consider the following in R 2.14.1 (seems to still be the case in Rdevel):

x <- matrix(1:9, 3)
colnames(x) # NULL as expected
colnames(x, do.NULL = TRUE) # NULL -- since we didn't change the default
colnames(x, do.NULL = FALSE) # col1 col2 col3

This doesn't really seem to square with the documentation which reads:

do.NULL: logical.  Should this create names if they are ‘NULL’?

The details section expounds and says:

If ‘do.NULL’ is ‘FALSE’, a character vector (of length ‘NROW(x)’
  or ‘NCOL(x)’) is returned in any case, prepending ‘prefix’ to
  simple numbers, if there are no dimnames or the corresponding
  component of the dimnames is ‘NULL’.

But I have to admit that I don't really get it. (The interpretation of
the docs; I understand the functionality) Could someone enlighten me?
Given what the details section says (and the behavior of the function
is), I'd expect something more like:

do.NULL: logical.  Is NULL an acceptable return value? If FALSE,
column names derived from prefix are returned.



Changed to

\item{do.NULL}{logical. If \code{FALSE} and names are \code{NULL}, names 
are created.}



Michael

PS -- In my searching, I think the link to the svn on the developer
page (http://developer.r-project.org/) is wrong: clicking it takes one
to what appears to be the same page: am I incorrect in assuming it
should link to http://svn.r-project.org/R for the current svn?



I think you followed the link to the svn sources of that developer page 
(rather than the software R).


Uwe





Re: [R] Package ICSNP

2012-02-11 Thread Uwe Ligges



On 10.02.2012 04:44, David Winsemius wrote:


On Feb 9, 2012, at 9:44 PM, David Winsemius wrote:



On Feb 9, 2012, at 5:37 PM, Miho Morimoto wrote:



install.packages("ICSNP")

--- Please select a CRAN mirror for use in this session ---
Warning: unable to access index for repository
http://www.stats.ox.ac.uk/pub/RWin/bin/windows/contrib/2.4


We are not even close to version 2.4.


Or maybe my memory is too short. I read this as being in the future but
now that I think about it further v2.4 was probably in 2004.



Actually, R-2.4.0 was released in October 2006.
ICSNP appeared in 2007 on CRAN.

Hence we never generated a binary for an R version that was already 
unsupported at that time. That means the OP should really upgrade R!





Maybe in another 15 years?

Furthermore the directory for version 2.14 does not have the target
package.


The CRAN one has.



What made you choose this repository over one of the standard
CRAN repos? (The repository at www.stat.ox.ac.uk has a rather
specialized reason for existence.)



This is a standard repository under Windows (in addition to ordinary 
CRAN) and contacted by default. The Warning comes from the fact that the 
repository for the outdated R version was removed in the meantime - I 
guess the OP has not changed the default, hence R also looked into CRAN 
but did not find the package anywhere.


Best,
Uwe Ligges






Warning in download.packages(pkgs, destdir = tmpd, available =
available,
:
no package 'ICSNP' at the repositories
Miho




David Winsemius, MD
Heritage Laboratories
West Hartford, CT



Re: [R] Getting codebook data into R

2012-02-11 Thread barny
Hi Eric - after seeing the difficulty of inputting this kind of data into R I
decided to use your method. It was rather painless using PSPP to do what I
wanted - however, how do I now create an SPSS file and then use the memisc
package to read it in?

--
View this message in context: 
http://r.789695.n4.nabble.com/Getting-codebook-data-into-R-tp4374331p4379433.html
Sent from the R help mailing list archive at Nabble.com.



[R] AMOVA error: 'bin' must be numeric or a factor

2012-02-11 Thread Hanne Ballestad

Hi!

I am trying to analyse my data using amova 
(http://www.oga-lab.net/RGM2/func.php?rd_id=pegas:amova):


My input to R is a DNA sequence file, format=fasta
	dna <- read.dna("XX.fasta", format="fasta") #left other options as default

d <- dist.dna(dna, model="raw")
g <- read.table("XXX.design")

Load necessary libraries:
library(pegas)
Loading required package: adegenet
Loading required package: MASS
Loading required package: ade4

Running Amova:
amova(d ~ g, nperm = 100)
Error in FUN(X[[1L]], ...) : 'bin' must be numeric or a factor


How can I solve this bin problem?
I think it might be a problem with the g variable. In the example they 
type

g <- factor(c(rep("A", 7), rep("B", 8)))

I cannot find any information about what this c(rep()) does. Does 
anyone know about a proper manual for the amova function?
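
For anyone puzzling over the same line, here is a small sketch of what it builds, plus one possible way to get a factor out of a design file. The two-column layout and the default column names V1 and V2 are assumptions based on the file listing in this post, not something confirmed in this thread.

```r
# rep("A", 7) repeats "A" seven times; c() joins the pieces; factor() encodes groups
g_example <- factor(c(rep("A", 7), rep("B", 8)))
print(table(g_example))

# Stand-in for read.table("XXX.design"): the result is a data frame, not a factor
design <- read.table(text = "sequenceA 1
sequenceB 1
sequenceD 5
sequenceF 9")
print(is.factor(design))

# Pull out the group column and convert it to a factor
g <- factor(design$V2)
print(is.factor(g))
print(levels(g))
```

Help for the individual pieces is available via ?rep, ?c and ?factor.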

My input file for g looks like this:

sequenceA   1
sequenceB   1
sequenceC   1
.
.
.
.
sequenceD   5
.
.

sequenceE   9
sequenceF   9


Where sequenceA is in group 1, sequenceD is in group 5 and so on...

If I type is.factor(g)
I get FALSE

I have also checked that the d (a matrix file) is a numeric file. It 
should be correct.


is.numeric(d)
TRUE


Any help will be very much appreciated!

Cheers,

Hanne



Re: [R] Counting occurences of variables in a dataframe

2012-02-11 Thread David Winsemius


On Feb 11, 2012, at 1:17 PM, Kai Mx wrote:


Hi everybody,
I have a large dataframe similar to this one:
knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad','ae', 'af')
kdate <- as.Date( c('20111001', '20111102', '20101001', '20100315',
'20101201', '20110105', '20101001', '20110504', '20110603', '20110201'),
format="%Y%m%d")
kdata <- data.frame (knames, kdate)


  ave(unclass(kdate), knames, FUN=order )
 [1] 2 2 1 1 1 2 1 2 1 1


That was actually not using the dataframe values but you could also do  
this:


 kdata$ord <- with(kdata, ave(unclass(kdate), knames, FUN=order ))
 kdata
   knames  kdate ord
1  ab 2011-10-01   2
2  aa 2011-11-02   2
3  ac 2010-10-01   1
4  ad 2010-03-15   1
5  ab 2010-12-01   1
6  ac 2011-01-05   2
7  aa 2010-10-01   1
8  ad 2011-05-04   2
9  ae 2011-06-03   1
10 af 2011-02-01   1
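
One caveat, offered as a sketch rather than a correction: order() returns the permutation that sorts a group, which coincides with the within-group rank here only because no name occurs more than twice. With three or more dates per name, rank() may be the safer choice. The small data set below is made up.

```r
# One name with three dates: latest, earliest, middle
nm <- c("ab", "ab", "ab")
d  <- as.Date(c("2011-10-01", "2010-12-01", "2011-03-01"))

# order(): which positions would sort the dates
by_order <- ave(unclass(d), nm, FUN = order)
print(by_order)

# rank(): the chronological position of each row
by_rank <- ave(unclass(d), nm, FUN = rank)
print(by_rank)
```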


I would like to add a new variable to the dataframe counting the
occurrences of different values in knames in their order of appearance
(according to the date as in indicated in kdate). The solution  
should be a
variable with the values 2,2,1,1,1,2,1,2,1,1. I could do it with a  
loop,

but there must be a more elegant way to this.

Thanks!

Best,

Kai



David Winsemius, MD
West Hartford, CT



[R] obtaining a true/false vector with combination of strsplit, length, unlist,

2012-02-11 Thread emorway
Hi, 

A pared down version of the dataset I'm working with:

edm<-read.table(textConnection("WELLID X_GRID Y_GRID LAYER ROW COLUMN SPECIES CALCULATED OBSERVED
w301_3       4428.       1389  2  6  18 1 3558 6490.
w304_12      4836.       6627  2 27  20 1 3509 3228.
02_10_12080  3.6125E+04  13875 1 56 145 1 2774 -999.0
02_10_12080  3.6125E+04  13875 1 56 145 1 2774 -999.0
02_10_12081  3.6375E+04  13875 1 56 146 1 3493 -999.0
02_10_12092  3.9125E+04  13875 1 56 157 1 4736 -999.0
w305_12      2962.       7326  2 30  12 1 4575 5899."),header=T)
closeAllConnections() 

I'm having a hard time coming up with R code that produces a
TRUE/FALSE vector based on whether each entry in the first column of the
data.frame edm splits into 2 or 3 pieces on "_". To show what I mean,
going row by row, I could do the following:

> length(strsplit(as.character(edm$WELLID),"_")[[1]])==3
[1] FALSE
> length(strsplit(as.character(edm$WELLID),"_")[[2]])==3
[1] FALSE
> length(strsplit(as.character(edm$WELLID),"_")[[3]])==3
[1] TRUE
> length(strsplit(as.character(edm$WELLID),"_")[[4]])==3
[1] TRUE
> length(strsplit(as.character(edm$WELLID),"_")[[5]])==3
[1] TRUE
> length(strsplit(as.character(edm$WELLID),"_")[[6]])==3
[1] TRUE
> length(strsplit(as.character(edm$WELLID),"_")[[7]])==3
[1] FALSE

I've fumbled around trying to come up with a line of R code that would
create a vector that looks like:  FALSE FALSE TRUE TRUE TRUE TRUE FALSE

The final goal is to use this vector to create two new data.frames, where,
for example, the first contains all the rows of edm in which the first
column has a length of 2 when split using a "_" character.  The second
data.frame would contain all the rows in which the first column has a length
of 3 when split using a "_" character.

Thanks,
Eric

--
View this message in context: 
http://r.789695.n4.nabble.com/obtaining-a-true-false-vector-with-combination-of-strsplit-length-unlist-tp4380050p4380050.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] obtaining a true/false vector with combination of strsplit, length, unlist,

2012-02-11 Thread Sarah Goslee
You are so very close:
> sapply(edm[,1], function(x) length(strsplit(as.character(x), "_")[[1]]) == 3)
[1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE

Thanks for providing a small reproducible example. dput() tends
to work better for that than textConnection(), because many email
clients add arbitrary newlines, messing up the text formatting.
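
The original post also asked for two data.frames built from that logical vector. One possible sketch follows; the cut-down `edm` (two columns only) and the names `n_parts`, `edm2` and `edm3` are assumptions made for illustration.

```r
# Cut-down stand-in for the poster's data (only two of the columns).
edm <- data.frame(
  WELLID = c("w301_3", "w304_12", "02_10_12080", "02_10_12080",
             "02_10_12081", "02_10_12092", "w305_12"),
  ROW    = c(6, 27, 56, 56, 56, 56, 30),
  stringsAsFactors = FALSE
)

# Number of pieces each WELLID splits into on a literal "_".
n_parts <- sapply(strsplit(as.character(edm$WELLID), "_", fixed = TRUE), length)

three <- n_parts == 3
edm3 <- edm[three, ]    # rows whose WELLID has three parts
edm2 <- edm[!three, ]   # remaining rows (two parts in this example)
```

Indexing with the logical vector and its negation keeps every row in exactly one of the two data.frames.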

Sarah

On Sat, Feb 11, 2012 at 4:51 PM, emorway emor...@usgs.gov wrote:
 edm <- read.table(textConnection("WELLID        X_GRID Y_GRID LAYER ROW COLUMN
 SPECIES CALCULATED     OBSERVED
 w301_3          4428.       1389     2   6     18       1       3558
 6490.
 w304_12         4836.       6627     2  27     20       1       3509
 3228.
 02_10_12080    3.6125E+04  13875     1  56    145       1       2774
 -999.0
 02_10_12080    3.6125E+04  13875     1  56    145       1       2774
 -999.0
 02_10_12081    3.6375E+04  13875     1  56    146       1       3493
 -999.0
 02_10_12092    3.9125E+04  13875     1  56    157       1       4736
 -999.0
 w305_12         2962.       7326     2  30     12       1       4575
 5899."),header=T)
 closeAllConnections()



-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] obtaining a true/false vector with combination of strsplit, length, unlist,

2012-02-11 Thread Phil Spector

It sounds like the problem boils down to counting the number
of "_"s in the WELLID variable, and seeing if there are two:

> nchar(gsub('[^_]','',edm$WELLID)) == 2
[1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE FALSE

- Phil Spector
 Statistical Computing Facility
 Department of Statistics
 UC Berkeley
 spec...@stat.berkeley.edu





Re: [R] colnames documentation

2012-02-11 Thread R. Michael Weylandt
Thanks Uwe.

Michael

2012/2/11 Uwe Ligges lig...@statistik.tu-dortmund.de:


 On 10.02.2012 04:53, R. Michael Weylandt wrote:

 Consider the following in R 2.14.1 (seems to still be the case in Rdevel):

 x <- matrix(1:9, 3)
 colnames(x) # NULL as expected
 colnames(x, do.NULL = TRUE) # NULL -- since we didn't change the default
 colnames(x, do.NULL = FALSE) # col1 col2 col3

 This doesn't really seem to square with the documentation which reads:

 do.NULL: logical.  Should this create names if they are ‘NULL’?

 The details section expounds and says:

 If ‘do.NULL’ is ‘FALSE’, a character vector (of length ‘NROW(x)’
      or ‘NCOL(x)’) is returned in any case, prepending ‘prefix’ to
      simple numbers, if there are no dimnames or the corresponding
      component of the dimnames is ‘NULL’.

 But I have to admit that I don't really get it. (The interpretation of
 the docs; I understand the functionality) Could someone enlighten me?
 Given what the details section says (and the behavior of the function
 is), I'd expect something more like:

 do.NULL: logical.  Is NULL an acceptable return value? If FALSE,
 column names derived from prefix are returned.



 Changed to

 \item{do.NULL}{logical. If \code{FALSE} and names are \code{NULL}, names are
 created.}
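
For readers following along, the behaviour under discussion can be reproduced with a short sketch. The variable names are illustrative; the `prefix` argument is the one mentioned in the details section quoted above.

```r
x <- matrix(1:9, 3)

no_names <- colnames(x)                                # NULL: no dimnames set
made_up  <- colnames(x, do.NULL = FALSE)               # names are created
custom   <- colnames(x, do.NULL = FALSE, prefix = "V") # same, with another prefix
rows     <- rownames(x, do.NULL = FALSE)

made_up  # "col1" "col2" "col3"
custom   # "V1" "V2" "V3"
rows     # "row1" "row2" "row3"
```

With do.NULL = FALSE the function never returns NULL, which matches the reworded documentation entry.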


 Michael

 PS -- In my searching, I think the link to the svn on the developer
 page (http://developer.r-project.org/) is wrong: clicking it takes one
 to what appears to be the same page: am I incorrect in assuming it
 should link to http://svn.r-project.org/R for the current svn?



 I think you followed the link to the svn sources of that developer page
 (rather than the software R).

 Uwe





[R] How to see a R function's code

2012-02-11 Thread Colstat
I was wondering how do I actually see what's inside a function, say,
density of t distribution, dt()?

I know for some, I can type the function name inside R and the code will be
displayed.  But for dt(), I get
> dt
function (x, df, ncp, log = FALSE)
{
    if (missing(ncp))
        .Internal(dt(x, df, log))
    else .Internal(dnt(x, df, ncp, log))
}
<environment: namespace:stats>

I am curious because I am doing rejection sampling and want to find a
bigger distribution.



Re: [R] Getting codebook data into R

2012-02-11 Thread Daniel Nordlund
 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
 On Behalf Of barny
 Sent: Saturday, February 11, 2012 10:04 AM
 To: r-help@r-project.org
 Subject: Re: [R] Getting codebook data into R
 
 Hi Eric - after seeing the difficulty of inputting this kind of data into
 R I
 decided to use your method. It was rather painless using PSPP to do what I
 wanted - however, how do I now create an SPSS file and then use the memisc
 package to read it in?
 

There is SPSS code for reading the files on the codebook page

http://www.cdc.gov/nchs/nsfg/nsfg_2006_2010_puf.htm#codebooks

hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA



Re: [R] How to see a R function's code

2012-02-11 Thread Joshua Wiley
Hi,

Section 2 of the R Internals manual gives you some information.
Assuming you have the source code,

path_to_R/src/main/names.c

holds the look up table.  I am pretty sure that dt is one of the
do_math* group (maybe math2??) so arithmetic.c may be useful.  These
are all text files so you can search in the source, but as these are
pretty low level functions, I would expect it to take some time and
effort to see and understand the code you want.  Someone else on the
list may know an easier way or know straight where to go for your
particular purpose.
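
Before going to the C sources, it can help to check from within R how far the R-level code goes. The sketch below uses only base functions; the name `uses_compiled` is illustrative, and the exact hand-off call (.Internal in the output quoted in this thread, possibly a different interface in other R versions) is an assumption that depends on the R version.

```r
f <- stats::dt

args(f)   # the argument list
body(f)   # the R-level body; for dt this is only a thin wrapper

# Does the body hand off to compiled code? all.names() lists every
# symbol and function name that appears in the expression.
calls <- all.names(body(f))
uses_compiled <- any(c(".Internal", ".Call", ".External", ".Primitive") %in% calls)
uses_compiled
```

If this returns TRUE, the actual computation lives in the C sources. For the distribution functions that is, as far as I know, the src/nmath directory of the R source tree (files such as dt.c and dnt.c) rather than arithmetic.c itself.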

Cheers,

Josh

On Sat, Feb 11, 2012 at 2:19 PM, Colstat cols...@gmail.com wrote:
 I was wondering how do I actually see what's inside a function, say,
 density of t distribution, dt()?

 I know for some, I can type the function name inside R and the code will be
 displayed.  But for dt(), I get
 dt
 function (x, df, ncp, log = FALSE)
 {
    if (missing(ncp))
        .Internal(dt(x, df, log))
    else .Internal(dnt(x, df, ncp, log))
 }
 <environment: namespace:stats>

 I am curious because I am doing rejection sampling and want to find a
 bigger distribution.




-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/



Re: [R] multiple histograms from a dataframe

2012-02-11 Thread Adel ESSAFI
Le 11 février 2012 02:33, David Winsemius dwinsem...@comcast.net a écrit :


 On Feb 10, 2012, at 7:05 PM, Adel ESSAFI wrote:

  Hi list


 I  need some help for drawing some histograms

 I have a dataframe , say,
 X Y Z T

 I want to draw a histogram Z-T for each value of the couple (X-Y).
 When I use this syntax

  library(lattice)
 histogram(law[,3] ~ law[,66] | law[,1] )


 Perhaps (but untested in the absence of data);

  histogram( Z ~ T | interaction(X, Y)  , data=dfrmname )

 Thanks ,
that helped a lot.

now, I have another problem: I want to draw several (two) figures together.
The par(new=T) directive does not recognize the plot produced by the lattice
library.
When I tried:
 xyplot(law[,66] ~ law[,3]| interaction(law[,1],law[,2]),type='l')
 par(new=T)
*Warning message:
In par(new = T) : calling par(new=TRUE) with no plot*
 xyplot(law[,67] ~ law[,3]| interaction(law[,1],law[,2]),type='l')

and the second xyplot() draws a new figure.

what can I do to draw two figures together using lattice?
Thanks









 it draws multiple histograms but by selecting distinct values of  law[,1]
 The deal is to make the same thing but for a couple of columns

 Thanks in advance for help

 Adel


 --

 David Winsemius, MD
 West Hartford, CT




-- 
PhD candidate in Computer Science
Address
3 avenue lamine, cité ezzahra, Sousse 4000
Tunisia
tel: +216 97 246 706 (+33640302046 jusqu'au 15/6)
fax: +216 71 391 166



Re: [R] Detect numerical series

2012-02-11 Thread Carl Witthoft
This was actually answered a couple of times on StackOverflow.  Someone and
I independently wrote up the following function, stolen directly from the
source for rle().



# extended version of rle to find all sorts of sequences
# if incr=0, this is rle

seqle <- function (x, incr=1)
{
    if (!is.vector(x) && !is.list(x))
        stop("'x' must be an atomic vector")
    n <- length(x)
    if (n == 0L)
        return(structure(list(lengths = integer(), values = x),
            class = "rle"))
    y <- x[-1L] != x[-n] + incr
    i <- c(which(y | is.na(y)), n)
    structure(list(lengths = diff(c(0L, i)), values = x[i]),
        class = "rle")
}

quote
From: David Winsemius dwinsemius_at_comcast.net
Date: Sat, 11 Feb 2012 10:08:17 -0500
On Feb 11, 2012, at 10:01 AM, syrvn wrote:

 Hello,

 I am struggling with detecting successive digits in a numerical series
 vector.

 Here is an example:

 vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)

 I want to be able to detect 29, 30, 31 and 40, 41.

 Then, I would like to delete the successive digits from the vector.

 1, 15, 26, 29, 37, 40
  vec[ c(TRUE, !diff(vec) == 1) ]
#[1] 1 15 26 29 37 40
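
The diff() idiom can also be used to label the runs, which covers the "detect" half of the question. A minimal sketch, where the names `starts`, `run_id`, `runs` and `long_runs` are illustrative:

```r
vec <- c(1, 15, 26, 29, 30, 31, 37, 40, 41)

# TRUE wherever a new run begins (the step from the previous value is not 1).
starts <- c(TRUE, diff(vec) != 1)

# Running count of run starts gives every element a run label.
run_id <- cumsum(starts)
runs <- split(vec, run_id)

# Runs of successive numbers are those with more than one element.
long_runs <- Filter(function(r) length(r) > 1, runs)
long_runs      # 29 30 31 and 40 41

# Keeping only the first element of each run drops the successors.
vec[starts]    # 1 15 26 29 37 40
```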

--

Sent from my Cray XK6
Quidvis recte factum, quamvis humile, praeclarum.



[R] Using igraph: community membership of components built by decompose.graph()

2012-02-11 Thread Natalia Pobedina
Hi everyone!

I would appreciate help with using decompose.graph(), community
detection functions from igraph and lapply().

I have an igraph object G with vertex attribute label and edge
attribute weight. I want to calculate community
memberships using different functions from igraph, for simplicity let
it be walktrap.community.
This graph is not connected, that is why I decided to decompose it
into connected components and run
 walktrap.community on each component, and afterwards add a
community membership vertex attribute to the
original graph G.

I am doing currently the following

comps <- decompose.graph(G,min.vertices=2)
communities <- lapply(comps,walktrap.community)

At this point I get stuck since I get the list object with the
structure I cannot figure out. The documentation on
decompose.graph tells only that it returns list object, and when I
use lapply on the result I get completely
confused in the results. Moreover, the communities are numbered from 0
in each component, and I don't know
how to supply weights parameter into walktrap.community function.

If it were not for the components, I would have done the following:
wt <- walktrap.community(G, modularity=TRUE, weights=E(G)$weight)
wmemb <- community.to.membership(G, wt$merges,
steps=which.max(wt$modularity)-1)
V(G)$walktrap <- wmemb$membership

Could anyone please help me solve this issue? Or provide some
information/links which could help?
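
One part of the question, community numbers restarting in every component, can be sketched without igraph at all. The example below assumes the per-component memberships have already been extracted as named vectors with ids starting at 1 (add 1 first if they start at 0); `memb_list` and its vertex names are made up, and no igraph function is called.

```r
# Hypothetical per-component memberships, named by vertex.
memb_list <- list(
  c(a = 1, b = 1, c = 2),
  c(d = 1, e = 2),
  c(f = 1, g = 1)
)

# Shift each component by the number of communities found before it,
# so that ids are unique across the whole graph.
n_comm  <- sapply(memb_list, max)
offsets <- cumsum(c(0, head(n_comm, -1)))
global  <- unlist(Map(`+`, memb_list, offsets))

global
# a b c d e f g
# 1 1 2 3 4 5 5
```

The resulting named vector could then be matched back to the original graph by vertex name.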

Thanks and best wishes,
Natalia



Re: [R] Install the rugarch-package

2012-02-11 Thread liweichen0817
I have a problem installing rugarch, too.

I use R 2.14.1 on Mac OS X 10.7.3.  When I tried to load rugarch, I got the
following error message:

Loading required package: Rcpp
Loading required package: RcppArmadillo
Loading required package: numDeriv
Loading required package: chron
Loading required package: Rsolnp
Loading required package: truncnorm
Rsolnp (version 1.11) initialized.
Package rugarch (1.0-7) loaded.  To cite, see citation("rugarch")

Error in as.environment(pos) : 
  no item called "newtable" on the search list
In addition: Warning message:
In objects(newtable, all.names = TRUE) :
  ‘newtable’ converted to character string
Error: package/namespace load failed for ‘rugarch’

I have tried removing all related packages and reinstalling them but the
error still exists. I appreciate if someone can help me resolve this issue.

--
View this message in context: 
http://r.789695.n4.nabble.com/Install-the-rugarch-package-tp3911903p4380077.html
Sent from the R help mailing list archive at Nabble.com.



[R] Get identical results for parallel and sequential?

2012-02-11 Thread slbfelix
Hi All, 

I have a question about R parallel computing by using snowfall. 

How can I set the seeds on parallel workers to get the same result as
sequential mode? 

For example: 

> sfSapply(c(1,1),rnorm)
[1]  1.823082 -2.222052
> rnorm(2)
[1] -0.5179967 -1.0807196

How to get the identical result? 
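
One way to make the two modes agree is to seed inside each task, so that a draw depends only on the task index and not on which process runs it. The sketch below is an assumption-laden illustration: it uses the base `parallel` package rather than snowfall, and the function `task` and the base seed 42 are made up.

```r
library(parallel)

# Each task sets its own seed from its index, then draws one value.
task <- function(i, seed = 42) {
  set.seed(seed + i)
  rnorm(1)
}

seq_res <- sapply(1:2, task)          # sequential

cl <- makeCluster(2)                  # two local worker processes
par_res <- parSapply(cl, 1:2, task)   # parallel
stopCluster(cl)

identical(seq_res, par_res)
```

The same task function could presumably be passed to sfSapply() in the same way. Note that clusterSetRNGStream() in `parallel` makes parallel runs reproducible among themselves, but its streams are not meant to match a plain sequential rnorm(2).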

Thanks. 
Libo

--
View this message in context: 
http://r.789695.n4.nabble.com/Get-identical-results-for-parallel-and-sequential-tp4380110p4380110.html
Sent from the R help mailing list archive at Nabble.com.



[R] R Parallel question

2012-02-11 Thread slbfelix
Hi All,

I have a question about R parallel computing by using snowfall.

How can I set the seeds on parallel workers to get the same result as
sequential mode?

For example:

 sfSapply(c(1,1),rnorm)
[1]  1.823082 -2.222052
 rnorm(2)
[1] -0.5179967 -1.0807196

How to get the identical result?

Thanks.
Libo Sun

Graduate Student,
Department of Statistics,
Colorado State University
Fort Collins, CO


--
View this message in context: 
http://r.789695.n4.nabble.com/R-Parallel-question-tp4380098p4380098.html
Sent from the R help mailing list archive at Nabble.com.



[R] Reading data from a worksheet on the Internet

2012-02-11 Thread Nilza BARROS
Dear R-users,

I have to read data from a worksheet that is available on the Internet. I
have been doing this by copying the worksheet from the browser,
but I would like to copy the data automatically using the url
command.

When I use the url command, however, the result is the page source, that is,
HTML code.
I can see that the data I need is in the source, but before trying to
parse the HTML myself I wonder whether there is a package or
another way to extract these data, since reading them from the code would
take a lot of work and might not be very accurate.

Below is the page from which I am trying to export the data:

dados <- url("http://www.mar.mil.br/dhn/chm/meteo/prev/dados/pnboia/sc1201_arquivos/sheet002.htm", "r")
I am looking forward to any help.

Thanks in advance ,

Nilza Barros



Re: [R] Lattice 3d coordinate transformation

2012-02-11 Thread ilai
Thank you Deepayan, your answer put me on the path to SOLVED !!!
Actually passing projected corners to panel.rect was the first thing I
tried, but couldn't get it to work. However, panel.3dpolygon in
latticeExtra did the trick.
I'm posting it here for completion.

require(lattice) ; require(latticeExtra)
set.seed(1113)
d <- data.frame(x=runif(30),y=runif(30),g=gl(2,15))
d$z <- with(d,rnorm(30,3*asin(x^2)-5*y^as.integer(g),.1))
d$z <- d$z+min(d$z)^2
surf <- by(d,d$g,function(D){
  fit <- lm(z~poly(x,2)*poly(y,2),data=D)
  outer(seq(0,1,l=10),seq(0,1,l=10),function(x,y,...)
        predict(fit,data.frame(x=x,y=y)))
})
panel.3d.surf <- function(x, y, z, rot.mat, distance, zlim.scaled, ...){
    zz <- surf[[packet.number()]] ; n <- nrow(zz)
    lp <- level.colors(zz, at = do.breaks(range(zz), 20),
                       col.regions = heat.colors(20))
    s <- seq(-.5,.5,l=n) ; cntrds <- expand.grid(s,s) ; index <- 0
    apply(cntrds,1,function(i){
      ## '<<-' is assumed here (the archive stripped every '<'): the
      ## counter has to persist across cells to pick a colour per cell
      index <<- index+1
      xx <- i[1]+c(-.5,-.5,.5,.5)/(n-1) ; yy <- i[2]+c(-.5,.5,.5,-.5)/(n-1)
      panel.3dpolygon(xx,yy, zlim.scaled[1], rot.mat, distance,
                      border=lp[index], col=lp[index],...)
    })
    panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled, ...)
  }
cloud(z~x+y|g,data=d,layout=c(2,1), type='h', panel.3d.cloud = panel.3d.surf,
      zoom = 1,screen=list(z= 21,y=0,x=-60),aspect = c(1,1), panel.aspect = 1,
      scales=list(z=list(arrows=F,tck=0),x=list(distance=.75)),
      par.box=list(lwd=NA),lwd=3)

## Beautiful !

On Sat, Feb 11, 2012 at 6:00 AM, Deepayan Sarkar
deepayan.sar...@gmail.com wrote:
 On Fri, Feb 10, 2012 at 12:43 AM, ilai ke...@math.montana.edu wrote:
 Hello List!
 I asked this before (with no solution), but maybe this time... I'm
 trying to project a surface to the XY under a 3d cloud using lattice.
 I can project contour lines following the code for fig 13.7 in
 Deepayan Sarkar's Lattice, Multivariate Data Visualization with R,
 but it fails when I try to color them in using panel.levelplot.
 ?utilities.3d says there may be some bugs, and I think
 ltransform3dto3d() is not precise (where did I hear that?), but is
 this really the source of my problem? Is there a (simple?) workaround,
 maybe using 3d.wire but projecting it to XY? How? Please, any insight
 may be useful.

 I don't think this will be that simple. panel.levelplot() essentially
 draws a bunch of colored rectangles. For a 3D projection, each of
 these will become (four-sided) polygons. You need to compute the
 coordinates of those polygons, figure out their fill colors (possibly
 using ?level.colors) and then draw them.

 -Deepayan


 Thanks in advance,
 Elai.

 A working example:

  ## data d and predicted surf:
 set.seed(1113)
 d <- data.frame(x=runif(30),y=runif(30),g=gl(2,15))
 d$z <- with(d,rnorm(30,3*asin(x^2)-5*y^as.integer(g),.1))
 d$z <- d$z+min(d$z)^2
 surf <- by(d,d$g,function(D){
  fit <- lm(z~poly(x,2)*poly(y,2),data=D)
  outer(seq(0,1,l=10),seq(0,1,l=10),function(x,y,...)
 predict(fit,data.frame(x=x,y=y)))
 })
 ##
 # This works to get contours:
 require(lattice)
 cloud(z~x+y|g,data=d,layout=c(2,1), type='h', lwd=3, par.box=list(lty=0),
       scales=list(z=list(arrows=F,tck=0)),
       panel.3d.cloud = function(x, y, z,rot.mat, distance,
 zlim.scaled, nlevels=20,...){
         add.line <- trellis.par.get("add.line")
         clines <- contourLines(surf[[packet.number()]],nlevels = nlevels)
         for (ll in clines) {
           m <- ltransform3dto3d(rbind(ll$x-.5, ll$y-.5,
 zlim.scaled[1]), rot.mat,
                                 distance)
           panel.lines(m[1,], m[2,], col = add.line$col, lty = add.line$lty,
                       lwd = add.line$lwd)
         }
         panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled =
 zlim.scaled, ...)
       }
       )
 # But using levelplot:
 panel.3d.levels <- function(x, y, z,rot.mat, distance, zlim.scaled,...)
 {
     zz <- surf[[packet.number()]]
     n <- nrow(zz)
     s <- seq(-.5,.5,l=n)
     m <- ltransform3dto3d(rbind(rep(s,n),rep(s,each=n),zlim.scaled[1]),
                           rot.mat, distance)
     panel.levelplot(m[1,],m[2,],zz,1:n^2,col.regions=heat.colors(20))
     panel.3dscatter(x, y, z, rot.mat, distance, zlim.scaled = zlim.scaled,
 ...)
   }
 cloud(z~x+y|g,data=d,layout=c(2,1), type='h', panel.3d.cloud =
 panel.3d.levels,
       scales=list(z=list(arrows=F,tck=0)),par.box=list(lty=0),lwd=3)
 # I also tried to fill between contours but can't figure out what to
 do with the edges and how to incorporate the x,y limits to 1st and nth
 levels.
 panel.3d.contour <- function(x, y, z,rot.mat, distance,xlim,ylim,
 zlim.scaled,nlevels=20,...)
 {
     add.line <- trellis.par.get("add.line")
     zz <- surf[[packet.number()]]
     clines <- contourLines(zz,nlevels = nlevels)
     colreg <- heat.colors(max(unlist(lapply(clines,function(ll) ll$level
     for (i in 2:length(clines)) {
       ll <- clines[[i]]
       ll0 <- clines[[i-1]]
       m <- ltransform3dto3d(rbind(ll$x-.5, ll$y-.5, zlim.scaled[1]),
 rot.mat, distance)
       m0 <- 

Re: [R] how to plot a nice legend?

2012-02-11 Thread Jonas Stein
 There are various alternatives available; you can also write your own, 
 by modifying the standard one.

 Generally there are lots of possibilities for customizing within the 
 standard one; e.g. y.intersp will affect the line spacing, using a 
 negative value for inset (together with xpd=NA) will allow the legend to 
 be moved outside the plot.

i tried without success:

plot(1:10)
legend(1,3, legend=c("one", "two"), inset=-1, xpd=NA)

The legend is still placed inside the plot on point (1,3)

What could i have done wrong?

Can i include a legend like this in a standard plot like 
plot(1:10) too?
http://www.r-bloggers.com/wp-content/uploads/2011/03/heatmap.png

kind regards,

-- 
Jonas Stein n...@jonasstein.de



Re: [R] how to plot a nice legend?

2012-02-11 Thread Sarah Goslee
Hi,

On Sat, Feb 11, 2012 at 8:07 PM, Jonas Stein n...@jonasstein.de wrote:
 There are various alternatives available; you can also write your own,
 by modifying the standard one.

 Generally there are lots of possibilities for customizing within the
 standard one; e.g. y.intersp will affect the line spacing, using a
 negative value for inset (together with xpd=NA) will allow the legend to
 be moved outside the plot.

 i tried without success:

 plot(1:10)
 legend(1,3, legend=c("one", "two"), inset=-1, xpd=NA)

 The legend is still placed inside the plot on point (1,3)

 What could i have done wrong?

Wrong? Nothing. You told R to put the legend at c(1,3)
so it did. If you want it elsewhere you need to specify that.
legend(-1,3, legend=c("one", "two"), inset=-1, xpd=NA)
maybe, or some other location?

 Can i include a legend like this in a standard plot like
 plot(1:10) too?
 http://www.r-bloggers.com/wp-content/uploads/2011/03/heatmap.png

Yes.

What part of that do you want to duplicate? You can specify colors,
symbols, labels, etc. in legend().

Also, please link to the original blog post, not just the figure, so that
the author gets some credit and we can see the code used.

Sarah

 kind regards,

 --
 Jonas Stein n...@jonasstein.de

-- 
Sarah Goslee
http://www.functionaldiversity.org



Re: [R] how to plot a nice legend?

2012-02-11 Thread Jonas Stein
 Wrong? Nothing. You told R to put the legend at c(1,3)
 so it did. If you want it elsewhere you need to specify that.
 legend(-1,3, legend=c("one", "two"), inset=-1, xpd=NA)
 maybe, or some other location?

ok that works fine. Now i understand how to use it.
If i create several plots it would be nice if all legends would 
have the same distance to plots with different scaling.

Can the legend be placed vertically centered, 1cm right of the plot area?
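
A possible sketch for this placement with base graphics follows. It assumes linear axes and a right margin widened to make room; the margin sizes and fill colours are arbitrary choices. As far as ?legend describes it, inset only takes effect when the legend is positioned by a keyword such as "right", which would explain why it had no visible effect in legend(1,3, ...).

```r
# Widen the right margin so the legend has somewhere to go.
op <- par(mar = c(5, 4, 4, 8) + 0.1)
plot(1:10)

usr    <- par("usr")                   # plot-region limits: x1, x2, y1, y2
x_left <- usr[2] + xinch(1 / 2.54)     # 1 cm (given in inches) right of the plot region
y_mid  <- mean(usr[3:4])               # vertical centre of the plot region

lg <- legend(x_left, y_mid, legend = c("one", "two"),
             fill = c("red", "blue"),  # coloured squares
             xjust = 0, yjust = 0.5,   # anchor at the left edge, vertically centred
             xpd = NA)                 # allow drawing outside the plot region
```

Because the position is computed from par("usr") and xinch(), the same lines should give the same physical offset on plots with different axis scaling. Calling par(op) afterwards restores the margins.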

 Can i include a legend like this in a standard plot like
 plot(1:10) too?
 http://www.r-bloggers.com/wp-content/uploads/2011/03/heatmap.png

 Yes.

 What part of that do you want to duplicate? 

The coloured squares.
for the reader who got to this article and had the same question:
I have just found another nice solution for a colour legend a minute ago
http://www.r-bloggers.com/rethinking-loess-for-binomial-response-pitch-fx-strike-zone-maps/

 You can specify colors, symbols, labels, etc. in legend().

can i even invent my own symbols?

 Also, please link to the original blog post, not just the figure, so that
 the author gets some credit and we can see the code used.

sure 
http://www.r-bloggers.com/ggheat-a-ggplot2-style-heatmap-function/

kind regards,

-- 
Jonas Stein n...@jonasstein.de



Re: [R] multiple histograms from a dataframe

2012-02-11 Thread David Winsemius


On Feb 11, 2012, at 6:25 PM, Adel ESSAFI wrote:




Le 11 février 2012 02:33, David Winsemius dwinsem...@comcast.net a  
écrit :


On Feb 10, 2012, at 7:05 PM, Adel ESSAFI wrote:

Hi list


I  need some help for drawing some histograms

I have a dataframe , say,
X Y Z T

I want to draw a histogram Z-T for each value of the couple (X-Y).
When I use this syntax

library(lattice)
histogram(law[,3] ~ law[,66] | law[,1] )

Perhaps (but untested in the absence of data);

 histogram( Z ~ T | interaction(X, Y)  , data=dfrmname )

Thanks ,
that helped a lot.

now, I have another problem: I want to draw several (two) figures
together.
The par(new=T) directive does not recognize the plot produced by the
lattice library


Par is for base graphics. xyplot is part of lattice and grid graphics.


when I tried:
 xyplot(law[,66] ~ law[,3]| interaction(law[,1],law[,2]),type='l')
 par(new=T)
Warning message:
In par(new = T) : calling par(new=TRUE) with no plot
 xyplot(law[,67] ~ law[,3]| interaction(law[,1],law[,2]),type='l')

the second xyplot() draws a new figure.

What can I do to draw two figures together using lattice?


You need to describe what you mean by together. It is possible that
the groups parameter is what you want but that's just a guess. It's also
possible that the formula operator + will give you what you desire.
Perhaps:


xyplot( law[,67] + law[,66] ~ law[,3] | interaction(law[,1],law[,2]), type='l')
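
A sketch of both suggestions on made-up data, assuming "together" means two curves overlaid in each panel. The names d, long, y1, y2, g and series are hypothetical.

```r
library(lattice)

# Hypothetical wide data: two responses y1, y2 over x, in two panels g.
d <- data.frame(x = rep(1:10, 2), g = rep(c("p1", "p2"), each = 10))
d$y1 <- d$x
d$y2 <- 10 - d$x

# Extended formula: y1 + y2 on the left overlays both series per panel.
p_plus <- xyplot(y1 + y2 ~ x | g, data = d, type = "l", auto.key = TRUE)

# Same picture from long data, using the groups argument.
long <- data.frame(
  x      = rep(d$x, 2),
  g      = rep(d$g, 2),
  y      = c(d$y1, d$y2),
  series = rep(c("y1", "y2"), each = nrow(d))
)
p_grp <- xyplot(y ~ x | g, data = long, groups = series,
                type = "l", auto.key = TRUE)
```

Passing data= and bare column names, rather than law[,67] and friends, also gives readable axis and key labels.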




It draws multiple histograms, but by selecting distinct values of
law[,1].

The goal is to do the same thing, but for a pair of columns.


That doesn't make any sense to me. But then I do apologize for the  
English language. It's horribly complex and syntactically a mess.





Thanks in advance for help

Adel


--

David Winsemius, MD
West Hartford, CT

--
PhD candidate in Computer Science
Address
3 avenue lamine, cité ezzahra, Sousse 4000
Tunisia
tel: +216 97 246 706 (+33640302046 until 15/6)
fax: +216 71 391 166


David Winsemius, MD
West Hartford, CT



Re: [R] Counting occurences of variables in a dataframe

2012-02-11 Thread Petr Savicky
On Sat, Feb 11, 2012 at 04:05:25PM -0500, David Winsemius wrote:
 
 On Feb 11, 2012, at 1:17 PM, Kai Mx wrote:
 
 Hi everybody,
 I have a large dataframe similar to this one:
 knames <- c('ab', 'aa', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad', 'ae', 'af')
 kdate <- as.Date(c('20111001', '20111102', '20101001', '20100315',
 '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'),
 format="%Y%m%d")
 kdata <- data.frame(knames, kdate)
 
   ave(unclass(kdate), knames, FUN=order )
  [1] 2 2 1 1 1 2 1 2 1 1
 
 
 That was actually not using the dataframe values but you could also do  
 this:
 
  kdata$ord <- with(kdata, ave(unclass(kdate), knames, FUN=order))
  kdata
    knames      kdate ord
 1      ab 2011-10-01   2
 2      aa 2011-11-02   2
 3      ac 2010-10-01   1
 4      ad 2010-03-15   1
 5      ab 2010-12-01   1
 6      ac 2011-01-05   2
 7      aa 2010-10-01   1
 8      ad 2011-05-04   2
 9      ae 2011-06-03   1
 10     af 2011-02-01   1

Hi.

This is a good solution if each name occurs at most twice, because
order and rank give the same result for one or two values. With more
occurrences, the function order should be replaced by rank. Replacing
the name aa at row 2 by ab, we get

  knames <- c('ab', 'ab', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad', 'ae', 'af')
  kdate <- as.Date(c('20111001', '20111102', '20101001', '20100315',
  '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'),
  format="%Y%m%d")
  kdata <- data.frame(knames, kdate)

  kdata$ord <- with(kdata, ave(unclass(kdate), knames, FUN=order))
  kdata$rank <- with(kdata, ave(unclass(kdate), knames, FUN=rank))
  kdata

     knames      kdate ord rank
  1      ab 2011-10-01   3    2
  2      ab 2011-11-02   1    3
  3      ac 2010-10-01   1    1
  4      ad 2010-03-15   1    1
  5      ab 2010-12-01   2    1
  6      ac 2011-01-05   2    2
  7      aa 2010-10-01   1    1
  8      ad 2011-05-04   2    2
  9      ae 2011-06-03   1    1
  10     af 2011-02-01   1    1

The names ab occur in the order row 5, row 1, row 2, so
row 1 should get index 2, row 2 index 3.
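
To see the difference in isolation, here is a small sketch using only the three ab dates from the example (the variable names are mine): order() returns the positions that would sort the vector, while rank() returns each element's place in the sorted sequence, which is the occurrence index wanted here.

```r
# The three dates for name "ab", in row order 1, 2, 5.
ab_dates <- as.Date(c("2011-10-01", "2011-11-02", "2010-12-01"))

# order(): indices that sort the vector (third element comes first).
ab_order <- order(ab_dates)

# rank(): position of each element within the sorted vector.
ab_rank <- rank(unclass(ab_dates))

ab_order  # 3 1 2
ab_rank   # 2 3 1
```

The two are inverse permutations of each other, which is why they differ as soon as a group has three or more distinct values.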

If some of the dates repeat, then rank() by default computes
the average index. In this case, the following function f()
may be used

  knames <- c('ab', 'ab', 'ac', 'ad', 'ab', 'ac', 'aa', 'ad', 'ae', 'af')
  kdate <- as.Date(c('20111001', '20111001', '20101001', '20100315',
  '20101201', '20110105', '20101001', '20110504', '20110603', '20110201'),
  format="%Y%m%d")
  kdata <- data.frame(knames, kdate)

  kdata$rank <- with(kdata, ave(unclass(kdate), knames, FUN=rank))
  f <- function(x) rank(x, ties.method="first")
  kdata$f <- with(kdata, ave(unclass(kdate), knames, FUN=f))
  kdata

     knames      kdate rank f
  1      ab 2011-10-01  2.5 2
  2      ab 2011-10-01  2.5 3
  3      ac 2010-10-01  1.0 1
  4      ad 2010-03-15  1.0 1
  5      ab 2010-12-01  1.0 1
  6      ac 2011-01-05  2.0 2
  7      aa 2010-10-01  1.0 1
  8      ad 2011-05-04  2.0 2
  9      ae 2011-06-03  1.0 1
  10     af 2011-02-01  1.0 1
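
A stripped-down sketch of how ties.method changes the result, on a made-up vector with one tie. Which method is appropriate depends on whether tied dates should share an index; "first" is the one that always yields distinct counters.

```r
# Made-up values with a tie between the first two elements.
x <- c(5, 5, 1)

r_avg   <- rank(x)                         # default "average": 2.5 2.5 1.0
r_first <- rank(x, ties.method = "first")  # ties broken by position: 2 3 1
r_min   <- rank(x, ties.method = "min")    # ties share the lowest rank: 2 2 1
```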

Hope this helps.

Petr Savicky.
