Re: [R] Forward stepwise regression using lmStepAIC in Caret

2012-03-05 Thread Allan Engelhardt
Not sure I really like stepwise variable selection, but the function 
should work, of course.  Compare


data(BostonHousing, package = mlbench)
lmFit - train(medv ~ ., data = BostonHousing, lmStepAIC, scope = 
list(lower = ~., upper = ~.^2), direction = forward)


with

lmFit - train(medv ~ ., data = BostonHousing, lmStepAIC, scope = 
list(lower = ~., upper = ~.^2), direction = backward)


and also the version without the direction argument where the code 
tries to do the right thing.


In summary, my understanding is that your call first fits a y ~ . type 
model from which you cannot step backwards with constraints and then it 
tries to do 'what you might have meant'.


Hope this helps a little.

Allan

On 05/03/12 01:14, Dan Putka wrote:

I'm looking for guidance on how to implement forward stepwise regression
using lmStepAIC in Caret.

The stepwise direction appears to default to backward. When I try to
use scope to provide a lower and upper model,  Caret still seems to
default to backward.

Any thoughts on how I can make this work?

Here is what I tried:

itemonly- susbstitute(~i1+i2+i3+i4+i5+i6+i7+i8+i9+i10)  #this is my full
model
#I want my lower model to consist of the intercept only

stepLmFit.i- train(xtraindata.i, ytraindata,lmStepAIC,
scope=list(upper=itemonly,lower=~1),direction=forward)

Any guidance on how I can make this work would be greatly appreciated.

Dan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to terminate R program if found any execution error

2011-09-27 Thread Allan Engelhardt
The original poster may also be interested in options(error) to 
capture the 'any execution error' requirement.  From the examples in 
help(options):


## Not run: ## on error, terminate the R session with error status 66
options(error = quote(q(no, status=66, runLast=FALSE)))
stop(test it)


Allan

On 27/09/2011 06:22, Duncan Murdoch wrote:

On 11-09-27 12:20 AM, arunkumar wrote:

Hi

   I want to terminate R process if there are any execution error.

a=a
b=10
c=try(a/b)

if(class(c)[1]==try-error)
{
stop(Wrong Input Value)
}

d=c*c



if c fails then it should terminate the process.
Please can anyone help


Keep the try(a/b), but replace the conditional with

if (inherits(c, try-error)) q(no)

See ?q if you want to set an error status to be returned, or do want 
to save the workspace, etc.


(And do use inherits() rather than comparing to a particular entry: 
your code will probably work in this example, but it's not the right 
way to test the class of something, and some day try-error might not 
be the first entry.)


Duncan Murdoch

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Does replacing some values of a zoo object by NA reduce it's size ?

2011-09-27 Thread Allan Engelhardt
It is not an anomaly.  The object is the same size 
(object.size(f1)==object.size(f2)); its file representation is different.


On 27/09/2011 07:00, Ashim Kapoor wrote:

Dear R-helpers,

Please have a look at the following. f1 is the same as f2 except that it has
some values replaced by NA. But it's corresponding file is slightly bigger
than the file containing f2. Could someone please tell me if this is an
anomaly ?


load(file1)
ls()

[1] f1

load(file2)
ls()

[1] f1 f2

dput(f1)

structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, NA, NA, NA, 2, 2,
2, NA, NA, NA, NA), .Dim = c(10L, 2L), .Dimnames = list(NULL,
 c(v1, l1)), index = 1:10, class = zoo)

dput(f2)

structure(c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2), .Dim = c(10L, 2L), .Dimnames = list(NULL, c(v1,
l1)), index = 1:10, class = zoo)

-rw-r--r-- 1 ashimkapoor ashimkapoor 192 2011-09-27 11:08 file1
-rw-r--r-- 1 ashimkapoor ashimkapoor 179 2011-09-27 11:08 file2

Best Regards,
Ashim.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Disabling Auto-complete

2011-09-26 Thread Allan Engelhardt (CYBAEA)

The $ operator does partial matching by default so for example

 print(b$u)
[1] 1

The [[ operator does not

 print(b[[unit]])
NULL
 print(b[[unit1]])
[1] 1

See help($) for more details and also the warnPartialMatch* arguments 
in help(options).


Try

testfunction - function(x) if (exists(unit, x)  x[[unit]]==1) 
cat(All is well\n)


Hope this helps

Allan


On 26/09/11 10:23, Vikram Bahure wrote:

Hi,

I am a new user to R.

I am having the following problem while using R:
The defined function is having following a$unit as input but if I define
a$unit1 then still I am getting the output which is not desired.
__
*Function:*
testfunction-function(a){
stopifnot(a$unit==1)
cat(All is well)
}
b-list(unit1=1 )
testfunction(b)
-
*Output:*

testfunction-function(a){

stopifnot(a$unit==1)
cat(All is well)
}
b-list(unit1=1 )
testfunction(b)
testfunction-function(a){
+ stopifnot(a$unit==1)
+ cat(All is well)
+ }

b-list(unit1=1 )
testfunction(b)

All is well
__

If it is a auto complete problem, it would be nice if you could let me know
how to disable it, else you could throw some light on the what the problem
is exactly.

Regards
Vikram

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Ignoring loadNamespace errors when loading a file

2011-08-22 Thread Allan Engelhardt
On a Unix machine I ran caret::rfe using the multicore package, and I 
saved the resulting object using save(lm2, file = lm2.RData).  
[Reproducible example below.]


When I try to load(lm2.RData) on my Windows laptop, I get

Error in loadNamespace(name) : there is no package called 'multicore'

I completely understand the error and I would like to ignore it and 
still load the saved object data.


I imagine that I can make myself an empty multicore package for Windows 
and load the data file successfully, but is there another way?


(I am not going to use any multicore functionality; I just want to 
inspect some of the data stored in the object and the reference in 
question is just an unfortunately stored link from the original call.)


(I did search for this question in the archives, but could only find it 
discussed in connection with starting R with a .RData file where the 
consensus seems to be to start R in vanilla mode and install the missing 
package.  This situation is different and installing a Unix-only package 
on Windows is obviously a non-starter, except as I proposed above.)


Obligatory reproducible example: On the Unix machine do

library(multicore)
a - list(data = 1:10, fun = mclapply)
save(a, file = a.RData)

and then try to load the a.RData file on Windows.  The question is if 
I can recover the data (1:10) on that platform.


Allan


 sessionInfo()

R version 2.13.1 (2011-07-08)
Platform: x86_64-pc-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=English_United Kingdom.1252
[2] LC_CTYPE=English_United Kingdom.1252
[3] LC_MONETARY=English_United Kingdom.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.1252

attached base packages:
[1] compiler  stats graphics  grDevices utils datasets  methods
[8] base

other attached packages:
[1] caret_4.98  cluster_1.14.0  reshape_0.8.4   plyr_1.6
[5] lattice_0.19-31 boot_1.3-2  ctv_0.7-3

loaded via a namespace (and not attached):
[1] grid_2.13.1  tools_2.13.1

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Ignoring loadNamespace errors when loading a file

2011-08-22 Thread Allan Engelhardt



On 22/08/11 12:26, Martin Morgan wrote:

On 08/21/2011 11:52 PM, Allan Engelhardt wrote:

[...]
Obligatory reproducible example: On the Unix machine do

library(multicore)
a - list(data = 1:10, fun = mclapply)
save(a, file = a.RData)

and then try to load the a.RData file on Windows. The question is if I
can recover the data (1:10) on that platform.


Is this a more realistic reproducible example (from ?rfe, modified to 
use computeFunction=mclapply)?


  data(BloodBrain)

  x - scale(bbbDescr[,-nearZeroVar(bbbDescr)])
  x - x[, -findCorrelation(cor(x), .8)]
  x - as.data.frame(x)

  set.seed(1)
  lmProfile - rfe(x, logBBB,
   sizes = c(2:25, 30, 35, 40, 45, 50, 55, 60, 65),
   rfeControl = rfeControl(functions = lmFuncs,
 number = 5,
 computeFunction=mclapply))

Maybe provide a computeFunction that only indirectly references mclapply

  computeFunction=function(...) {
  if (require(multicore)) mclapply(...)
  else lapply(...)
  }

[...]


Yes, that workaround works for my usage.  Thanks!

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reflecting a PCA biplot

2011-08-09 Thread Allan Engelhardt (CYBAEA)

Something like

opar- par(mfcol = c(1, 2))
z- prcomp(USArrests, scale = TRUE)
biplot(z, cex = 0.5)
z$x[,1]- -z$x[,1]
z$rotation[,1]- -z$rotation[,1]
biplot(z, cex = 0.5, xlab = -PC1)
par(opar)


perhaps?

Allan


On 09/08/11 13:57, Andrew Halford wrote:

Hi Listers,

I am trying to reflect a PCA biplot in the x-axis (i.e. PC1) but am not
having much success. In theory I believe all I need to do is multiply the
site and species scores for the PC1 by -1, which would effectively flip the
biplot.

I am creating a blank plot using the plot command and accessing the results
from a call to rda. I then use the calls to scores to obtain separate site
and species coordinates and I have worked out how to multiply the
appropriate PC1 scores by -1 to create the site and species scores I want.
However I am not sure how to change the call to plot which accesses the
results of the call to rda to draw the blank plot. The coordinates it is
accessing are for the unreflected ordination and this does not match the new
site and species scores that I have.



fish.pca- rda(fish.hel)
fish.site- scores(fish.pca,display=sites,scaling=3)
fish.spp- scores(fish.pca,display=species,scaling=3)
fish.site[,PC1]- -1*(fish.site[,PC1])
fish.spp[,PC1]- -1*(fish.spp[,PC1])
graph- plot(fish.pca,display=c(sites,species),type=n,scaling=3) #

how do I get the plot to draw up the blank display based on the reversed
site and species scores?

Any help appreciated.

cheers

Andy


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cdplot error

2011-08-03 Thread Allan Engelhardt (CYBAEA)

On 03/08/11 05:52, wildernessness wrote:

Fairly new at this.
Trying to create a conditional density plot.


cdplot(status~harvd.l,data=phy)

Error in cdplot.formula(status~harvd.l,data=phy):
dependent variable should be a factor

What does this error mean?  Status is a binary response of infestation (0/1)


Probably status is a numerical variable rather than a factor**.  Try 
print(is.factor(phy$status)) and if that is FALSE then


phy$status - factor(phy$status, labels=c(N, Y))
cdplot(status~harvd.l,data=phy)

Hope this helps a little.

Allan


and harvd.l is the log of timber harvest density per catchment.

Thanks.

--
View this message in context: 
http://r.789695.n4.nabble.com/cdplot-error-tp3714454p3714454.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Deleting the last value of a vector

2011-04-18 Thread Allan Engelhardt
This is unlikely to be the kind of operation where speed is essential, 
but nevertheless on my build of 2.14.0 (with byte compiled base packages):

stopifnot(getRversion()= 2.14)
library(compiler)

f1- function (x, n) head(x, length(x) - n)# suggested by baptiste auguie
f2-  function (x, n) x[ seq(1, length(x) - n) ]# suggested by baptiste 
auguie
f3- function (x, n) x[ - seq(length(x), by=-1, length=n) ]   # suggested by 
baptiste auguie
f4- function (x, n) head(x, -n)# Suggested by Petr PIKAL
f5- function (x, n) x[ seq_len(length(x) - n) ]# My variation on f2
c1- cmpfun(f1,options = list(optimize = 3))# Undocumented (?) default is 2
c2- cmpfun(f2,options = list(optimize = 3))
c3- cmpfun(f3,options = list(optimize = 3))
c4- cmpfun(f4,options = list(optimize = 3))
c5- cmpfun(f5,options = list(optimize = 3))
library(rbenchmark)
set.seed(42)
x- runif(1e6)

n- 1
benchmark(f1(x, n), f2(x, n), f3(x, n), f4(x, n), f5(x, n),
   c1(x, n), c2(x, n), c3(x, n), c4(x, n), c5(x, n),
   columns = c(test, elapsed, relative),
   order = relative, replications = 1e3)
#test elapsed relative
# 7  c2(x, n)  17.483 1.00
# 10 c5(x, n)  17.600 1.006692
# 1  f1(x, n)  22.782 1.303094
# 2  f2(x, n)  23.054 1.318652
# 4  f4(x, n)  23.498 1.344049
# 9  c4(x, n)  23.547 1.346851
# 6  c1(x, n)  23.752 1.358577
# 5  f5(x, n)  23.892 1.366585
# 3  f3(x, n)  27.209 1.556312
# 8  c3(x, n)  27.296 1.561288
## So c{2,5} are fastest, .3 slowest, and the rest much the same

n- 1e5
benchmark(f1(x, n), f2(x, n), f3(x, n), f4(x, n), f5(x, n),
   c1(x, n), c2(x, n), c3(x, n), c4(x, n), c5(x, n),
   columns = c(test, elapsed, relative),
   order = relative, replications = 1e3)
#test elapsed relative
# 10 c5(x, n)  14.729 1.00
# 7  c2(x, n)  14.939 1.014258
# 5  f5(x, n)  18.544 1.259013
# 2  f2(x, n)  18.557 1.259895
# 6  c1(x, n)  18.695 1.269265
# 4  f4(x, n)  18.718 1.270826
# 9  c4(x, n)  18.729 1.271573
# 1  f1(x, n)  18.729 1.271573
# 3  f3(x, n)  20.836 1.414624
# 8  c3(x, n)  20.958 1.422907
## As before

n- 1
enableJIT(2)# Will also optimise rbenchmark::benchmark.
benchmark(f1(x, n), f2(x, n), f3(x, n), f4(x, n), f5(x, n),
   c1(x, n), c2(x, n), c3(x, n), c4(x, n), c5(x, n),
   columns = c(test, elapsed, relative),
   order = relative, replications = 1e3)
#test elapsed relative
# 7  c2(x, n)  15.035 1.00
# 10 c5(x, n)  15.076 1.002727
# 2  f2(x, n)  15.258 1.014832
# 9  c4(x, n)  15.313 1.018490
# 5  f5(x, n)  15.327 1.019421
# 6  c1(x, n)  15.402 1.024410
# 4  f4(x, n)  15.430 1.026272
# 1  f1(x, n)  15.594 1.037180
# 3  f3(x, n)  18.238 1.213036
# 8  c3(x, n)  18.245 1.213502
## No difference between the c. and f. functions here


So as long as you don't use the .3 functions you should be OK.

Allan
http://www.cybaea.net/Blogs/Data/

On 18/04/11 07:07, Petr PIKAL wrote:
 Hi

 r-help-boun...@r-project.org napsal dne 18.04.2011 04:51:20:

 Or perhaps even more parsimoniously (by a couple of characters) -

 r- c(1, 2, 3, 4, 5)
 r2-r[-length(r)]
 Maybe even shorter

 head(x,-1)

 Regards
 Petr


 Min-Han

 On Sun, Apr 17, 2011 at 10:23 PM, Daisy Englert Duursma
 daisy.duur...@gmail.com  wrote:

 A easier solution:

 r- c(1, 2, 3, 4, 5)
 r2-r[1:length(r)-1]




 On Mon, Apr 18, 2011 at 10:51 AM, empyreanctr...@ucdavis.edu  wrote:
 Hey guys,

 I've search a few threads about deleting a value from a vector, but
 no
 one
 has addressed this question so far.

 I want to delete the last value from a string of values

 I have:

 r = [ 1, 2, 3, 4, 5 ], and i want to make r2 = to [ 1, 2, 3, 4]

 So that r2 is just like r, except that it missing the final value.

 Thanks,

 --
 View this message in context:
 http://r.789695.n4.nabble.com/Deleting-the-last-value-of-a-vector-
 tp3456363p3456363.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 Daisy Englert Duursma
 Department of Biological Sciences
 Room E8C156
 Macquarie University, North Ryde, NSW 210
 Australia

 Tel +61 2 9850 9256

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 __
 R-help@r-project.org mailing 

Re: [R] From nested loop to mclapply

2011-04-18 Thread Allan Engelhardt
Try help(expand.grid, package=base) for one way to create the 
combinations of (i,j) outside the loop, or perhaps vignette(nested, 
package=foreach) which does it automatically (rather: naturally).


Allan

On 18/04/11 16:53, Alaios wrote:

Dear all,
I am trying to find a decent way to speed up my code.

So far I have used mclapply with really good results (parallel version of 
lapply). I have a nested loop that I would like to help me convert it to lapply

for (i in seq(from=-1,to=1-2/ncol(sr),length=ncol(sr))){
  for (j in seq(from=-1,to=1-2/nrow(sr),length=nrow(sr))){
estimatedsr[findCoord(c(i,j),sr)[1],findCoord(c(i,j),sr)[2]  
]-fxy(c(i,j))
  }

So far I have converted some one-depth for loops to lapply but I am not sure If 
I can use lapply to convert a nested loop to something simpler.

Best Regards
Alex

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to remove the double or single quote from a string (unquote?)?

2011-04-18 Thread Allan Engelhardt
This should be a FAQ: you are confusing the value with its printed 
representation.  Try


print(paste(V, 3:8, sep=), quote=FALSE)

to see that there are no quotes, and you may want to read up on 
help(as.formula, package=stats) which has the examples you are 
searching for.


Allan

On 18/04/11 17:38, JingJiang Yan wrote:

Is there a function to get a string without a pair of quotes around it?

I have several expressions like:
glm(V12 ~ V3, family=binomial, data=df1)
glm(V12 ~ V4, family=binomial, data=df1)
...
glm(V12 ~ V8, family=binomial, data=df1)

As you can see, the only differences among them are V3 ... V8.
Because sometimes several of these expressions are performed many times,
I want to use a variable i to change the V3 ... V8. I did this with:

 i - 3:8
 glm(V12 ~ paste(V, i, sep=), family=binomial, data=df1)

However, it seems the paste always returns a variable name with a pair 
of quotes, which were wrong in such condition.
I only find a function sQuote to add quotes to a string, and it 
looks I am looking for an opposite function of it.

Any advice will be appreciated.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R licence

2011-04-07 Thread Allan Engelhardt
The licences are available on the web site and you really should have yor 
lawyers look at them and give you professional advise.

The GPL2+ is probably the relevant one for your purposes and essentially 
require you to provide the source for the parts of R that you distribute.

However, the R licencing is a mess with core packages not on a GPL licence. We 
looked into it and it is a nightmare and I don't even think the UK CRAN mirrors 
are strictly speaking legal. So look carefully at what you use and get advice 
from somewhere other than a mailing list.

happy to discuss the UK side of things offline if you want, but I never looked 
at other jurisdictions.

Allan

- Original message -
 Hi,
 
 is it possible to use some statistic computing by R in proprietary
 software? Our software is written in c#, and we intend to use
 http://rdotnet.codeplex.com/
 to get R work there. Especially we want to use loess function.
 
 Thanks,
 
 Best regards,
 Stanislav
 
     [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html and provide commented,
 minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Popularity of R, SAS, SPSS, Stata, Statistica, S-PLUS updated

2011-03-25 Thread Allan Engelhardt
Not R, but just to get the data (format is month year,week,count) to 
compare with your students' output:


perl -MLWP::UserAgent -e 'my $ua = LWP::UserAgent-new(); my $l = 
$ua-request(HTTP::Request-new(GET = 
qq{http://www.listserv.uga.edu/archives/sas-l.html}))-content(); while 
( $l =~ m{href=\(/cgi-bin/wa\?A1=.*?)\(.*?), \s week \s 
(\d)/a}igxms ) { my ($m,$w) = ($2,$3); my $u = 
qq{http://www.listserv.uga.edu$1}; my $i = 
$ua-request(HTTP::Request-new(GET = qq{$u}))-content(); my $c = () = 
$i =~ m{href=/cgi-bin/wa\?A2}igxms; print qq{$m,$w,$c\n}; sleep(1); }'

March 2011,4,122
March 2011,3,255
March 2011,2,312
March 2011,1,318
February 2011,4,243
February 2011,3,230
February 2011,2,289
February 2011,1,354
January 2011,5,93
January 2011,4,345
January 2011,3,329
January 2011,2,385
January 2011,1,297
December 2010,5,110
December 2010,4,205
December 2010,3,290
December 2010,2,359
December 2010,1,311
November 2010,5,91
November 2010,4,246
November 2010,3,227
November 2010,2,262
November 2010,1,228
October 2010,5,51
October 2010,4,242
October 2010,3,212
October 2010,2,250
October 2010,1,384
September 2010,5,101
September 2010,4,345
September 2010,3,214
September 2010,2,287
September 2010,1,133
August 2010,5,90
August 2010,4,314
August 2010,3,242
August 2010,2,202
August 2010,1,234
July 2010,5,95
July 2010,4,231
July 2010,3,354
July 2010,2,306
July 2010,1,176
June 2010,5,98
June 2010,4,244
June 2010,3,147
June 2010,2,229
June 2010,1,216
May 2010,5,25
May 2010,4,237
May 2010,3,328
May 2010,2,261
May 2010,1,341
April 2010,5,167
April 2010,4,291
April 2010,3,324
April 2010,2,273
April 2010,1,288
March 2010,5,217
March 2010,4,351
March 2010,3,437
March 2010,2,524
March 2010,1,456
February 2010,4,473
February 2010,3,348
February 2010,2,379
February 2010,1,347
January 2010,5,108
January 2010,4,482
January 2010,3,387
January 2010,2,424
January 2010,1,398
December 2009,5,88
December 2009,4,252
December 2009,3,443
December 2009,2,373
December 2009,1,514
November 2009,5,158
November 2009,4,318
November 2009,3,461
November 2009,2,383
November 2009,1,494
October 2009,5,186
October 2009,4,515
October 2009,3,532
October 2009,2,547
October 2009,1,410
September 2009,5,209
September 2009,4,457
September 2009,3,435
September 2009,2,371
September 2009,1,355
August 2009,5,87
August 2009,4,528
August 2009,3,436
August 2009,2,526
August 2009,1,412
July 2009,5,270
July 2009,4,423
July 2009,3,449
July 2009,2,380
July 2009,1,346
June 2009,5,156
June 2009,4,390
June 2009,3,510
June 2009,2,473
June 2009,1,450
May 2009,5,107
May 2009,4,476
May 2009,3,487
May 2009,2,583
May 2009,1,494
April 2009,5,219
April 2009,4,592
April 2009,3,574
April 2009,2,516
April 2009,1,547
March 2009,5,284
March 2009,4,571
March 2009,3,553
March 2009,2,584
March 2009,1,691
February 2009,4,646
February 2009,3,508
February 2009,2,688
February 2009,1,489
[...]

Hope this helps a little.

Allan
(Who thinks it is very sad that he can remember that $c=()=$a=~$b 
construct...)



On 22/03/11 23:26, Bob Muenchen wrote:

 On 3/22/2011 5:15 PM, Hadley Wickham wrote:
I don't doubt that R may be the most popular in terms of 
discussion group

traffic, but you should be aware that the traffic for SAS comprises two
separate lists that used to be mirrored, but are no longer linked
Usenet --  news://comp.soft-sys.sas  (what you counted)
listserve -- SAS-L http://www.listserv.uga.edu/archives/sas-l.html

R programming challenge: create a script that parses those html pages
to compute the total number of messages per week!  (Maybe I'll use
this in class)

Hadley



That would be nice! I'd love to have all the sources, which includes 
various company forums. Sounds like students could be kept busy for 
quite a few projects. If any pull it off, please send it my way!


Cheers,
Bob



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Need help with error

2011-03-19 Thread Allan Engelhardt

As it says at the bottom of every post:


PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Without an example that fails, it is hard to help.

Allan

On 18/03/11 16:26, Savitri N Appana wrote:

Hi R users,

I am getting the following error when using the splsda function in R
v2.12.1:

Error in switch(classifier, logistic = { : EXPR must be a length 1
vector

What does this mean and how do I fix this?

Thank you in advance!

Best,
Savi

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How do I delete multiple blank variables from a data frame?

2011-03-19 Thread Allan Engelhardt



On 19/03/11 01:35, Joshua Wiley wrote:

Hi Rita,

This is far from the most efficient or elegant way, but:

## two column data frame, one all NAs
d- data.frame(1:10, NA)
## use apply to create logical vector and subset d
d[, apply(d, 2, function(x) !all(is.na(x)))]


This works, but apply converts d to a matrix which is not needed, so try

d[, sapply(d, function(x) !all(is.na(x)))]


if performance is an issue (apply is about 3x slower on your test data 
frame d, more for larger data frames).


For the related problem of removing columns of constant-or-na values, 
the best I could come up with is


zv.1 - function(x) {
## The literal approach
y - var(x, na.rm = TRUE)
return(is.na(y) || y == 0)
}
sapply(train, zv.1)

See 
http://www.cybaea.net/Blogs/Data/R-Eliminating-observed-values-with-zero-variance.html 
for the benchmarks.


Allan



I am just apply()ing to each column (the 2) of d, the function
!all(is.na(x)) which will return FALSE if all of x is missing and TRUE
otherwise.  The result is a logical vector the same length as the
number of columns in d that is used to subset only the d columns with
at least some non-missing values.  For documentation see:

?apply
?is.na
?all
?[
?Logic

HTH,

Josh

On Fri, Mar 18, 2011 at 3:35 PM, Rita Carreiraritacarre...@hotmail.com  wrote:

Dear List Members,I have 55 data frames, each of which with 272 variables and 
267 observations. Some of these variables are blanks but the blanks are not the 
same for every data frame. I would like to write a procedure in which I import 
a data frame, see which variables are blank, and delete those variables. My 
data frames have variables named P1 to P136 and Q1 to Q136.
I have a couple of questions regarding this issue:
1) Is a loop an efficient way to address this problem? If not, what are my 
alternatives and how do I implement them?2) I have been playing with a single data 
frame to try to figure out a way of having R go through the columns and see which 
ones it should delete. I have figured out how to delete rows with missing data 
(newdata- na.omit(olddata)) but how do I do it for columns???
Thank you very much for your help and have a great weekend!
Rita  If you think education is expensive, 
try ignorance--Derek Bok



[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cross-validation in rpart

2011-03-19 Thread Allan Engelhardt
I assume you mean rpart::xpred.rpart ?  The beauty of R means that you 
can look at the source.  For the simple case (where xval is a single 
number) the code does indeed do simple random sampling


xgroups- sample(rep(1:xval, length = nobs), nobs, replace = FALSE)


If you want another sampling, then you simply pass a vector as the xval 
parameter, as the documentation says: “This may also be an explicit list 
of integers that define the cross-validation groups”.


Hope this helps a little.

Allan

On 19/03/11 09:21, Penny B wrote:

I am trying to find out what type of sampling scheme is used to select the 10
subsets in 10-fold cross-validation process used in rpart to choose the best
tree. Is it simple random sampling? Is there any documentation available on
this?

Thanks, Penny.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting graph problem!!

2011-03-19 Thread Allan Engelhardt

You should probably tell us which part of

a- 
read.csv(http://r.789695.n4.nabble.com/file/n3389613/Life_Expectancies_2008.csv;)
hist(a)


doesn't do what you expect.

(Though often when people say histogram they want something else - 
what's with that anyhow?)


Allan

On 19/03/11 13:10, andrew456 wrote:

http://r.789695.n4.nabble.com/file/n3389613/Life_Expectancies_2008.csv
Life_Expectancies_2008.csv

I am trying to plot a histogram base on the file i uploaded above. I am
facing a trouble in sorting out the frequency of the life expectancies. I
wanted to plot a graph of life expectancies at birth versus frequency,but i
have no idea how to change the locations into frequencies taking the range
of life expectancies at birth from 40 to 90. I tried using
table(Life.Expectancies.at.Birth) to show the frequencies of each number and
I do not know how to continue.

--
View this message in context: 
http://r.789695.n4.nabble.com/Plotting-graph-problem-tp3389613p3389613.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plotting graph problem!!

2011-03-19 Thread Allan Engelhardt

On 19/03/11 15:04, Allan Engelhardt wrote:

You should probably tell us which part of

a- 
read.csv(http://r.789695.n4.nabble.com/file/n3389613/Life_Expectancies_2008.csv;)

hist(a)

Should be hist(a$Life.Expectancies.at.Birth), of course.  Sorry.



doesn't do what you expect.

(Though often when people say histogram they want something else - 
what's with that anyhow?)


Allan

On 19/03/11 13:10, andrew456 wrote:

http://r.789695.n4.nabble.com/file/n3389613/Life_Expectancies_2008.csv
Life_Expectancies_2008.csv

I am trying to plot a histogram base on the file i uploaded above. I am
facing a trouble in sorting out the frequency of the life 
expectancies. I
wanted to plot a graph of life expectancies at birth versus 
frequency,but i
have no idea how to change the locations into frequencies taking the 
range

of life expectancies at birth from 40 to 90. I tried using
table(Life.Expectancies.at.Birth) to show the frequencies of each 
number and

I do not know how to continue.

--
View this message in context: 
http://r.789695.n4.nabble.com/Plotting-graph-problem-tp3389613p3389613.html

Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regex query (Apache logs)

2011-03-17 Thread Allan Engelhardt

Some comments:

1. [^\s] matches everything up to a literal 's', unless perl=TRUE.
2. The (.*) is greedy, so you'll need (.*?)\s(.*?)\s(.*?)$ or 
similar at the end of the expression


With those changes (and removing a space inserted by the newsgroup 
posting) the expression works for me.


 (pat - readLines(/tmp/b.txt)[1])
[1] 
^(\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3})\\s([^\\s]*)\\s([^\\s]*)\\s\\[([^\\]]+)\\]\\s\([A-Z]*)\\s([^\\s]*)\\s([^\\s]*)\\\s([^\\s]+)\\s(\\d+)\\s\(.*?)\\\s\(.*?)\\\s\(.*?)\$

 regexpr(pat, test, perl=TRUE)
[1] 1
attr(,match.length)
[1] 436

3. Consider a different approach, e.g. scan(textConnection(test), 
what=character(0))


Hope this helps

Allan


On 16/03/11 22:18, Saptarshi Guha wrote:

Hello R users,

I have this regex see [1] for apache log lines. I tried using R to parse
some data (only because I wanted to stay in R).
A sample line is [2]

(a) I saved the line in [1] into ~/tmp/a.txt and [2] into /tmp/a.txt

pat- readLines(~/tmp/a.txt)
test- readLines(/tmp/a.txt)
test
grep(pat,test)

returns integer(0)

The same query works in python via re.match() (i.e does return groups)

Using readLines, the regex is escaped for me. Does Python and R use
different regex styles?

Cheers
Saptarshi

[1]
  
^(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s([^\s]*)\s([^\s]*)\s\[([^\]]+)\]\s([A-Z]*)\s([^\s]*)\s([^\s]*)\s([^\s]+)\s(\d+)\s(.*)\s(.*)\s(.*)$

[2]
220.213.119.925 addons.mozilla.org - [10/Jan/2001:01:55:07 -0800] GET
/blocklist/3/%8ce33983c0-fd0e-11dc-12aa-0800200c9a66%7D/4.0b5/Fennec/20110217140304/Android_arm-eabi-gcc3/chrome:%2F%2Fglobal%2Flocale%2Fintl.properties/beta/Linux%
202.6.32.9/default/default/6/6/1/ HTTP/1.1 200 3243 - Mozilla/5.0
(Android; Linux armv7l; rv:2.0b12pre) Gecko/20110217 Firefox/4.0b12pre
Fennec/4.0b5 BLOCKLIST_v3=110.163.217.169.1299218425.9706

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Does R have a const object?

2011-03-16 Thread Allan Engelhardt



On 16/03/11 13:04, Michael Friendly wrote:

On 3/15/2011 2:23 PM, Uwe Ligges wrote:



On 15.03.2011 15:53, xiagao1982 wrote:

Hi, all,

Does R have a const object concept like which is in C++ language? I
want to set some data frames as constant to avoid being modified
unintentionally. Thanks!



Although there is almost never a No in R, the best short answer is: 
No.


This is just the flexibility of R.  I've just discovered a new class 
of geometries based on


 pi - 2.3
?Constants


Yes, but you can still print(base::pi) and rm(pi) to get back to our 
flat world, and you can't


 assign(pi, 4, pos = package:base)
Error in assign(pi, 4, pos = package:base) :
  cannot change value of locked binding for 'pi'

Just a feature of the search path.  Morale: if you want base::pi, write 
base::pi.


Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Does R have a const object?

2011-03-16 Thread Allan Engelhardt



On 16/03/11 15:04, Gabor Grothendieck wrote:

On Wed, Mar 16, 2011 at 10:54 AM, Allan Engelhardtall...@cybaea.com  wrote:

[...]
Yes, but you can still print(base::pi) and rm(pi) to get back to our flat
world, and you can't


assign(pi, 4, pos = package:base)

Error in assign(pi, 4, pos = package:base) :
  cannot change value of locked binding for 'pi'

Just a feature of the search path.  Morale: if you want base::pi, write
base::pi.


Try this:


old.pi- pi
assignInNamespace(pi, 2.3, ns = base)
pi

[1] 2.3

base::pi

[1] 2.3


Ah!  assignInNamespace does an unlockBinding under the covers which 
obviously is Evil(tm).  The answer clearly is to do


assignInNamespace(unlockBinding, function (...) {warning(be safe);}, 
ns=base)


in your .Rprofile!  Let's Keep R Safe, Folks!

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Singularity problem

2011-03-16 Thread Allan Engelhardt

Numeric underflow.  Try qr.solve(a)

Allan

On 16/03/11 15:28, Feng Li wrote:

Dear R,

If I have remembered correctly, a square matrix is singular if and only if
its determinant is zero. I am a bit confused by the following code error.
Can someone give me a hint?


a- matrix(c(1e20,1e2,1e3,1e3),2)
det(a)

[1] 1e+23

solve(a)

Error in solve.default(a) :
   system is computationally singular: reciprocal condition number = 1e-17


Thanks in advance!


Feng



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] table() reading problem

2011-03-16 Thread Allan Engelhardt



On 16/03/11 09:20, fre wrote:

I have the following problem:

I have some string with numbers like k. I want to have a table like the
function table() gives. However I am not able to call the first row, the 1,
2, 3, 5, 6 or 9. I tried to do that by a data.frame, but that doesn't seem


The first row, as you call it, is accessible as the vector dimnames(x)$k 
(or as.numeric(dimnames(x)$k) if you need numbers, not strings).



to work either. The levels keep bothering me.

This is an example of the code:

k-c(1,1,1,1,1,3,5,6,2,1,3,9,2,3,1,1,1)

table(k)

k
1 2 3 5 6 9
9 2 3 1 1 1


Maybe something like

matrix(c(as.integer(dimnames(x)$k), x), nrow=2, byrow=TRUE)

is what you are looking for?

Hope this helps

Allan


x-table(k)

dim(x)

[1] 6

x[1] #But i only want the one

1
9

x-data.frame(x)

x[1,1] #You are not allowed to use this one for example 3*x[1,1] is
impossible

[1] 1
Levels: 1 2 3 5 6 9
I hope anyone has an idea of using the table function without this
inconvenience. I thought about writing a counter myself, but that seems
complicated.
Because I have to examine very large examples later on, I don't want to slow
the calculations down if possible.

Thanks for your help in advance.

Frederique

--
View this message in context: 
http://r.789695.n4.nabble.com/table-reading-problem-tp3381174p3381174.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how do we read netcdf / hdf files in R?

2011-03-16 Thread Allan Engelhardt
Try the ncdf, ncdf4, and hdf5 packages from CRAN.  (I have never used 
the hdf format, but the ncdf4? packages seem to work well.)


Allan

On 16/03/11 07:25, Yogesh Tiwari wrote:

Dear R Users,

How do we read netcdf / hdf format files in R ?

Also, can we convert netcdf to hdf format in R?

Great Thanks,

Best Regards,
Yogesh

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how do we read netcdf / hdf files in R?

2011-03-16 Thread Allan Engelhardt
(For hdf5 you might want to try the development version at 
http://xweb.geos.ed.ac.uk/~hcp/Rhdf5 which improves error handling, cf. 
this thread: 
https://stat.ethz.ch/pipermail/r-devel/2011-March/060111.html .)


Allan

On 16/03/11 08:53, Allan Engelhardt wrote:
Try the ncdf, ncdf4, and hdf5 packages from CRAN.  (I have never used 
the hdf format, but the ncdf4? packages seem to work well.)


Allan

On 16/03/11 07:25, Yogesh Tiwari wrote:

Dear R Users,

How do we read netcdf / hdf format files in R ?

Also, can we convert netcdf to hdf format in R?

Great Thanks,

Best Regards,
Yogesh

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] install.packages barfs on dependencies= argument

2011-03-15 Thread Allan Engelhardt

I'm sure I used to be able to do

my.dependencies- c(Depends, Imports, LinkingTo, Suggests, Enhances)
install.packages(animation, lib = Sys.getenv(R_LIBS_USER), dependecies = 
my.dependencies)

but now I get

Error in download.file(url, destfile, method, mode = wb, ...) :
  unused argument(s) (dependecies = c(Depends, Imports, 
LinkingTo, Suggests, Enhances))

Warning in download.packages(pkgs, destdir = tmpd, available = available,  :
  download of package 'animation' failed

Did anything change?

Allan

 sessionInfo()
R version 2.12.2 (2011-02-25)
Platform: x86_64-redhat-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.utf8   LC_NUMERIC=C
 [3] LC_TIME=en_GB.utf8LC_COLLATE=en_GB.utf8
 [5] LC_MONETARY=C LC_MESSAGES=en_GB.utf8
 [7] LC_PAPER=en_GB.utf8   LC_NAME=C
 [9] LC_ADDRESS=C  LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] ctv_0.7-0

loaded via a namespace (and not attached):
[1] tools_2.12.2

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] install.packages barfs on dependencies= argument

2011-03-15 Thread Allan Engelhardt

Groan!  Thanks.  New baby mean no sleep.  I'll make some more coffee.

Allan

On 15/03/11 08:18, Jim Lemon wrote:

On 03/15/2011 06:57 PM, Allan Engelhardt wrote:

I'm sure I used to be able to do
...
unused argument(s) (dependecies = c(Depends, Imports, LinkingTo,


check the spelling.

Jim


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Serial Date

2011-03-15 Thread Allan Engelhardt



On 14/03/11 16:00, David Winsemius wrote:

On Mar 14, 2011, at 6:36 AM, Allan Engelhardt wrote:

On 14/03/11 02:00, Raoni Rosa Rodrigues wrote:

[...]
I'm working in a project with a software that register date and time 
data in serial time format. This format is used by excel, for 
exemple. In this format, 40597.3911423958 is 2011/2/23 09:23:15.

Not on my system,


Because ... There is a reasonably well understood MS bug in how it 
handles the non-existent 1900-02-29. (It _still_ accepts that as a 
valid date and calls it 60. I just re-demonstrated this amazingly 
persistent bug behavior on the Mac Excel 2011 variant.  )

Amazing.  Just say 'no' to Excel :-)


http://finzi.psych.upenn.edu/R/library/cxxPack/html/serialNumber.html


Unfortunately, cxxPack has not compiled for a while because of a 
RcppDate issue: 
http://cran.r-project.org/web/checks/check_results_cxxPack.html .  But 
the description is still funny.  Thanks.


Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding the name vector

2011-03-15 Thread Allan Engelhardt


On 15/03/11 05:17, Joshua Wiley wrote:

Hi Laura,

?is.vector


Consider methods::is if you might ever use attributes:

x- 1:10
is.vector(x)
# [1] TRUE
comment(x)- Mine!
is.vector(x)
# [1] FALSE
is(x, vector)
# [1] TRUE


Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] switch and factors

2011-03-14 Thread Allan Engelhardt
Maybe I am misunderstanding you, but I think it is documented?  The help 
page says


switch(EXPR, ...)
[...]
If the value of ‘EXPR’ is not a character string it is coerced to
integer. [...]


Since

is.character( factor('x', levels=c('y', 'x')) )
[1] FALSE


you get


 as.integer( factor('x', levels=c('y', 'x')) )

[1] 2


from the coerced to integer part?  Therefore the second statement (y) 
is evaluated regardless of the label (y=).


Allan

On 09/03/11 02:02, baptiste auguie wrote:

Dear list,

Reading the help page for ?switch didn't give me more than a hint at
what's going on here,

x = 5
y = 2

foo- function(a=x){
  switch(a, x = x,
  y = y)
}

foo(factor('x', levels=c('y', 'x')))

# 2

It seems that switch, when given a factor, uses the numeric codes
rather than the string levels as I would have naively expected. Is
this deliberate, should it be mentioned on the help page? I had an
input that had been invisibly converted to a factor by data.frame();
using switch resulted in serious confusion.

Thanks,

baptiste

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Serial Date

2011-03-14 Thread Allan Engelhardt


On 14/03/11 02:00, Raoni Rosa Rodrigues wrote:
 Hello R Help,

 I'm working in a project with a software that register date and time data in 
 serial time format. This format is used by excel, for exemple. In this 
 format, 40597.3911423958 is 2011/2/23 09:23:15.

Not on my system, but this should get you going:

( a- as.POSIXct(2011-02-23 09:23:15.1, tz=GMT) )
# [1] 2011-02-23 09:23:15 GMT
( b- difftime(a, ISOdatetime(1900,1,1,0,0,0), tz=GMT, units=days) )
# Time difference of 40595.39 days
as.double(b)
# [1] 40595.39
options(digits.secs = 6)
ISOdatetime(1900, 1, 1, 0, 0, 0) + 40597.3911423958*60*60*24 # TZ dependent
  [1] 2011-02-25 09:23:14.702997 GMT



Hope this helps a little.

Allan

   First part is number os days since 1900/1/1, and second part is a fraction 
 of a day.

 I need to make this transformation in R, and use it to make some algebrian 
 operations. I found that function 'serialNumber' of package 'cxxPack' 
 transform a as.data class object in this format, but without time fraction. 
 But I can't find nothing to do the inverse way, transform serial date format 
 in a POSIX class, for exemple.

 Since now, thanks for your time and atention.

 All best,

 Raoni
 Associate Researcher
 Fish Passage Center
 UFMG, Brazil.


 Fica proibido o uso da palavra liberdade,

 a qual será suprimida dos dicionários

 e do pântano enganoso das bocas.

 A partir deste instante

 a liberdade será algo vivo e transparente

 como um fogo ou um rio,

 e a sua morada será sempre

 o coração do homem.

 (Thiago de Mello)



   [[alternative HTML version deleted]]



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] creating character vector

2011-03-14 Thread Allan Engelhardt

You could try

scan(what=character(0), sep=,, file=textConnection(first,second,third))


but better to put the strings in a file (say, strings.txt), one per 
line, and read it using


scan(strings.txt, what=character(0), sep=\n)


Even better is to understand what you are really trying to do and we can 
maybe help with that.


Hope this helps a little


Allan
---
http://www.cybaea.net/Blogs/Data/


On 14/03/11 09:29, Maas James Dr (MED) wrote:

Is there a way to convince R to create a character vector without using the 
quotes?

This works

ex1-  c(first,second)

but when I try this it doesn't

ext- as.character(c(first,second))

it complains. I have many variables to put into character vectors so dispensing 
with the quotes would be useful.

Thanks

Jim


===
Dr. Jim Maas
University of East Anglia


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] positions and margins differ between X11 and SVG device

2011-03-08 Thread Allan Engelhardt
Looks the same on my machine, albeit using svg() instead of CairoSVG - 
what version are you using?


One sometimes useful technique is to create the first graphics in a 
vector format (pdf, svg) and then use an editing program like Inkscape 
or Illustrator to do any touch-up before publication, as well as 
resizing and format conversion.


Good luck!

Allan

On 08/03/11 17:02, Frank Schwach wrote:

Hi,

I'm trying to get a plot ready for publication, which involves getting
it to look nice at a rather small size and to fine-tune positioning all
the labels and sizes of the margins.
I realise that I may not be doing this the right way and I welcome any
comments about better approaches to do this. What I have done so far is
open an X11 device with the size I want for the final output and I have
now managed to arrange all the elements as I wanted in the R script that
draws the plot.
Saving the plot to a bitmap format preserves all the positions and sizes
but the resulting quality is too low. Saving it to svg (also tried
CairoSVG) of course gives me excellent quality, but the positions and
margins are all completely different from the X11 window. Now I'm
wondering how others deal with this (surely common) situation?
Figuring out the positions while working blindly (saving to svg file all
the time and opening it again to see the result) surely can't be the way
to adjust the positions. But what else can I do if X11 output and svg
differ so much?

In fact, here is a toy example to demonstrate the differences:


x1=seq(0,2,by=0.01)
y1=2*sin(2*pi*(x1-1/4))

x11(width=1.67, height=2.28)
plot(x1,y1)

# looks completely different in svg (at least on my machine):

CairoSVG(file=test.svg, width=1.67, height=2.28)
plot(x1,y1)
dev.off()


Any help or general advice on formatting plots for publication would be
very welcome!

Cheers,

Frank






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Probabilities outside [0, 1] using Support Vector Machines (SVM) in e1071

2011-03-07 Thread Allan Engelhardt
predict.svm only returns probabilities for model types (Model$type) less 
than 2, which I guess are classification models (?).  In any case, the 
probabilities are returned as an attribute which your result clearly 
lacks.  Trivial example:



 model- svm(Species ~ ., data = iris[-1,], probability=TRUE)
 ( p- predict(model, newdata=iris[1,], probability=TRUE) )

 1
setosa
attr(,probabilities)
 setosa versicolor   virginica
1 0.9796166 0.01147175 0.008911663
Levels: setosa versicolor virginica


Hope this helps a little.

Allan

On 04/03/11 18:54, Adam B. Smith wrote:

Hi All,

I'm attempting to use eps-regression or nu-regression SVM to compute
probabilities but the predict function applied to an svm model object
returns values outside [0, 1]:

Variable Data looks like:
Present X02 X03 X05 X06 X07 X13 X14 X15 X18
1 0 1634 48 2245.469 -1122.0750 3367.544 11105.013 2017.306 40 23227
2 0 1402 40 2611.519 -811.2500 3422.769 10499.425 1800.475 40 13822
3 0 1379 40 2576.150 -842.8500 3419.000 10166.237 2328.756 37 14200
4 0 1869 51 2645.794 -982.2938 3628.088 9610.037 1699.656 43 20762
...

and bgEnv looks similar:
X02 X03 X05 X06 X07 X13 X14 X15 X18
1 1001 39 2521.406 -38.0875 2559.494 48507.312 3925.7563 63 20616
2 1587 39 3148.056 -895.0187 4043.075 5937.669 910.9062 55 15156
3 1610 40 4116.918 172.6812 3944.237 2287.431 196.0312 51 2739
4 1495 43 3678.381 236.9250 3441.456 3298.625 23.9875 86 281
5 1564 43 3010.988 -623.6063 3634.594 3416.350 819.6375 34 3848
...

modelFormula- as.formula(Present ~ X02 + X03 + X05 + X06 + X07 + X13 +
X14 + X15 + X18)

Model- svm( modelFormula,
data=Data,
gamma=0.25,
cost=4,
nu=0.10,
kernel='radial',
scale=TRUE,
type='nu-regression',
na.action=na.omit,
probability=TRUE
)

bgPreds- predict(
Model,
newdata=bgEnv,
type='nu-regression',
probability=TRUE
)

bgPreds looks like:
11 12 13 14 15 16 17 18
0.54675813 0.37587560 0.39526542 0.67043587 -0.03079247 0.16696996
0.04714134 0.06989950
19 20
0.07615735 0.14923408

Notice the negative value.  I can also get values1.  I had thought
argument probability=TRUE would give probabilities.

Any help would be greatly appreciated!

Adam


Adam B. Smith
University of California, Berkeley

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] transaction list transformation to use rpart.

2011-03-07 Thread Allan Engelhardt



On 06/03/11 22:34, John Dennison wrote:

[...]
from data like

Customer-ID | Item-ID
cust1   | 2
cust1   | 3
cust1   | 5
cust2  | 5
cust2  | 3
cust3 | 2
...

#read in data to a sparse binary transaction matrix
txn = read.transactions(file=tranaction_list.txt, rm.duplicates= TRUE,
format=single,sep=|,cols =c(1,2));

#tranaction matrix to matrix
a-as(txn, matrix)

#matrix to data.frame
b-as.data.frame(a)

I end up with a data.frame like:

X   X.1 X.2  X.3 X.4 X.5 ...
cust1  01   101
cust2  00   101
cust3  01   000
...

  However the as.data.frame(a) transforms the matrix into a numeric
data.frame so when I implement the rpart algorithm it automatically returns
a regression classification tree.


I am not sure your approach with rpart is going to give you what you are 
looking for, but on to your R question:



[...] I can't successfully transform the data.frame to a factor. i
tried:

b_factor-as.factor(b)
Error in sort.list(y) :
   'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?


You need to do each column individually, i.e. b_factor$X.1 - 
as.factor(b$X.1) or



 str( as.data.frame(lapply(b, as.factor)) )

'data.frame':4 obs. of  4 variables:
 $ X.2  : Factor w/ 2 levels 0,1: 2 1 2 1
 $ X.3  : Factor w/ 2 levels 0,1: 2 2 1 1
 $ X.5  : Factor w/ 2 levels 0,1: 2 2 1 1
 $ X.Item.ID: Factor w/ 2 levels 0,1: 1 1 1 2


Also have a look at as(txn, data.frame) for a different format that 
may (with some clean up) be easier to use.



 as(txn, data.frame)

 transactionID  items
1 cust1{ 2, 3, 5}
2  cust2  { 3, 5}
3   cust3{ 2}
4 Customer-ID  { Item-ID}


Hope this helps a little.

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] openNLP package error

2011-01-05 Thread Allan Engelhardt

Apologies that I am late on this thread.

On 02/12/10 17:39, Sascha Wolfer wrote:
I seem to have a problem with the openNLP package, I'm actually stuck 
in the very beginning. Here's what I did:

 install.packages(openNLP)
 install.packages(openNLPmodels.de, repos = 
http://datacube.wu.ac.at/;, type = source)


 library(openNLPmodels.de)
 library(openNLP)

So I installed the main package as well as the supplementary german 
model. Now, I try to use the sentDetect function:


 s - c(Das hier ist ein Satz. Und hier ist noch einer - sogar mit 
Gedankenstrich. Ist das nicht toll?)

 sentDetect(s, language = de, model = openNLPmodels.de)

I get the following error message which I can't make any sense of:

Fehler in .jnew(opennlp/maxent/io/SuffixSensitiveGISModelReader, 
.jnew(java.io.File,  :
  java.io.FileNotFoundException: openNLPmodels.de (No such file or 
directory)


The correct syntax seems to be

sentDetect(s, model = system.file(models, de-sent.bin, package = 
openNLPmodels.de))


but unfortunately I get

Error in .jcall(.jnew(opennlp/maxent/io/SuffixSensitiveGISModelReader,  :
  java.io.UTFDataFormatException: malformed input around byte 48


YMMV.  But you get the idea on the syntax of the model= argument.  This 
works:


sentDetect(s, model = system.file(models, sentdetect, EnglishSD.bin.gz, package = 
openNLPmodels.en))
# [1] Das hier ist ein Satz. 
# [2] Und hier ist noch einer - sogar mit Gedankenstrich. 

# [3] Ist das nicht toll?


Hope this helps you a little.

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Questions about Probit Analysis

2010-11-17 Thread Allan Engelhardt
Sorry I am a bit late to this discussion, but I can't see if you ever 
got an answer.  Anyhow, on your first question:


On 31/10/10 19:14, Lorenzo Isella wrote:

Dear All,
I have some questions about probit regressions.
I saw a nice introduction at

http://bit.ly/bU9xL5

and I mainly have two questions.

(1) The first is almost about data manipulation. Consider the 
following snippet


[...]
mydata - read.csv(url(http://www.ats.ucla.edu/stat/r/dae/binary.csv;))
names(mydata) - c(outcome,x1,x2,x3)

myprobit - 
glm(mydata$outcome~mydata$x1+mydata$x2+as.factor(mydata$x3), 
family=binomial(link=probit))

[...]
#Now assume I can make a regression only on x1


myprobit2 - glm(mydata$outcome~mydata$x1, 
family=binomial(link=probit))

[...]
Finally, I generate the data frame mydatanew (see some of its entries 
below)


 mydatanew
x1 successes failures
1  220 01
2  300 12
3  340 13
4  360 04
5  380 08
[...]

where for every value of x1 I count the number of 0 and 1 outcomes 
(namely number of failures and number of successes). [...] How can I 
actually feed R with mydatanew to perform again a logistic regression 
on x1 only?


myprobit3 - glm(cbind(successes, failures) ~ x1, 
family=binomial(link=probit), data = mydatanew )

all.equal(coef(myprobit2), coef(myprobit3), check.attributes = FALSE)
# [1] TRUE


Your second question we could discuss offline: it is not really an R 
question, but you might want to have a look at something like the MNP 
and perhaps survey packages.



Hope this helps a little.

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [semi-OT] Using fortune() in an email signature

2010-09-01 Thread Allan Engelhardt

On 01/09/10 22:18, Dirk Eddelbuettel wrote:

[...]

Doing the same with Rscript is left as an exercise
   


I like exercises:

Rscript --default-packages=fortunes -e print(fortune())

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding pairs

2010-08-31 Thread Allan Engelhardt
Something like this may get you started

library(igraph)
df- data.frame(A=0:7, B=c(rep(NA,3), 0, 1, 3, 7, 2)) # Assuming this is your 
data
b- as.matrix(df)
print(b)
# A  B
#[1,] 0 NA
#[2,] 1 NA
#[3,] 2 NA
#[4,] 3  0
#[5,] 4  1
#[6,] 5  3
#[7,] 6  7
#[8,] 7  2
z- graph.edgelist(na.omit(b))
neighborhood(z, Inf, mode = in)
#[[1]]
#[1] 0 3 5
#
#[[2]]
#[1] 1 4
#
#[[3]]
#[1] 2 7 6
#
#[[4]]
#[1] 3 5
#
#[[5]]
#[1] 4
#
#[[6]]
#[1] 5
#
#[[7]]
#[1] 6
#
#[[8]]
#[1] 7 6


Hope this helps

Allan

On 24/08/10 19:28, Mike Rhodes wrote:



 Dear R Helpers,


 I am a newbie and recently got introduced to R. I have a large database 
 containing the names of bank branch offices along-with other details. I am 
 into Operational Risk as envisaged by BASEL II Accord. 


 I am trying to express my problem and I am using only an indicative data 
 which comes in coded format.




 A (branch)  B (controlled by)


 144   
 145  
 146   
 147   144 
 148   145 
 149   147
 151   146  
   ..  ...
   
 ..  ...


 where 144's etc are branch codes in a given city and B is subset of A.




 If a branch code appearing in A also appears in B (which is paired with 
 some otehr element of A e.g. 144 appearing in A, also appears in B and is 
 paired with 147 of A and likewise), then that means 144 is controlling 
 operations of bank office 147. Again, 147 itself appears again in B and is 
 paired with bank branch coded 149. Thus, 149 is controlled by 147 and 147 is 
 controlled by 144. Likewise there are more than 700 hundred branch name codes 
 available.


 My objective is to group them as follows -


 Bank Branch


 144  147149 


 145


 146   151  


 148
 .


 or even the following output will do.


 144
 147
 149


 145


 146
 151


 148
 151
 ..


 I understand I should be writing some R code to begin with which I had tried 
 also but as of now I am helpless. Please guide me.


 Mike





   [[alternative HTML version deleted]]




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Remove Loading required package message

2010-08-31 Thread Allan Engelhardt



On 27/08/10 10:19, Barry Rowlingson wrote:

On Fri, Aug 27, 2010 at 9:09 AM, Sébastien Moretti
sebastien.more...@unil.ch  wrote:
   
[...]

  require it quietly:

   

require(sp,quietly=TRUE)
 


Doesn't work for me:

 require(Matrix, quietly=TRUE)
Loading required package: lattice

Attaching package: 'Matrix'

The following object(s) are masked from 'package:base':

det

 sessionInfo()
R version 2.12.0 Under development (unstable) (2010-08-30 r52845)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=C  LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] Matrix_0.999375-43 lattice_0.18-8 ctv_0.6-0

loaded via a namespace (and not attached):
[1] grid_2.12.0  tools_2.12.0

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] any statement equals to 'goto'?

2010-08-31 Thread Allan Engelhardt

Sounds like you want

help(tryCatch) # Catch the error and do something
help(next)  # Go to the next value of the surrounding loop

Hope this helps a little.

Allan

On 31/08/10 19:34, karena wrote:

I have the following code:
-
result- matrix(NA, nrow=1, ncol=5)
for(i in 1:(nsnp-1)) {
   for(j in (i+1):nsnp){
tempsnp1- data.lme[,i]
tempsnp2- data.lme[,j]
fm1- lme(trait~sex+age+rmtemp.b+fc+tempsnp1+tempsnp2+tempsnp1*tempsnp2,
random=~1|famid, na.action=na.omit)
fm2- lme(trait~sex+age+rmtemp.b+fc+tempsnp1+tempsnp2, random=~1|famid,
na.action=na.omit)
result[1,1]- i
result[1,2]- j
result[1,3]- colnames(data.lme)[i]
result[1,4]- colnames(data.lme)[j]
result[1,5]- 2*(summary(fm1)$logLik-summary(fm2)$logLik)
if (i==1  j==2) results- result
if (i!=1 | j!=2) results- rbind(results, result)
}
-

after submitting this code, I got the error message saying Error in
MEEM(object, conLin, control$niterEM) :
Singularity in backsolve at level 0, block 1, I know this might be some
issue due to the missing values. what I want to do is to skip the 'bad'
variable, for example, I got this error message when j=116, so I just wanna
ignore variable116, is there a statement that can do this: when getting this
error message 'Error in MEEM...', directly goto the next variable?

thank you,

karena






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Shifting of Principal amount while calculating Present Value

2010-08-28 Thread Allan Engelhardt
Matrix operations work the other way to what you expect on rows and 
columns.  Try

## Your data:
time - c(1, 2, 3)  # Please don't use `t' as a variable name
cash_flow - c(7, 7, 107)
zero_rate - data.frame(rating=factor(c(AAA,AA,A,B)),
 year1=c(3.60, 3.65, 3.72, 4.10),
 year2=c(4.17,4.22,4.32,4.67),
 year3=c(4.73,4.78,4.93,5.25))
zero_rate1 - zero_rate[, -1]
## Make clear we are dealing with matrices (not really needed)
zero_rate2 - as.matrix(zero_rate1)

## The calculation
colSums(cash_flow/(t(1 + zero_rate2 / 100)^time))
# [1] 106.3549 106.2122 105.7969 104.8871

Here t() is the transpose function.  Try to step through each part of 
the calculation to figure it out.  You may also want to look at c(1, 2, 
3) / matrix(1, 3, 3) and matrix(sqrt(2), 3, 3)^c(1, 2, 4) for smaller 
examples.

Hope this helps a little.

Allan.

On 20/08/10 12:24, Sarah Sanchez wrote:
 Dear R Helpers

 I have following data -

 cash_flow = c(7, 7, 107)  # 107 = Principle 100 + interest of 7%
 t = c(1,2,3)

 and zero rate table as

 rating year1   year2   year3
 AAA3.604.17  4.73
 AA  3.654.22 
   4.78
 A 3.72   4.32  4.93
 BBB4.104.67 5.25


 For each of these ratings I need to calculate the Present Value as (say e.g. 
 for AAA)

 PV(AAA)  =  7/(1+3.60/100) + 7/(1+4.17/100)^2 +  107/(1+4.73/100)^3 which is 
 equal to 106.3549

 Similarly when used the respective rates,  PV(A) = 106.2122, PV(A) = 105.7969 
 and PV(BBB) = 104.8871

 #

 ## My problem

 I have tried the following R code.

 zero_rate =
   read.csv('zero_rate_table.csv')
 zero_rate1 = zero_rate[, -1]

 cash_flow = c(7, 7, 107)

 t = c(1,2,3)


 PV_table = cash_flow / (1+zero_rate1/100)^t

 ## Then using rowSums, I should get the required result. However, I am 
 getting following output as


 PV_table
  
  year_1year_2   year_3
 [1,]  6.756757 6.45078  93.147342
 [2,]  6.515675   94.521493  6.680664
 [3,] 95.8950646.710123  6.357680
 [4,]   6.724304   6.389305 91.773536


 rowSums(PV_table) gives

 [1] 106.35489 107.71783 108.96287 104.88714.

 Thus, the result is correct only in first case. In second case onwards, there 
 is a shift of final cashflow e.g. in 2nd case, pertaining to year 2, value is 
 94.521493 which means it includes principal of 100.

 Likewise
   in third case, the principal is included in the first cash-flow itself. 
 Again the fourth result is correct. So there is pre-shift of capital.

 I am not sure whether I am making any mistake in reading the zero_rate csv 
 file or my formula is incorrect. Please guide.

 Thanking you all in advance

 Sarah


   






   [[alternative HTML version deleted]]




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] OT: Re: Fwd: R SOFTWARE

2010-08-27 Thread Allan Engelhardt
On 18/08/10 10:46, Nokuzola Simakuhle wrote:
 NB: This email and its contents are subject to the Eskom Holdings Limited 
 EMAIL LEGAL NOTICE

 which can be viewed at http://www.eskom.co.za/email_legalnotice



Since your email policy allows for no copying or distribution of your 
message, you should probably set the X-No-Archive message header.  
Google it, if you don't know how.

I don't actually know if the mailing list honours that header, but at 
least you've tried.  Can't really expect the computer to read and 
understand your PDF with three pages of legalese.

Allan
(who collects these kinds of notices)
|
|

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Linear regression on several groups

2010-08-13 Thread Allan Engelhardt

 Please read the posting guide and include a standalone example.

Maybe you want something like the results from

lm(weight ~ Time, data = ChickWeight, subset = Diet==1)
lm(weight ~ Time, data = ChickWeight, subset = Diet==2)
## ... etc ...

Then you could do

(m - lm(weight ~ Time*Diet, data = ChickWeight))

To get the Diet==2 coefficients from above you could use something like

sum(coef(m)[c((Intercept), Diet2)])  # Intercept
sum(coef(m)[c(Time, Time:Diet2)])# Slope

Hope this helps a little.

Allan

On 12/08/2010 17:11, JesperHybel wrote:

I have a simple dataset of a numerical dependent Y, a numerical independent X
and a categorial variable Z with three levels. I want to do linear
regression Y~X for each level of Z. How can I do this in a single command
that is without using lm() applied three isolated times?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [OT] R on Atlas library

2010-08-09 Thread Allan Engelhardt
I don't know about the specific function (a simple, stand-alone 
reproducible example is always helpful, see the posting guide for 
details), but ATLAS certainly uses all my cores on a test like this:


s - 7500  #  Adjust for your hardware
a - matrix(rnorm(s*s), ncol = s, nrow = s)
b - crossprod(a)  #  Uses all cores here.

My configuration of R with ATLAS on Linux (Fedora) is described at 
http://www.cybaea.net/Blogs/Data/Faster-R-through-better-BLAS.html


Maybe your distribution has single-threaded ATLAS and you forgot to 
rebuild it with enable_native_atlas 1 or the equivalent for your platform?


Allan


On 06/08/10 15:12, Matthias Gondan wrote:

Dear List,

I am aware this is slightly off-topic, but I am sure there are people who 
already had the problem and who perhaps solved it.

I am running long-lasting model fits using constrOptim command. At work
there is a linux computer (Quad Core, debian) on which I already have
compiled R and Atlas, in the hope that things will go faster on that
machine.

Atlas offers the possibility to be compiled for multiple cores, I used
that option, but without success. Performance meter (top) for Linux
shows 25% CPU usage, and there is no loss of performance if I run
4 instances of R doing heavy matrix multiplications. Performance drops
when a 5th instance of R is doing the same job, so I assume my attempt
was not successful.

I am sure I did something wrong. Is there anybody who sucessfully
runs R with Atlas and all processors? A short description of the
necessary steps would be helpful.

Searching around the internet was not very encourageing. Some people
wrote that it is not so simple to have Atlas fully exploit a multicore
computer.

I hope this is not too unspecific.

Best wishes,

Matthias




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] About R base

2010-08-09 Thread Allan Engelhardt

Yes, there are differences between R on Windows and R on Linux.

You probably want this document:

http://cran.r-project.org/bin/windows/base/rw-FAQ.html

Hope this helps a little.

Allan

On 06/08/10 16:47, Stephen Liu wrote:

Hi folks,

I have R x64 2.11.1 installed on Win 7 64 bit which is running as VM (guest) on
Oracle VirtualBox.  Is there any difference between it and R on Linux?  I can
install another R on Debian 5.04, also running as VM.  TIA

B.R.
Stephen L




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] memory use without running gc()

2010-08-09 Thread Allan Engelhardt

How about asking the operating system, e.g.

### Method 1
## Setup
cmd - paste(ps -o vsz, Sys.getpid())
## In your logging routine
z - system(cmd, intern = TRUE)
cat(Virtual size: , z[2], \n, sep = )

### Method 2
## Setup
file - paste(/proc, Sys.getpid(), stat, sep = /)
what - vector(list, 44); what[[23]] - integer(0)
## In your logging routine
vsz - scan(file, what = what, quiet = TRUE)[[23]]/1024
cat(Virtual size: , vsz, \n, sep = )

### Time the methods
library(rbenchmark)
benchmark(scan = scan(file, what = what, quiet = TRUE)[[23]]/1024,
  system = system(cmd, intern = TRUE),
  columns = c(test, replications, elapsed, relative), 
order = relative)

# test replications elapsed relative
# 1   scan  100   0.015   1.
# 2 system  100   2.408 160.5333


The scan method should be plenty fast.  Change as appropriate if your 
definition of memory is different from virtual size.


Hope this helps a little.

Allan


On 06/08/10 19:11, Johann Hibschman wrote:

jim holtmanjholt...@gmail.com  writes:

   

?memory.size
 

Only works on Windows.  I guess I should have specified; this is on Linux.

Thanks,
Johann

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] REmove level with zero observations

2010-08-05 Thread Allan Engelhardt

On 03/08/10 21:50, GL wrote:

If I have a column with 2 levels, but one level has no remaining
observations. Can I remove the level?
   


Like this?

d - data.frame(a = factor(rep(A, 3), levels = c(A, B)))
levels(d$a)
# [1] A B
d$a - d$a[,drop=TRUE]
levels(d$a)
# [1] A


Hope this helps

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there paid support for R?

2010-08-02 Thread Allan Engelhardt

On 02/08/10 13:17, Chris Murphy wrote:

Hi,

I am trying to get R installed at work and I was asked if there were any
companies that offered support.
   


There are.  What country are you looking to support?  What do you mean 
by support?  (It can be anything from installation through SLAs for 
help on specific packages to someone we can sue)


Allan.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using apply for logical conditions

2010-08-02 Thread Allan Engelhardt

`|` is a binary operator which is why the apply will not work.  See

help(Reduce)

For example,

set.seed(1)
data - data.frame(A = runif(10)  0.5, B = runif(10)  0.5, C = 
runif(10)  0.5)

Reduce(`|`, data)
#  [1]  TRUE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

Hope this helps.

Allan

On 02/08/10 21:35, Alastair wrote:

Hi,

I've got some boolean data in a data.frame in the form:
   XYZA   B   C
[1]  T TFT   F   F
[2]  F TTF   F   F
.
.
.


What I want to do is create a new column which is the logical disjunction of
several of the columns.
Just like:

new.column- data$X | data$Y | data$Z

However I don't want to hard code the particular columns into the expression
like that. I've tried using apply row wise with `|` as the function:

columns- c(X,Y,Z)
apply(data[,columns], 1,`|`)

This doesn't seem to do what I would have expected, does anyone have any
advice how to use the the apply or similar function to perform a boolean
operation on each row (and a specific subset of the columns) in a data
frame?

Thanks,
Alastair





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reading timestamp column from MySQL

2010-07-29 Thread Allan Engelhardt



On 29/07/10 00:20, harsh yadav wrote:

Hi,

I am reading a SQL (MySQL) table in R data frame.

When I read in the table that has a timestamp data-type field, R gives it
the following format:-

1.236887e+12

So when I want to manipulate a column with timestamp = 1236887146615

It returns me multiple rows, as many timestamps gets converted to same
value: 1.236887e+12

Any ideas of how this could be dealt with, so that I can get the entire
timestamp field.
   


It should just be a printing issue, not a problem with the data, in 
which case you can just run you analysis as usual.  If it *is* an issue 
with the data, and not the printing, then we probably need more 
information, see the posting guide.


If you are worried about the formatting, look at options(scipen); try 
a value of 5 or so to get you started.



[*Off topic*:
why does
options(scipen = 2^31-13); print(1236887146615);
produce different output [1] from
options(scipen = 2^31-12); print(1236887146615);
and why does the rather intuitive approach suggested by one of my 
colleagues,

options(scipen = +Inf); print(1236887146615);
produce warnings (and why does warnings() then produce even more 
warnings, ad infinitum)? ]


Hope this helps,


Allan

[1] at least under trunk and also 2.11.1

 sessionInfo()
R version 2.12.0 Under development (unstable) (2010-07-28 r52631)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_GB.UTF-8   LC_NUMERIC=C
 [3] LC_TIME=en_GB.UTF-8LC_COLLATE=en_GB.UTF-8
 [5] LC_MONETARY=C  LC_MESSAGES=en_GB.UTF-8
 [7] LC_PAPER=en_GB.UTF-8   LC_NAME=C
 [9] LC_ADDRESS=C   LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] ctv_0.6-0

loaded via a namespace (and not attached):
[1] tools_2.12.0

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Function to return variable name

2010-07-29 Thread Allan Engelhardt

On 29/07/10 04:21, Jeremy Miles wrote:

I'd like a function that returns the variable name.

As in:

MyData$Var1

Would return:

Var1
   


Not quite sure what you mean, but does this get you started?

nn - function(x) deparse(substitute(x))
str( z - nn(airquality$Month) )
#  chr airquality$Month
## Not sure what now - maybe:
strsplit(z, $, fixed = TRUE)[[1]][-1]
# [1] Month

You'll have to figure out with a variable called foo$bar$baz and with 
airquality[Month] etc.


Hope this helps a little.

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Checking package licences including dependencies?

2010-07-27 Thread Allan Engelhardt
I only recently discovered options(available_packages_filters = 
list(add = TRUE, license/FOSS)) [cf. help(available.packages, 
package=utils) in R 2.10.0 or later] which goes nicely with my 
options(checkPackageLicense = TRUE) [new in R 2.11].


But now I want to purge my library of packages I would not have 
installed had I known about this option earlier (I'm looking at you, 
gam!).


Short of erasing the whole directory of libraries and re-installing it, 
is there an easy way of achieving this?


I could probably roll something based on tools:::analyze_license() but I 
think the erase-and-reinstall option might be easier in this case :-)


Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Checking package licences including dependencies?

2010-07-27 Thread Allan Engelhardt

On 27/07/10 13:20, Ben Bolker wrote:

Allan Engelhardtallaneat  cybaea.com  writes:

   

I only recently discovered options(available_packages_filters =
list(add = TRUE, license/FOSS)) [cf. help(available.packages,
package=utils) in R 2.10.0 or later] which goes nicely with my
options(checkPackageLicense = TRUE) [new in R 2.11].

But now I want to purge my library of packages I would not have
installed had I known about this option earlier (I'm looking at you,
gam!).
 

  What's the problem with gam? The ACM license on akima (on which it
depends) that requires permission/licensing for
commercial use?
   


In a word: yes.  It depends on a non-FOSS package.  This is what the new 
options protect us against.  For this particular project, I want to 
avoid all non-FOSS dependencies.


[OT: Specifically for gam, I am not sure it is really OK to release it 
as GPL-2 when it depends on a non-free library that, as far as I can 
see, does not meet the definition of a system library; see 
http://www.gnu.org/licenses/gpl-faq.html#GPLIncompatibleLibs for a 
discussion.  But IANAL and for this project is is a moot point: I want 
to be clean.]


Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Checking package licences including dependencies?

2010-07-27 Thread Allan Engelhardt

Great, thanks.  I couldn't quite get your syntax to work, but

z - packageStatus(.libPaths()[1])[[1]]
unname( z$Package[z$Status == unavailable] )

seems to do the trick for me.

Thanks again.

Allan

On 27/07/10 16:31, Prof Brian Ripley wrote:

If I understand you correctly, set the filter and use packageStatus().
Its summary() method tells you which packages you have installed which 
are 'unavailable'.  E.g. my Mac (with pkgType = source) shows in 
R-devel


summary(packageStatus(.libPaths()[1]))$Libs[[1]]$unavailable
 [1] BayesXEVER  TSA   akima degreenet difR
 [7] ergm  ffgam   isa2  latentnet locfit
[13] mapproj   rtiff statnet   tripack

On Tue, 27 Jul 2010, Allan Engelhardt wrote:

I only recently discovered options(available_packages_filters = 
list(add = TRUE, license/FOSS)) [cf. help(available.packages, 
package=utils) in R 2.10.0 or later] which goes nicely with my 
options(checkPackageLicense = TRUE) [new in R 2.11].


But now I want to purge my library of packages I would not have 
installed had I known about this option earlier (I'm looking at you, 
gam!).


Short of erasing the whole directory of libraries and re-installing 
it, is there an easy way of achieving this?


I could probably roll something based on tools:::analyze_license() 
but I think the erase-and-reinstall option might be easier in this 
case :-)


Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] the real dimnames

2010-07-26 Thread Allan Engelhardt

On 26/07/10 19:01, Michael Lachmann wrote:

Hi,

R seems to have a feature that isn't used much, which I don't really 
now how to call. But, the dimnames function, can in addition to giving 
names to rows/columns/dim 3 rows/dim 4 rows... can also give labels to 
the dimensions themselves.


Thus, I can do:
A = matrix(1:9,3,3)

dimnames(A) = list(from=c(), to=c() )


If you just want to set the names of the dimnames, just do

names(dimnames(A)) - c(from, to)

Remember that dimnames is a list and all lists have names.

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] start and end times to yes/no in certain intervall

2010-07-23 Thread Allan Engelhardt

I like loops for this kind of thing so here is one:

df- structure(list(start = c(15:00, 15:00, 15:00, 11:00,
14:00, 14:00, 15:00, 12:00, 12:00, 12:00, 12:00,
12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00,
12:00, 12:00), end = c(16:00, 16:00, 16:00, 12:00,
16:00, 15:00, 16:00, 13:00, 13:00, 13:00, 13:00,
13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00,
13:00, 13:00)), .Names = c(start, end), row.names = c(NA,
20L), class = data.frame)

duration- with(df, floor(
  as.numeric(
   difftime(strptime(end, format=%H:%M),
strptime(start, format=%H:%M),
units = hours

start- as.integer(substr(df$start, 1, 2))

a- matrix(FALSE, nrow = NROW(df), ncol = 24,
dimnames = list(1:NROW(df), paste(t, 0:23, sep = )))
for (i in 1:NROW(df)) {
names- paste(t, seq(start[i], by = 1L, length.out = duration[i]), sep = 
)
a[i, names]- TRUE
}
r- range(which(apply(a, 2, any)))
a- a[, r[1]:r[2]]  #  Drop columns we do not need
head(a)
# t11   t12   t13   t14   t15
# 1 FALSE FALSE FALSE FALSE  TRUE
# 2 FALSE FALSE FALSE FALSE  TRUE
# 3 FALSE FALSE FALSE FALSE  TRUE
# 4  TRUE FALSE FALSE FALSE FALSE
# 5 FALSE FALSE FALSE  TRUE  TRUE
# 6 FALSE FALSE FALSE  TRUE FALSE



Hope this helps a little.

Allan


On 23/07/10 11:02, Stefan Uhmann wrote:

Hi List,

I have start and end times of events

structure(list(start = c(15:00, 15:00, 15:00, 11:00,
14:00, 14:00, 15:00, 12:00, 12:00, 12:00, 12:00,
12:00, 12:00, 12:00, 12:00, 12:00, 12:00, 12:00,
12:00, 12:00), end = c(16:00, 16:00, 16:00, 12:00,
16:00, 15:00, 16:00, 13:00, 13:00, 13:00, 13:00,
13:00, 13:00, 13:00, 13:00, 13:00, 13:00, 13:00,
13:00, 13:00)), .Names = c(start, end), row.names = c(NA,
20L), class = data.frame)

and I would like the data to look like this:


  t9   t10   t11   t12   t13   t14   t15   t16   t17
1  FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
2  FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
3  FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
4  FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE
5  FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE FALSE FALSE
6  FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE
7  FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE
8  FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
9  FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE
10 FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE


Which means, that I just get a TRUE for every hour the event was 
taking place. A finishing time of 16:00 means that t16 is FALSE, 
because the event was finished until 16:00; 16:15 as end time would 
result in t16 being TRUE.
It would be nice if the function would add the variables needed (t9 
..) as well and depending on the times put in (no t9 if there is no 
event starting before 10:00).


Thanks for any suggestion,
Stefan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Regex in an IF statement?

2010-07-23 Thread Allan Engelhardt

help(difftime) is probably what you want.

If not, please post a standalone example

Allan

On 23/07/10 15:18, Jim Hargreaves wrote:

Dear List

I have the POSIX timestamps of two events, A and B respectively. I 
want an expression that tests if A occurs within 4 weeks of B.


Something like:

if (ts_A is between ts_B + 4(weeks) and ts_B - 4) ...

Can probably done by regex but the regex docu for R says nothing about 
if statements.


Any help greatly appreciated,

Regards,
Jim Hargreaves

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] converting a time to nearest half-hour

2010-07-23 Thread Allan Engelhardt
The arithmetic that David describes should work fine (POSIXct is 
internally in UTC) unless you are in the Chatham Islands (which has a 
UTC+12:45 time zone [1]) or Nepal (UTC+05:45 [2]) or some such place 
with a funny idea about when the 1/2 hour mark is.



The formatting of the output may be tricky, if the original poster 
really want the (past/future) time zone as opposed to the current one, 
which is what his text could be read to suggest.


Good test cases would be 2010-03-28 01:46:00 GMT which should 
(presumably) round to 2010-03-28 03:00:00 BST and 2010-10-31 02:46 
BST which probably rounds to 2010-10-31 02:00 GMT.  I can't quite get 
it to work in R, but it should be possible



Allan

[1] https://secure.wikimedia.org/wikipedia/en/wiki/UTC%2B12:45
[2] https://secure.wikimedia.org/wikipedia/en/wiki/UTC%2B05:45

On 23/07/10 16:35, David Winsemius wrote:


On Jul 23, 2010, at 11:20 AM, murali.me...@avivainvestors.com 
murali.me...@avivainvestors.com wrote:



Hi folks,

I've got a POSIXct datum as follows:


Sys.time()

[1] 2010-07-23 11:29:59 BST

I want to convert this to the nearest half-hour, i.e., to 2010-07-23 
11:30:00 BST


(If the time were 11:59:ss, I want to convert to 12:00:00).

How to achieve this?


Couldn't you just coerce to numeric, divide by 60(sec)*30(half-hour 
minutes), round to integer, multiply by 60*30,  coerce to POSIXct?


_
David Winsemius, MD
West Hartford, CT



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to deal with more than 6GB dataset using R?

2010-07-23 Thread Allan Engelhardt
read.table is not very inefficient IF you specify the colClasses= 
parameter.  scan (with the what= parameter) is probably a little more 
efficient.  In either case, save the data using save() once you have it 
in the right structure and it will be much more efficient to read it 
next time.  (In fact I often exit R at this stage and re-start it with 
the .RData file before I start the analysis to clear out the memory.)


I did a lot of testing on the types of (large) data structures I 
normally work with and found that options(save.defaults = 
list(compress=bzip2, compression_level=6, ascii=FALSE)) gave me the 
best trade-off between size and speed.  Your mileage will undoubtedly 
vary, but if you do this a lot it may be worth getting hard data for 
your setup.


Hope this helps a little.

Allan

On 23/07/10 17:10, babyfoxlo...@sina.com wrote:

nbsp;Hi there,

Sorry to bother those who are not interested in this problem.

I'm dealing with a large data set, more than 6 GB file, and doing regression 
test with those data. I was wondering are there any efficient ways to read 
those data? Instead of just using read.table()? BTW, I'm using a 64bit version 
desktop and a 64bit version R, and the memory for the desktop is enough for me 
to use.
Thanks.


--Gin

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to deal with more than 6GB dataset using R?

2010-07-23 Thread Allan Engelhardt

On 23/07/10 17:36, Duncan Murdoch wrote:

On 23/07/2010 12:10 PM, babyfoxlo...@sina.com wrote:

[...]


You probably won't get much faster than read.table with all of the 
colClasses specified.  It will be a lot slower if you leave that at 
the default NA setting, because then R needs to figure out the types 
by reading them as character and examining all the values.  If the 
file is very consistently structured (e.g. the same number of 
characters in every value in every row) you might be able to write a C 
function to read it faster, but I'd guess the time spent writing that 
would be a lot more than the time saved.


And try the utils::read.fwf() function before you roll your own C code 
for this use case.


If you do write C code, consider writing a converter to .RData format 
which R seems to read quite efficiently.


Hope this helps.

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sparsity

2010-07-23 Thread Allan Engelhardt
The functions {summary,bandwidth}.rq() in the package quantreg looks 
promising, but it is not really my area...


http://lmgtfy.com/?q=Hall-Sheater+sparsity+%22r-project%22

Hope this helps a little.

Allan

On 23/07/10 21:02, Nathalie Gimenes wrote:

Hi All,

I would like to estimate the sparsity function using Hall Sheater
bandwidth.

Is there any command for this?

Thanks,

NGS

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] please help me for this Error: cannot listen to TCP port 15259

2010-07-22 Thread Allan Engelhardt
The short-term work-around is probably to use the help_type= argument to 
help, e.g.


help(help, help_type=text)

Hope this helps a little.

Allan

On 22/07/10 07:30, Na.Ebrahimi wrote:

hi,
I get an Error when I try to run function help in R,
it says
  starting httpd help server ...Error: cannot listen to TCP port 15259
I would be so glad if you help me




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Drop firms in unbalanced panel if not more than 5 observations in consecutive years for all variables

2010-07-22 Thread Allan Engelhardt
One option:

/## Your data:/
data - structure(list(id = structure(c(1L, 2L, 2L, 3L, 3L, 3L, 3L, 4L,
4L, 4L, 4L, 4L), .Label = c(a, b, c, d), class = factor),
 year = c(2000L, 2000L, 2001L, 1999L, 2000L, 2001L, 2002L,
 1998L, 1999L, 2000L, 2001L, 2002L), y = c(1L, NA, 3L, 1L,
 2L, 4L, 5L, 6L, 5L, 6L, 7L, 3L), z = c(1L, 2L, 3L, 1L, 2L,
 NA, 4L, 5L, NA, 6L, 7L, 6L)), .Names = c(id, year, y,
z), class = data.frame, row.names = c(1, 2, 3, 4,
5, 6, 7, 8, 9, 10, 11, 12))
/## Drop missing data/
Data - na.omit(data)
/## Find three years in a row with data/
n - names(which(apply(table(Data$id, Data$year), 1, function(x) 
{z-rle(x); z-z[z$values0]; any(z$lengths=3)})))
/## Now get the original data for these ids/
data[data$id %in% n, ]
/#id year y  z
# 8   d 1998 6  5
# 9   d 1999 5 NA
# 10  d 2000 6  6
# 11  d 2001 7  7
# 12  d 2002 3  6/

This is probably not optimal, but I hope it helps a little.

Allan

On 22/07/10 10:18, Christian Schoder wrote:
 Dear R-user,

 a few weeks ago I consulted the list-serve with a similar question.
 However, my task changed a little but sufficiently to get lost again. So
 I would appreciate any help on the following issue.

 I use the plm package and work with firm-level data in a panel. I would
 like to eliminate all firms that do not fulfill the requirement of
 having an observation in every variable used for at least x consecutive
 years.

 For illustration of the problem assume the following data set

 data
  
 id year  y  z
 1   a 2000  1  1
 2   b 2000 NA  2
 3   b 2001  3  3
 4   c 1999  1  1
 5   c 2000  2  2
 6   c 2001  4 NA
 7   c 2002  5  4
 8   d 1998  6  5
 9   d 1999  5 NA
 10  d 2000  6  6
 11  d 2001  7  7
 12  d 2002  3  6
 where id is the index of the firm, year the index for the year, and y
 and z are variables. Now, I would like to get rid of all firms with,
 let's say, less than 3 consecutive years in which there are observations
 for every variable. Hence, the procedure should yield

 data.reduced
  
 id year  y  z
 1   d 1998  6  5
 2   d 1999  5 NA
 3   d 2000  6  6
 4   d 2001  7  7
 5   d 2002  3  6

 Thank you very much for any help!

 Cheers, Christian

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SQL/R

2010-07-22 Thread Allan Engelhardt

There are so many ways  Here is one:

aggregate(v ~ u, data=X, function(...) length(unique(...)))
#u v
# 1 T1 2
# 2 T2 1

Hope this helps

Allan.

On 22/07/10 12:52, Gildas Mazo wrote:

Dear R users,

I want to aggregate data in the following way:

###

X- data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a))
X
library(sqldf)
sqlOut- sqldf(select count(distinct(v)) from X group by u)
sqlOut

###

Now I want to get the same result without using SQL. How can I achieve
that ?

Thanks for your help,

Gildas

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave special characters problem

2010-07-22 Thread Allan Engelhardt
A standalone example is always helpful.  The following works for me, so 
I am probably not understanding your problem:


---[BEGIN: umlaut.Rnw]---
\documentclass[a4paper]{article}
\usepackage{ucs}
\usepackage[utf8x]{inputenc}

\begin{document}

test,echo=TRUE=
x - data.frame(Geschäftslage=1:10)
summary(x)
@
\end{document}
---[END: umlaut.Rnw]---

$ R CMD Sweave umlaut.Rnw
$ R CMD pdflatex umlaut.tex
$ gnome-open umlaut.pdf

Allan


On 22/07/10 13:19, Bunny, lautloscrew.com wrote:

Dear all,

I use Sweave to create my reports. Unfortunately my script crashes whenever I 
my R code contains special characters like umlauts.
Is there a way to to escape special characters in Sweave... This is the line 
that crashes Sweave:

gl_bybranch = ddply(new_wans,.(period,Branchen), function(X) 
data.frame(Geschäftslage=mean(X$sentiment)))

Unfortunately I can't just rename it, because I it´s displayed in the legend of 
graphics later on.

Thx for any suggestions!

best

matt

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Recommended way of requiring packages of a certain version?

2010-07-16 Thread Allan Engelhardt
What is the recommended way of requiring a certain version when loading 
a package (or, indeed, from R itself)?


Perl has the

require module version
use module version
require version
use version

constructs which is kind of what I am looking for (especially 'use' 
which is evaluated at compile time), but R seems to have lost the 
version= argument to require().


The best I have been able to come up with are constructs of the form

if ( utils::compareVersion(utils::packageDescription(data.table, 
fields=Version), 1.5)  0 ) stop(Need data.table version 1.5 or later)
if ( utils::compareVersion(as.character(getRversion()), 2.11) = 0 ) 
stop(Does not work yet with latest R.)


But this is tedious and error prone.  I have created my own require() 
function to automate this (it takes a vector of versions and a vector of 
comparisons and only load the package if they are all met), but it is 
non-standard and just that much harder for my colleagues to maintain. 
Somebody must already have done this?


Allan

PS: Shouldn't utils::compareVersion(2.11.0, 2.11) return zero 
instead of one?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] FW: coeficient of determination

2010-07-16 Thread Allan Engelhardt

Try

RSiteSearch(coefficient of determination)

which seems to list some promising candidates.

Allan

On 15/07/10 16:47, Alireza Ehsani wrote:

Hello there
how can i calculate coefficient of determination with R?
Kind regards

Ali Reza Ehsani
Ph.d. student




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Recommended way of requiring packages of a certain version?

2010-07-16 Thread Allan Engelhardt

On 16/07/10 09:32, Paul Hiemstra wrote:

Hi Allan,

When you create an R package you can specify in the DESCRIPTION file 
that your package depends on a certain R version and versions of 
packages. For example:

[...]
Depends: R (= 2.7.0), methods, sp (= 0.9-4), gstat (= 0.9-58)


Thanks Paul, this is helpful.  It doesn't quite do what I want because 
support for operators other than `=` is erratic (the documentation at 
[1] seems to suggest that R CMD INSTALL supports any operator while 
install.packages() only supports `=`).


Also I am not sure if it is checked on every package load?


[...]
So distributing code to other people is preferably done using R 
packages, which gives you this option.


Agreed in principle (and that is kind of what I am developing), but 
sometimes you just want to send a piece of analysis you are working on.


But thanks for the pointer.


Allan

[1] http://cran.r-project.org/doc/manuals/R-exts.html#fn-2

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [R-pkgs] New package list for analyzing list surveyexperiments

2010-07-15 Thread Allan Engelhardt

On 13/07/10 19:16, Erik Iverson wrote:

Raubertas, Richard wrote:
I agree that 'list' is a terrible package name, but only secondarily 
because it is a data type.  The primary problem is that it is so generic


as to be almost totally uninformative about what the package does.
For some reason package writers seem to prefer maximally 
uninformative names for their packages.  To take some examples of 
recently announced packages, can anyone guess what packages 'FDTH', 
'rtv', or 'lavaan' do?  Why the aversion to informative names along 
the lines of
'Freq_dist_and_histogram', 'RandomTimeVariables', and 
'Latent_Variable_Analysis', respectively? 


I'm sure it's part tradition...

ls
cat
rm
cp
mv
su


No need to leave R, which has some 9 one-letter symbols, 35 two-letter 
symbols, and 121 three letter symbols in the base and core packages.  It 
is just unreal.  (The equivalent numbers from my Linux system are 3, 56, 
and 148, and one of those one-letter commands is R!!)


Quiz yourself on these:

c C D F I q s t T

ar as bs by cm de df dt el gc gl if Im is 
lh lm ls lu ns pf pi pt qf qq qr qt Re rf 
rm rt sd te ts VA vi


abs acf ACF AIC all aml any aov Arg ave bam bcv 
bdf BIC bmp BOD box bxp cat cav ccf cch cd4 cgd 
co2 CO2 col cor cos cov cut DDT det
dim Dim dir end exp fft fgl fir fix for gam get 
glm gls Gun hat hcl hsv IGF IQR Kfn knn lag lcm 
lda lme log lqs mad Map max mca min mle Mod new 
nlm nls npk nsl OME one Ops pam par pbc PBG pdf 
pie png ppr qda raw rep rev rfs rgb rig rle rlm 
row rug seq sin SOM SSD SSI stl str sub sum svd 
svg tan tar tau tcl tmd try tsp two ucv unz url 
var x11 X11 xor



Generated from R --vanilla with:

for (p in c(base, boot, class, cluster, codetools, datasets, 
foreign, graphics, grDevices, grid, KernSmooth, lattice, 
MASS, Matrix, methods, mgcv, nlme, nnet, rpart, spatial, 
splines, stats, stats4, survival, tcltk, tools, utils)) 
library(p, character.only=TRUE)

rm(p)
one - unique(grep(^[[:alnum:]]+$, apropos(^.$), value=TRUE))
two - unique(grep(^[[:alnum:]]+$, apropos(^..$), value=TRUE))
three - unique(grep(^[[:alnum:]]+$, apropos(^...$), value=TRUE))

and from the bash shell with

ls -1 {/usr,}/bin/? 2/dev/null
ls -1 {/usr,}/bin/?? | perl -ne 'print substr $_,-3' | sort -u | wc -l
ls -1 {/usr,}/bin/??? | perl -ne 'print substr $_,-4' | sort -u | wc -l

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] SARIMA model

2010-07-15 Thread Allan Engelhardt

Try

RSiteSearch(SARIMA)

Allan

On 13/07/10 14:15, FMH wrote:

Dear All,

Could someone please advice me the appropriate package for fitting the SARIMA
model?


Thanks
Fir




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] question regarding varImpPlot results vs. model$importance data on package RandomForest

2010-07-15 Thread Allan Engelhardt
Use the source, Luke. varImpPlot calls 
randomForest:::importance.randomForest (yeah, that is three colons) and 
reading about the scale= parameter in help(importance, 
package=randomForest) should enlighten you.  For the impatient, try


varImpPlot(mtcars.rf, scale=FALSE)


Hope this helps a little.

Allan

On 14/07/10 00:46, Mike Williamson wrote:

Hi everyone,

 I have another Random Forest package question:

- my (presumably incorrect) understanding of the varImpPlot is that it
should plot the % increase in MSE and IncNodePurity exactly as can be
found from the importance section of the model results.
   - However, the plot does not, in fact, match the importance section
   of the random forest model.

 E.g., if you use the example given in the ?randomForest, you will see
the plot showing the highest few %IncMSE values around 17 or 18%.  But if
you look at the $importance, it is 9.7, 9.4, 7.7, and 7.3.  Perhaps more
importantly, for the plot, it will show wt is highest %MSE, then disp,
then cyl, then hp; whereas the $importance will show wt, then disp,
then hp, then cyl.  And the ratios look somewhat different, too.
 Here is the code for that example:

set.seed(4543)
data(mtcars)
mtcars.rf- randomForest(mpg ~ ., data=mtcars, ntree=1000,
keep.forest=FALSE,
importance=TRUE)
varImpPlot(mtcars.rf)

 I am using version 2.11.1 of 'R' and version 4.5-35 of Random Forest.

 I don't really care or need for the varImpPlot to work just right.  But
I am not sure which is accurate:  the varImpPlot or the $importance
section.  Which should I trust more, especially when they disagree
appreciably?

  Thanks!
  Mike



Telescopes and bathyscaphes and sonar probes of Scottish lakes,
Tacoma Narrows bridge collapse explained with abstract phase-space maps,
Some x-ray slides, a music score, Minard's Napoleanic war:
The most exciting frontier is charting what's already here.
   -- xkcd

--
Help protect Wikipedia. Donate now:
http://wikimediafoundation.org/wiki/Support_Wikipedia/en

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convergent series

2010-07-15 Thread Allan Engelhardt
Not 100% if this is what you are looking for, but maybe Reduce(+, x) 
will do it?  E.g.


Reduce(+, 1/2^(1:10))
# [1] 0.9990234


Hope this helps.

Allan

On 14/07/10 11:57, David Bickel wrote:
What are some reliable R functions that can compute the value of a 
convergent series?


David



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] qplot in ggplot2 not working any longer - (what did I do?)

2010-07-15 Thread Allan Engelhardt
The problem is that install.packages() and friends forks R and there is 
no way to specify general options to that process (you can do the 
configure.{args,vars} but nothing else).


One work-around could be to download the package and install it with R 
--vanilla CMD INSTALL from the command line.


But it is probably better to fix your .Rprofile so it doesn't break the 
install; see interactive() and commandArgs() for functions that may be 
useful.


Hope this helps a little.

Allan

On 14/07/10 22:03, stephen sefick wrote:

This is the first time that I have tried to update packages with a
tinkered around with .Rprofile.  I start R with R --vanilla and it
does not load my .Rprofile, but when I issue the command
update.packages() R downloads the packages as expected, but then seems
to load .Rprofile before compiling the packages sources.  What am I
doing wrong?
kindest regards,

Stephen Sefick

see- Session info output and .Rprofile

Ubuntu 10.04
R 2.11.1

Session info:

R version 2.11.1 (2010-05-31)
x86_64-pc-linux-gnu


locale:
  [1] LC_CTYPE=en_US.utf8   LC_NUMERIC=C
  [3] LC_TIME=en_US.utf8LC_COLLATE=en_US.utf8
  [5] LC_MONETARY=C LC_MESSAGES=en_US.utf8
  [7] LC_PAPER=en_US.utf8   LC_NAME=C
  [9] LC_ADDRESS=C  LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.utf8 LC_IDENTIFICATION=C

attached base packages:
[1] graphics  grDevices utils datasets  stats grid  methods
[8] base

other attached packages:
  [1] gpclib_1.5-1StreamMetabolism_0.03-3 maptools_0.7-34
  [4] lattice_0.18-8  sp_0.9-64   foreign_0.8-40
  [7] chron_2.3-35zoo_1.6-3   vegan_1.17-3
[10] ggplot2_0.8.8   proto_0.3-8 reshape_0.8.3
[13] plyr_0.1.9


.Rprofile:
#source USGS graphing function for base data
source(~/R_scripts/USGS.R)
source(~/R_scripts/publication_ggplot2_theme.R)
source(~/R_scripts/llScript.R)


#set help_type
options(help_type=html)

#exit to get around anoying q behavior
exit- function(save=no){q(save=save)}

#most used libraries
library(ggplot2)
library(vegan)
library(StreamMetabolism)

#allow gpclib package to be used
gpclibPermit()






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] taking daily means from hourly data

2010-07-14 Thread Allan Engelhardt
This is one way:

df- data.frame(Time=as.POSIXct(2009-01-01, format=%Y-%m-%d) + seq(0, 
60*60*24*365-1, 60*60),
  lev.morgan=3+runif(24*365),
  lev.lock2=3+runif(24*365),
  flow=1000+rnorm(24*365, 200),
  direction=runif(24*365, 0, 360),
  velocity=runif(24*365, 0, 10))
(df2- aggregate(df[ , -1], list(date=as.Date(df$Time)), FUN=mean, na.rm=TRUE))



Hope this helps

Allan


On 15/07/10 05:52, Meissner, Tony (DFW) wrote:
 I have a data frame (morgan) of hourly river flow, river levels and wind 
 direction and speed thus:
   Time   hour lev.morgan lev.lock2 lev.lock1 flow   direction 
  velocity
 1  2009-07-06 15:00:00   15  3.266 3.274 3.240 1710.6   180.282   
  4.352
 2  2009-07-06 16:00:00   16  3.268 3.272 3.240 1441.8   192.338   
  5.496
 3  2009-07-06 17:00:00   17  3.268 3.271 3.240 1300.1   202.294   
  2.695
 4  2009-07-06 18:00:00   18  3.267 3.274 3.241 1099.1   237.161   
  2.035
 5  2009-07-06 19:00:00   19  3.265 3.277 3.243  986.6   237.576   
  0.896
 6  2009-07-06 20:00:00   20  3.266 3.281 3.242 1237.6   205.686   
  1.257
 7  2009-07-06 21:00:00   21  3.267 3.280 3.242 1513.326.080   
  0.664
 8  2009-07-06 22:00:00   22  3.267 3.281 3.242 1819.5   264.280   
  0.646
 9  2009-07-06 23:00:00   23  3.267 3.281 3.242 1954.4   337.137   
  0.952
 10 2009-07-07 00:00:000  3.267 3.281 3.242 1518.9   260.006   
  0.562
 11 2009-07-07 01:00:001  3.267 3.281 3.242 1082.6   252.172   
  0.673
 12 2009-07-07 02:00:002  3.267 3.280 3.243 1215.9   190.007   
  1.286
 13 2009-07-07 03:00:003  3.267 3.279 3.244 1093.5   260.415   
  1.206
 : :   :   :   :  : :: 
 :
 : :   :   :   :  : :: 
 :

 Time is of class POSIXct
 I wish to take daily means of the flow, levels, and wind parameters and put 
 them into a new dataframe.  I envisage doing this with the following example 
 code:

 morgan$fTime- factor(substr(as.character(morgan$Time),1,10))
 dflow- tapply(morgan[,flow], morgan$fTime, mean)
 day- tapply(morgan[,Time], morgan$fTime, mean)
:
:

 daily- as.data.frame(cbind(day,dflow, dlev.morg,dlev.lock2, ...))
 daily$day- with(daily, as.POSIXct(1970-01-01, %Y-%m-%d, 
 tz=Australia/Adelaide) + day)
 rownames(daily)- NULL

 Is there a more efficient way of doing this?  I am running R-2.11.0 under 
 Windows XP

 Tschüß
 Tony Meissner
 Principal Scientist (Monitoring)
 Resources Monitoring Group
 Science, Monitoring and Information Division
 Department for Water
 Imagine ©
 *(ph) (08) 8595 2209
 *(mob) 0401 124 971
 *(fax) (08) 8595 2232
 * 28 Vaughan Terrace, Berri SA 5343
  PO Box 240, Berri SA 5343
  DX 51103
 ***The information in this e-mail may be confidential and/or legally 
 privileged.  Use or disclosure of the information by anyone other than the 
 intended recipient is prohibited and may be unlawful.  If you have received 
 this e-mail in error, please advise by return e-mail or by telephoning +61 8 
 8595 2209




   [[alternative HTML version deleted]]




 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply is slower than for loop?

2010-07-12 Thread Allan Engelhardt
On 12/07/10 08:16, Liviu Andronic wrote:
 On Fri, Jul 9, 2010 at 9:11 PM, Gene Leynesgleyne...@gmail.com  wrote:

 [...]
 Check Rnews for an article discussing proper usage of apply and for.
 Liviu


I am guessing you are referencing

@article{Rnews:Ligges+Fox:2008  
http://cran.r-project.org/doc/Rnews/bib/Rnews.html#Rnews:Ligges+Fox:2008,
   author = {Uwe Ligges and John Fox},
   title = {{{R} {H}elp {D}esk}: {H}ow Can {I} Avoid This Loop or Make It 
Faster?},
   journal = {R News},
   year = 2008,
   volume = 8,
   number = 1,
   pages = {46--50},
   month = {May},
   url = {http://CRAN.R-project.org/doc/Rnews/},
   pdf = {http://CRAN.R-project.org/doc/Rnews/Rnews_2008-1.pdf}
}


Allan


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] apply is slower than for loop?

2010-07-10 Thread Allan Engelhardt



On 09/07/10 21:19, Duncan Murdoch wrote:

On 09/07/2010 4:11 PM, Gene Leynes wrote:

I thought the apply functions are faster than for loops, but my most
recent test shows that apply actually takes a significantly longer 
than a

for loop.  Am I missing something?


Probably not.  apply() needs to figure out the shape of the results it 
gets from each row in order to put them into the final result matrix.  
You know that in advance, and set up the result to hold them, so your 
calculation would be more efficient.


Plus, in the way it is set up, apply has to make a much larger temporary 
matrix.  If you do


ds1- 1+ds
library(rbenchmark)
benchmark(apply=apply(1+ds, 1, cumprod),
  forloop=for (i in 1:nrow(ds)){ y2[i,]-cumprod(1+ds[i,]) },
  apply2=apply(ds1, 1, cumprod),
  replications=1, columns=c(test,elapsed,relative,user.self,sys.self), 
order=elapsed)
#  test elapsed relative user.self sys.self
# 2 forloop   1.863 1.00 1.8610.000
# 3  apply2   2.175 1.167472 1.9340.239
# 1   apply   2.443 1.311326 2.1080.334



you can sense some of the impact of that.

But if it is speed you are after, sapply may be even faster (you'll need 
to t() the result again):


benchmark(forloop=for (i in 1:nrow(ds)){ y2[i,]-cumprod(1+ds[i,]) },
  sapply={dsone-1+ds;sapply(1:NROW(dsone), function(i) 
cumprod(dsone[i,]))},
  replications=1, columns=c(test,elapsed,relative,user.self,sys.self), 
order=elapsed)
#  test elapsed relative user.self sys.self
# 2  sapply   1.539 1.00 1.3000.239
# 1 forloop   1.878 1.220273 1.8780.000
zz- sapply(1:NROW(ds1), function(i) cumprod(dsone[i,]))
identical(t(zz), y2)
# [1] TRUE


Hope this helps a little.

Allan



The *apply functions are designed to be convenient and clear to read, 
not necessarily fast.


Duncan Murdoch

It doesn't matter much if I do column wise calculations rather than 
row wise


## Example of how apply is SLOWER than for loop:

#rm(list=ls())

## DEFINE VARIABLES
mu=0.05 ; sigma=0.20 ; dt=.25 ; T=50 ; sims=1e5
timesteps = T/dt

## MAKE PHI AND DS
phi = matrix(rnorm(timesteps*sims), nrow=sims, ncol=timesteps)
ds = mu*dt + sigma * sqrt(dt) * phi

## USE APPLY TO CALCULATE ROWWISE CUMULATIVE PRODUCT
system.time(y1 - apply(1+ds, 1, cumprod))
## UNTRANSFORM Y1, BECAUSE ROW APPLY FLIPS THE MATRIX
y1=t(y1)

## USE FOR LOOP TO CALCULATE ROWWISE CUMULATIVE PRODUCT
y2=matrix(NA,nrow(ds),ncol(ds))
system.time(
for (i in 1:nrow(ds)){
y2[i,]-cumprod(1+ds[i,])
}
)

## COMPARE RESULTS TO MAKE SURE THEY DID THE SAME THING
str(y1)
str(y2)
all(y1==y2)

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Current script name from R

2010-07-09 Thread Allan Engelhardt
I'm assuming you are using Rscript (please provide self-contained 
examples when posting) in which case you could look for the element in 
(base|R.utils)::commandArgs() that begin with the string --file= - the 
rest is the file name.  See the asValues= parameter in 
help(commandArgs, package=R.utils) for a nice way to get the parameter.


For an invocation of the form R  foo.R you'd need to inspect your 
system's process table (so don't do that).


Hope this helps.

Allan

On 09/07/10 10:48, Ralf B wrote:

Is there a way for a script to find out about its own name ?

Ralf

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Current script name from R

2010-07-09 Thread Allan Engelhardt

On 09/07/10 12:18, Ralf B wrote:

I am using RGUI, the  command line or the StatET Eclipse environment.
Should this not all be the same?
   


No, there is no particular reason why they should.

Allan


Ralf

On Fri, Jul 9, 2010 at 7:11 AM, Allan Engelhardtall...@cybaea.com  wrote:
   

I'm assuming you are using Rscript (please provide self-contained examples
when posting) in which case you could look for the element in
(base|R.utils)::commandArgs() that begin with the string --file= - the
rest is the file name.  See the asValues= parameter in help(commandArgs,
package=R.utils) for a nice way to get the parameter.

For an invocation of the form R  foo.R you'd need to inspect your system's
process table (so don't do that).

Hope this helps.

Allan

On 09/07/10 10:48, Ralf B wrote:
 

Is there a way for a script to find out about its own name ?

Ralf

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

   




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how can achive step by step execution of the script

2010-07-09 Thread Allan Engelhardt
Sounds like you want devAskNewPage(TRUE) or the related 
options(device.ask.default).  See help(devAskNewPage, 
package=grDevices).


Hope this helps.

Allan

On 09/07/10 13:54, vijaysheegi wrote:

Hi R Experts,
I have certain code ,i want to achive interactive execution .
For ex:
1. as part of input ,it should ask file name  or table name as input.
2.in script so many graphs i need to draw,it should wait till certain key is
pressed .
3:i am using windows R,rscriptscriptname  is not working.



Please some one help me.

Thanks Experts in advance





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why Rscript behaves strangely with file name starting with 'size'?

2010-07-08 Thread Allan Engelhardt

It looks like a bug in R, not Rscript: what Rscript does is effectively

R --file=size.R


and you can easily test that

echo 'print(Hello world)'  size.R
R --file=size.R


give you the error while

cp size.R asize.R
R --file=asize.R


works as expected.

None of which helps you directly, but perhaps one of the developeRs can 
help fix what seems to be broken options parsing in the R code.


Allan

On 08/07/10 04:43, thmsfuller...@gmail.com wrote:

Hello All,

It seems weird to me that Rscript does the following thing and enters
the R prompt mode. If I change the file name to something that doesn't
start with 'size', then Rscript runs normally. Does anybody know what
is going on?

$ Rscript size.R
WARNING: '--file=size.R' value is invalid: ignored

   
 


$ Rscript --version
R scripting front-end version 51072




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why Rscript behaves strangely with file name starting with 'size'?

2010-07-08 Thread Allan Engelhardt
Having a quick look at CommandLineArgs.c (yuck! :-P) and some 
experimentation shows that a workaround could be

Rscript ./size.R


or the equivalent

R --file=./size.R


Hope this helps a little.

Allan

For developers: I'm /guessing/ the problem is in CommandLineArgs.c 
around the line

else if(strncmp(*av+7, size, 4) == 0) {


which looks like a nasty hack that just happens to match --file=size.R 
as well as --min-vsize=N et al that it tries to detect.  But seriously: 
there are libraries for options parsing!



On 08/07/10 07:49, Allan Engelhardt wrote:
 It looks like a bug in R, not Rscript: what Rscript does is effectively

 R --file=size.R


 and you can easily test that

 echo 'print(Hello world)'  size.R
 R --file=size.R


 give you the error while

 cp size.R asize.R
 R --file=asize.R


 works as expected.

 None of which helps you directly, but perhaps one of the developeRs 
 can help fix what seems to be broken options parsing in the R code.

 Allan

 On 08/07/10 04:43, thmsfuller...@gmail.com wrote:
 Hello All,

 It seems weird to me that Rscript does the following thing and enters
 the R prompt mode. If I change the file name to something that doesn't
 start with 'size', then Rscript runs normally. Does anybody know what
 is going on?

 $ Rscript size.R
 WARNING: '--file=size.R' value is invalid: ignored


 $ Rscript --version
 R scripting front-end version 51072



 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotmath vector problem; full program enclosed

2010-07-07 Thread Allan Engelhardt


On 07/07/10 06:03, Paul Johnson wrote:
 [...]
 Hi, Duncan and David

 Thanks for looking.  I suspect from the comment you did not run the
 code.  The expression examples I give do work fine already.  But I
 have to explicitly put in values like 1.96 to make them work.  I'm
 trying to avid that with substitute, which does work for b2, b3, b4,
 b5, all but b1.  Why just one?

 I'm uploading a picture of it so you can see for yourself:

 http://pj.freefaculty.org/R/plotmathwrong.pdf

 please look in the middle axis.

 Why does only b1 not work, but the rest do?



Because you only had one (as.)expression in your original code, 
perhaps?  This version works for me:

### Filename: plotMathProblem.R
### Paul Johnson July 5, 2010
### email mepaulj...@ku.edu
### 2010-07-06 AE : Changes.

   sigma- 10.0
   mu- 4.0

   myx- seq( mu - 3.5*sigma,  mu+ 3.5*sigma, length.out=500)

   myDensity- dnorm(myx,mean=mu,sd=sigma)

   ### xpd needed to allow writing outside strict box of graph
   ### Need big bottom margin to add several x axes
   par(xpd=TRUE, ps=10, mar=c(18,2,2,2))

   plot(myx, myDensity, type=l, xlab=, ylab=Probability Density ,
main=myTitle1, axes=FALSE)
   axis(2, pos= mu - 3.6*sigma)
   axis(1, pos=0)

   lines(c(myx[1],myx[length(myx)]),c(0,0)) ### closes off axes


   addInteriorLine- function(x, m, sd){
 for (i in 1:(length(x))){
   lines( c(x[i],x[i]), c(0, dnorm(x[i],m=m,sd=sd)), lty= 14, lwd=.2)
 }
   }


   dividers- c(qnorm(0.025), -1, 0, 1, qnorm(0.975))
   addInteriorLine(mu+sigma*dividers, mu,sigma)

   # bquote creates an expression that text plotters can use
   t1-  bquote( mu== .(mu))
   mtext(bquote( mu == .(mu)), 1, at=mu, line=-1)


   addInteriorLabel- function(pos1, pos2, m, s){
 area- abs(100*( pnorm(m+pos1*s,m,s)-pnorm(m+pos2*s, m,s)))
 mid- m+0.5*(pos1+pos2)*s
 text(mid, 0.5*dnorm(mid,m,s),label=paste(round(area,2),%))
   }


   addInteriorLabel(dividers[1],dividers[2],  mu, sigma)
   addInteriorLabel(dividers[2],dividers[3],  mu, sigma)
   addInteriorLabel(dividers[3],dividers[4],  mu, sigma)
   addInteriorLabel(dividers[4],dividers[5],  mu, sigma)




### Following is problem point: axis will
### end up with correct labels, except for first point,
### where we end up with b1 instead of mu - 1.96*sigma.
b1- substitute( mu - d*sigma, list(d=*-round(dividers[1],2))*  )
b2- substitute( mu - sigma )
b3- substitute( mu )
b4- substitute( mu + sigma )
b5- substitute( mu + d*sigma, list(d=round(dividers[5],2)) )
##   plot(-20:50,-20:50,type=n,axes=F)
axis(1, line=4,at=mu+dividers*sigma,
labels=*as.expression(c(b1,b2,b3,b4,b5))*, padj=-1)




### This gets right result but have to hard code the dividers
   b1- expression( mu - 1.96*sigma )
   b2- expression( mu - sigma )
   b3- expression( mu )
   b4- expression( mu + sigma )
   b5- expression( mu + 1.96*sigma )
   axis(1, line=8,at=mu+dividers*sigma, labels=c(b1,b2,b3,b4,b5), padj=-1)




Allan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] grayscale wireframe??

2010-07-07 Thread Allan Engelhardt
A standalone example is always appreciated (cf. the posting guide) but 
try and see if help(gray.colors, package=grDevices) is the sort of 
thing you are looking for.


Hope this helps

Allan

On 06/07/10 23:30, Marlin Keith Cox wrote:

I need grayscale formatting for a wireframe.
The only col.regions that I can find are color palettes are all colored:

rainbow(n, s = 1, v = 1, start = 0, end = max(1,n - 1)/n,
 gamma = 1, alpha = 1)
heat.colors(n, alpha = 1)
terrain.colors(n, alpha = 1)
topo.colors(n, alpha = 1)
cm.colors(n, alpha = 1)

The code follows:

X11()
library(lattice)
par(family=serif, cex=1.2)
wireframe(z ~ y*x, mat.df,
   drape = TRUE,col.regions=rainbow(100),
   zlab = list(Water mass error (%),rot=90), zlim=c(-50,180),
   xlab = list(Resistance error (%),rot=-9),
   ylab = list(Length error (%),rot=38),
   scales = list(arrows = FALSE),
screen = list(z = -35, x = -77, y = 10))





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotmath vector problem; full program enclosed

2010-07-07 Thread Allan Engelhardt

Ooops, I didn't convert this one to text right for the list.

b1- substitute( mu - d*sigma, list(d=*-round(dividers[1],2))*  )


should be

b1- substitute( mu - d*sigma, list(d=-round(dividers[1],2))  )


and similarly for

labels=*as.expression(c(b1,b2,b3,b4,b5))*, padj=-1)

read

labels=as.expression(c(b1,b2,b3,b4,b5)), padj=-1)

Apologies


Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Visualization of coefficients

2010-07-07 Thread Allan Engelhardt
Thanks Tal.  Nice summary on the web page.  I think the last example 
would be even better if it was a stand-alone piece of code (i.e.: with 
data) that we could run.  For example


library(arm)
data(Mroz, package = car)
M1-  glm(lfp ~ ., data = Mroz, family = binomial)
M2- bayesglm(lfp ~ ., data = Mroz, family = binomial)
M3-  glm(lfp ~ ., data = Mroz, family = binomial(probit))
coefplot(M2, xlim=c(-2, 6),intercept=TRUE)
coefplot(M1, add=TRUE, col.pts=red,  intercept=TRUE)
coefplot(M3, add=TRUE, col.pts=blue, intercept=TRUE, offset=0.2)


Allan

On 07/07/10 09:16, Tal Galili wrote:

Hello David,
Thanks to your posting I started looking at the function in the arm 
package.  It appears this function is quite mature, and offers (for 
example) the ability to easily overlap coefficients from several models.


I updated the post I published on the subject, so at the end of it I 
give an example of comparing the coef of several models:

http://www.r-statistics.com/2010/07/visualization-of-regression-coefficients-in-r/

Thanks again for the pointer.

Best,
Tal
[...]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Wavelet

2010-07-07 Thread Allan Engelhardt

Try

RSiteSearch(MORLET)

before you post.

Allan

On 07/07/10 09:38, nuncio m wrote:

Hi useRs,
Is it possible to get MORLET wavelet in R
Thanks
nuncio




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotmath vector problem; full program enclosed

2010-07-07 Thread Allan Engelhardt

On 07/07/10 18:52, Paul Johnson wrote:
 [...]
 1:
 axis(1, line=6, at=mu+dividers*sigma,
 labels=as.expression(c(b1,b2,b3,b4,b5), padj=-1))


 2:
 axis(1, line=9, at=mu+dividers*sigma,
 labels=c(as.expression(b1),b2,b3,b4,b5), padj=-1)

 This second one shouldn't work, I think.
 It has as.expression on only the first element, and yet they all come
 out right. Is there a spill over effect?


You can call it spill-over if you want: c() makes a *vector*, not a 
list, so all elements have to be of the same type and they are all 
coerced to the more general type (the first).  It is similar to

  c(1, 2, 3)
[1] 1 2 3

which is different from c(1, 2, 3).

Hope this helps

Allan

PS: BTW, I think the () are wrong in example 1? :-)


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Selection with changing number of columns

2010-07-06 Thread Allan Engelhardt
I'm not sure your question completely makes sense, but perhaps this will 
help:


set.seed(1)
d- data.frame(matrix(floor(runif(100, max=10)), 10))  # Example data
d[apply(d == 9, 1, any), ]  # Select rows with 9 in any column

## Or more generally:
d[ , c(1, 2, 3)] == c(2, 2, 9)   #  Or maybe d == 0:9 to select on 
all columns
apply(d[ , c(1, 2, 3)] == c(2, 2, 9), 1, any)  # any() being the general `|` 
function here
d[apply(d[ ,c(1, 2, 3)] == c(2, 2, 9), 1, any), ]  # Finally: the rows we were 
looking for


A bit over-engineered, perhaps, but gets you to use some generally 
useful functions.


Hope this helps.

Allan



On 06/07/10 09:33, Kunzler, Andreas wrote:

Dear list,

I'm looking for a way to select rows of a data.frame with changing number of 
columns (constraint) involved.

Assume a data (d) structure like


Var.1 Var.2 Var.3
9   2   1
2   9   5
1   2   1

I know the number of involved columns.

Is there a way to generate the following selection automatically (maybe for 
loop), so that it makes no difference if there are two or ten columns involved.

Selection:
d[d$Var.1==9 | d$Var.1==9 | d$Var.1==9  ,]


Does anybody know a way?

Thanks

Mit freundlichen Grüßen

Andreas Kunzler

Bundeszahnärztekammer (BZÄK)
Chausseestraße 13
10115 Berlin

Tel.: 030 40005-113
Fax:  030 40005-119

E-Mail: a.kunz...@bzaek.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pseudo F statistics with index.G1

2010-07-06 Thread Allan Engelhardt

Always use set.seed in your examples.

Running your code after set.seed(1) works fine but gives the error after 
set.seed(2).  The problem in the latter case being that there is only 
one value 7 in C and you need two or more for the index.G1 code to make 
sense.


Hope this helps a little

Allan

Example:


 set.seed(1)
 mat- diag(1,27,27)
 f- runif(sum(26:1),0,1)
 mat[lower.tri(mat)]- f
 mat- t(mat)
 mat[lower.tri(mat)]- f

 # Cluster with Agnes
 A- agnes(mat,diss=T,method=average)
 C- cutree(A,k=7)   # Value of k = the number of clusters
 F- index.G1(mat,C)
 table(C)

C
1 2 3 4 5 6 7
6 3 5 2 3 4 4

 set.seed(2)
 mat- diag(1,27,27)
 f- runif(sum(26:1),0,1)
 mat[lower.tri(mat)]- f
 mat- t(mat)
 mat[lower.tri(mat)]- f

 # Cluster with Agnes
 A- agnes(mat,diss=T,method=average)
 C- cutree(A,k=7)   # Value of k = the number of clusters
 F- index.G1(mat,C)

Error in apply(x[cl == i, ], 2, mean) :
  dim(X) must have a positive length

 table(C)

C
1 2 3 4 5 6 7
8 4 5 3 4 2 1



On 06/07/10 09:32, Kennedy wrote:

Hello,

I have done some clustering with Agnes and want to calculate the pseudo F
statistics using index.G1. It works for a low number of clusters but when i
increase the number of clusters i eventually get the following message:

Error in apply(x[cl == i, ], 2, mean) :
   dim(X) must have a positive length

The following code produces an example comparable to the data i am
clustering:

library(cluster)
library(ade4)
library(R2HTML)
library(e1071)
library(class)
library(rgl)
library(MASS)
library(clusterSim)

# Create a symmetric matrix with ones on the diagonal
mat- diag(1,27,27)
f- runif(sum(26:1),0,1)
mat[lower.tri(mat)]- f
mat- t(mat)
mat[lower.tri(mat)]- f

# Cluster with Agnes
A- agnes(mat,diss=T,method=average)
C- cutree(A,k=7)   # Value of k = the number of clusters
F- index.G1(mat,C)

The code above works for k=2:6 but then the error message appears.


Sincerely

Henrik



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pseudo F statistics with index.G1

2010-07-06 Thread Allan Engelhardt
I guess that in R you have to be explicit about what you want to do.  
You can't just drop them, so you'll have to assign them some (other) 
value.  Try which(table(C)==1) to give you the values you need to change 
and then decide what to change them to.  The SAS documentation may tell 
you what it does; otherwise,  I'd suggest you look at the source, but I 
guess that is not an option for you.  Welcome to R.


Allan

On 06/07/10 15:48, Henrik Aldberg wrote:

Thank you Allan,

As i understand it the index.G1 function does not work if one of the 
clusters in the partition only contains one object. Is there a way to 
get around this in R? In SAS the PSF function seems to ignore the 
presence of singleton clusters.



Sincerely

Henrik





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plotmath vector problem; full program enclosed

2010-07-06 Thread Allan Engelhardt



On 06/07/10 18:51, David Winsemius wrote:


Easily addressed in this case with ~ instead of -. The value of 
d provides the minus:


b1 - substitute( mu ~ d*sigma, list(d=round(dividers[1],2)) )


Neat trick!  But it gives a slightly different minus sign in the 
display, so perhaps simply


b1- substitute( mu - d*sigma, list(d=-round(dividers[1],2)) )


instead?

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] conditional dataframe search and find

2010-07-03 Thread Allan Engelhardt

 ct.df[ct.df$conc  25,time]
[1] 10 11 12 13 14 15
 ct.df[ct.df$conc  25,time][1]
 df[df$conc  25,time][1]
[1] 10

See also help(order) if conc is not ordered.

On 02/07/10 22:50, oscar linares wrote:

   time   conc
1 0 164.495456
2 1 133.671185
3 2 108.622975
4 3  88.268468
5 4  71.728126
6 5  58.287225
7 6  47.364971
8 7  38.489403
9 8  31.276998
109  25.416103
11   10  20.653462
12   11  16.783276
13   12  13.638313
14   13  11.082674
15   14   9.005927
16   15   7.318336



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problems loading the twitteR package

2010-07-02 Thread Allan Engelhardt
If you are on a Unix machine try your  package manager first to see if 
it has the packages you need.  Otherwise, try


install.package(twitteR, dependencies = TRUE)

(or RCurl) and see if that helps.

Allan

On 01/07/10 22:51, lyolya wrote:

Dear all,

I cannot load the twitteR package. When I have it installed and try to load
it, I get the following response from the system:

   

library(twitteR)
 

Loading required package: RCurl
Error: package 'RCurl' could not be loaded
Además: Mensajes de aviso perdidos
In library(pkg, character.only = TRUE, logical.return = TRUE, lib.loc =
lib.loc) :
there is no package called 'RCurl'

and when I further manually install and load the RCurl package, I get this
response:

ERROR: dependencies ‘bitops’ are not available for package ‘RCurl’
* removing ‘/Users/lyolya/RCurl’

Would be happy with any piece of advice,

Thank you.

Lyolya.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Visualization of coefficients

2010-07-02 Thread Allan Engelhardt
http://addictedtor.free.fr/graphiques/RGraphGallery.php?graph=114 should 
get you started.


On 02/07/10 07:10, Wincent wrote:

Dear all,

I try to show a subset of coefficients in my presentation. It seems
that a standard table is not a good way to go. I found figure 9
(page 9) in this file (
http://www.destatis.de/jetspeed/portal/cms/Sites/destatis/Internet/DE/Content/Wissenschaftsforum/Kolloquien/VisualisierungModellierung__Beitrag,property=file.pdf
) looks pretty good. I wonder if there is any function for such plot?
Or any suggestion on how to present statistical models in a
presentation?

Thank you.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with aggregating data across time points

2010-07-02 Thread Allan Engelhardt

On 02/07/10 16:21, Chris Beeley wrote:

Hello-

I have a dataset which basically looks like this:

Location   Sex   Date  Time   VerbalSelf harm
Violence_objects   Violence
   A 1  1-4-2007   1800  3 0
 1   3
   A 1  1-4-2007   1230  21
2   4
   D 2  2-4-2007   1100  04
0   0
...

I've put a dput of the first section of the data at the end of this
email. [...]

What I want to do is:

A) sum each of the dependent variables for each of the dates (so e.g.
in the example above for 1-4-2007 it would be 3+2=5, 0+1=1, 1+2=3, and
3+4=7 for each of the variables)
   


If 'data' is the data at the end of your email, then


 aggregate(cbind(verbal,self.harm,violence_objects,violence) ~ Date, data = 
data, FUN = sum)

  Date verbal self.harm violence_objects violence
1 01/04/07 251539
2 02/04/07 24 68   13
3 03/04/07 17130   10


is one approach.  Read help(aggregate) and don't forget the na.action= 
argument.




B) do this sum, but only in each location this time (location is the
first variable)- so the sum for 1-4-2007 in location A, sum for
1-4-2007 in location B, and so on and so on. Because this is divided
   


The basic approach could be


 aggregate(cbind(verbal,self.harm,violence_objects,violence) ~ Date + Location, 
data = data, FUN = sum)

   Date Location verbal self.harm violence_objects violence
1  01/04/07A  7 103
2  02/04/07A  8 201
3  03/04/07A  0 002
4  01/04/07B  3 201
5  02/04/07B  4 200
6  03/04/07B  4 003
7  01/04/07C  4 232
8  02/04/07C  0 042
9  03/04/07C  1 105
10 01/04/07D  7 603
11 02/04/07D  0 009
12 03/04/07D  41100
13 01/04/07E  4 300
14 02/04/07E  4 040
15 03/04/07E  8 100
16 01/04/07F  0 100
17 02/04/07F  8 201




across locations, some dates will have no data going into them and
will return 0 sums. Crucially I still want these dates to appear- so
e.g. 21-5-2008 would appear as 0 0 0 0, then 22-5-2008 might have 1 2
0 0, then 23-5-2008 0 0 0 0 again, and etc.
   


Why?

But variations on


 data2- data[!(as.numeric(data$Date)==3  data$Location==B),] # For example
 z- with(data2, tapply(verbal, list(Date,Location), FUN=sum))
 z[is.na(z)]- 0
 print(z)

   A B C D E F
 0 0 0 0 0 0 0
01/04/07 0 7 3 4 7 4 0
02/04/07 0 8 0 0 0 4 8
03/04/07 0 0 4 1 4 8 0



will perhaps work for you.

Hope this helps

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Plot map Brazil and documentation

2010-07-01 Thread Allan Engelhardt



On 01/07/10 05:43, Pablo Cerdeira wrote:

[...]
*My doubt is: what it means with see the package index?

It is cryptic statistician speak for

help(package = maps)


  Should I find a
Brazilian map on the index?*
   


No.


*
*If not, can someone help me to find some brazilian map (with states)?*
   


The basic map is

map('world', 'brazil')

For the states you'll have to find a map to download or buy.  Try 
http://www.inde.gov.br/ or Google.



*Also, can someone suggest me some good site with documentation about
plotting graphics in R? I'd like to read more about this amazing tool.*
   


The site http://addictedtor.free.fr/graphiques/ is often mentioned for 
graphis.  For more, try the links from www.r-project.org


http://cran.r-project.org/manuals.html
http://www.r-project.org/doc/bib/R-books.html
http://www.r-project.org/other-docs.html

Hope this helps

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Sweave function

2010-07-01 Thread Allan Engelhardt

In general, use cat (see help(cat)) for printing your output, e.g.

echo=FALSE=
cat(levels(mydata$Firms)[mydata$Firms], \n)
cat(mydata$Year, \n)
# etc.
@


For your specific case you may also be interested in help(xtable, 
package=xtable) along with the ...,results=tex= Sweave construct.


For an R version of the output you want, a starting point may be

for (f in levels(mydata$Firms)) {
 cat(f, \n)
 print(t(mydata[mydata$Firms == f, 2:5]))
}


which gives

electrolux
123
Year 1995 1996 1997
IIA   100  340   35
IIB45   67   99
IIC65   97   31
fiat
456
Year 1995 1996 1997
IIA76  567  453
IIB34   35   89
IIC45   66   37


Hope this helps a little.

Allan


On 01/07/10 09:08, n.via...@libero.it wrote:

Dear list,
I have a question about the interaction between R code and Latex language trough the 
Sweave function in the package utils.
What I'm  trying to do is to write a report.  Contrary to the examples shown in the 
Sweave Manual in which table already constructed by R are exported on Latex 
files, what I would like to do is to build a table in which I combine text and specific 
columns of my data frame. I will give you the following example to be much more clear.
Suppose I have a data frame like this:

FirmsYear IIA   IIB 
 IIC
electrolux1995100   45  
  65
electrolux1996340  67   
 97
electrolux199735 99 
  31
fiat1995 76
34  45
fiat1996567   
35  66
fiat1997453  89 
 37


Where IIA is the turnover of the firm, IIB is the production and IIC is the 
cost of labour.

I would like to get a table like this in the latex format this:
Firms 1
electrolux
---

   variables 1995 19971997

--
turnover 100  34035
production  4567  99
cost of labour  6597 31



I use the following code:
\documentclass[a4paper]{article}
\title{example}
\begin{document}

\maketitle
echo=F=
mydata$firms
@
variables
echo=F=
mydata$year
@


and so on. I have two problem, first I'm not able to put on the same line text 
and output of R. So  on my Latex document I get for example
variable
1995 1996 1997


and I don't want this. The secondo problem is that at the beginnig of the R 
output I get the index, namely
variable
[1] 1995 1996 1997
and I don't want to see it.
Anyone Knows how to do it or if there is another package in R that give me the 
possibility to create a Report by constructing table without any problems???




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Export data frame of high dimension to txt

2010-07-01 Thread Allan Engelhardt


On 01/07/10 08:50, Loukia Spin. wrote:

I managed with success to export a data frame of 367 columns and 37 rows to a txt file, 
but unfortunately I couldn't manage the same with the transposed data frame. 
Specifically, it seems like the notebook cannot read correctly 367 columns, 
so it reads the first 145 columns and it folds down the rest of them and consequently it 
folds down the respective data.

f=format.data.frame(rev,trim=F,digits=NULL,nsmall=3, ## rev is the data frame 
of 37 rows and 367 columns ##
justify=centre,zero.print=F,drop0trailing=T);f
   
I am not sure why you want to call format.data.frame first? A simple 
write.table(rev, ...) should work fine.  As it stands, write.table will 
convert f back to a data.frame before writing

write.table(f,file=C:/Program Files/R/R-2.11.0/genes as rows.txt,
row.names=T,sep=\t,quote=F,dec=.,col.names=NA,append=T)
   
Careful with that append=TRUE.  Try it with FALSE (the default) to make 
sure the file really is empty.  It might simply be text at the beginning 
that trips up the subsequent read.table (which I am guessing from your 
description is where the problem occurs).


Hope this helps a little.

Allan

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Stacked Restricted Boltzmann Machine

2010-07-01 Thread Allan Engelhardt
I don't know of a native implementation, but you could probably use MDP 
along with RPy or R/S Plus as suggested here: 
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2628591/


Allan.

On 01/07/10 09:01, Matthew OKane wrote:

Hi,

Are there any implementations of stacked RBMs either complete or planned in
R?

Thanks,
Matthew

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


  1   2   >