Re: [R] Optimisation and NaN Errors using clm() and clmm()
On 15 April 2013 13:18, Thomas thomasfox...@aol.com wrote:

Dear List,

I am using both the clm() and clmm() functions from the R package 'ordinal'. I am fitting an ordinal dependent variable with 5 categories to 9 continuous predictors, all of which have been normalised (mean subtracted, then divided by the standard deviation), using a probit link function. From this global model I am generating a confidence set of 200 models using clm() and the 'glmulti' R package. This produces these errors:

model.2.10 <- glmulti(as.factor(dependent) ~ predictor_1*predictor_2*predictor_3*predictor_4*predictor_5*predictor_6*predictor_7*predictor_8*predictor_9,
    data = database, fitfunc = clm, link = "probit", method = "g",
    crit = "aicc", confsetsize = 200, marginality = TRUE)
...
After 670 generations:
Best model: as.factor(dependent)~1+predictor_1+predictor_2+predictor_3+predictor_4+predictor_5+predictor_6+predictor_8+predictor_9+predictor_4:predictor_3+predictor_6:predictor_2+predictor_8:predictor_5+predictor_9:predictor_1+predictor_9:predictor_4+predictor_9:predictor_5+predictor_9:predictor_6
Crit= 183.716706496392
Mean crit= 202.022138576506
Improvements in best and average IC have been below the specified goals. Algorithm is declared to have converged. Completed.
There were 24 warnings (use warnings() to see them)

warnings()
Warning messages:
1: optimization failed: step factor reduced below minimum
2: optimization failed: step factor reduced below minimum
3: optimization failed: step factor reduced below minimum
etc.

I am then re-fitting each of the 200 models with the clmm() function, with 2 random factors (family nested within order).
I get this error in a few of the re-fitted models:

model.2.glmm.2 <- clmm(as.factor(dependent) ~ 1 + predictor_1 + predictor_2 + predictor_3 + predictor_6 + predictor_7 + predictor_8 + predictor_9 + predictor_6:predictor_2 + predictor_7:predictor_2 + predictor_7:predictor_3 + predictor_8:predictor_2 + predictor_9:predictor_1 + predictor_9:predictor_2 + predictor_9:predictor_3 + predictor_9:predictor_6 + predictor_9:predictor_7 + predictor_9:predictor_8 + (1|order/family), link = "probit", data = database)

summary(model.2.glmm.2)
Cumulative Link Mixed Model fitted with the Laplace approximation

formula: as.factor(dependent) ~ 1 + predictor_1 + predictor_2 + predictor_3 + predictor_6 + predictor_7 + predictor_8 + predictor_9 + predictor_6:predictor_2 + predictor_7:predictor_2 + predictor_7:predictor_3 + predictor_8:predictor_2 + predictor_9:predictor_1 + predictor_9:predictor_2 + predictor_9:predictor_3 + predictor_9:predictor_6 + predictor_9:predictor_7 + predictor_9:predictor_8 + (1 | order/family)
data: database

 link   threshold nobs logLik AIC    niter    max.grad cond.H
 probit flexible  103  -65.56 173.13 58(3225) 8.13e-06 4.3e+03

Random effects:
             Var       Std.Dev
family:order 7.493e-11 8.656e-06
order        1.917e-12 1.385e-06
Number of groups: family:order 12, order 4

Coefficients:
                        Estimate Std. Error z value Pr(>|z|)
predictor_1              0.40802    0.78685   0.519   0.6041
predictor_2              0.02431    0.26570   0.092   0.9271
predictor_3             -0.84486    0.32056  -2.636   0.0084 **
predictor_6              0.65392    0.34348   1.904   0.0569 .
predictor_7              0.71730    0.29596   2.424   0.0154 *
predictor_8             -1.37692    0.75660  -1.820   0.0688 .
predictor_9              0.15642    0.28969   0.540   0.5892
predictor_2:predictor_6 -0.46880    0.18829  -2.490   0.0128 *
predictor_2:predictor_7  4.97365    0.82692   6.015 1.80e-09 ***
predictor_3:predictor_7 -1.13192    0.46639  -2.427   0.0152 *
predictor_2:predictor_8 -5.52913    0.88476  -6.249 4.12e-10 ***
predictor_1:predictor_9  4.28519         NA      NA       NA
predictor_2:predictor_9 -0.26558    0.10541  -2.520   0.0117 *
predictor_3:predictor_9 -1.49790         NA      NA       NA
predictor_6:predictor_9 -1.31538         NA      NA       NA
predictor_7:predictor_9 -4.41998         NA      NA       NA
predictor_8:predictor_9  3.99709         NA      NA       NA
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Threshold coefficients:
    Estimate Std. Error z value
0|1  -0.2236     0.3072  -0.728
1|2   1.4229     0.3634   3.915
(211 observations deleted due to missingness)

Warning message:
In sqrt(diag(vc)[1:npar]) : NaNs produced

This warning is due to a (near) singular variance-covariance matrix of the model parameters, which in turn is due to the model converging to a boundary solution: both random-effect variance parameters are essentially zero. If you exclude the random terms and refit the model with clm(), the variance-covariance matrix will probably be well defined and standard errors can be computed.

Another thing: you are fitting 17 regression parameters plus 2 random-effect terms (which in the end do not count) to only 103 observations. I would be worried about overfitting, or perhaps even non-fitting. I would also be concerned about the 211 incomplete observations, and I would be careful with automatic model selection/averaging etc. on incomplete data (though I don't know how, or if, glmulti actually deals with that).
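The advice above can be sketched in a few lines. This is not the poster's data: it uses the 'wine' data set shipped with the 'ordinal' package, and simply illustrates checking the random-effect variance and refitting with clm() when the clmm() fit sits on the zero boundary.

```r
## Hedged sketch (not the original data): fit a cumulative link mixed model,
## inspect the random-effect variance, and refit without random terms when
## the variance estimate is (near) zero.
library(ordinal)

fit_mixed <- clmm(rating ~ temp + contact + (1 | judge),
                  data = wine, link = "probit")
VarCorr(fit_mixed)  # random-effect variance; near zero => boundary solution

## Fixed-effects-only refit: the Hessian is typically well conditioned,
## so standard errors can be computed.
fit_fixed <- clm(rating ~ temp + contact, data = wine, link = "probit")
summary(fit_fixed)
```

With the random terms dropped, the coefficient table should show finite standard errors throughout, which is the diagnostic Rune describes.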
Re: [R] Sorting data.frame and again sorting within data.frame
Dear Sir, Thanks a lot for your valuable input and guidance. Regards, Katherine

--- On Mon, 15/4/13, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:

Yes, that would be because she converted to Date on the fly in her example, and so apparently did not need this reminder.
--- Jeff Newmiller (Sent from my phone. Please excuse my brevity.)

David Winsemius dwinsem...@comcast.net wrote:

On Apr 14, 2013, at 11:01 PM, Katherine Gobin wrote:

Dear R forum, I have a data.frame as defined below:

df = data.frame(names = c("C", "A", "A", "B", "C", "B", "A", "B", "C"),
                dates = c("4/15/2013", "4/13/2013", "4/15/2013", "4/13/2013",
                          "4/13/2013", "4/15/2013", "4/14/2013", "4/14/2013",
                          "4/14/2013"),
                values = c(10, 31, 31, 17, 11, 34, 102, 47, 29))

df
  names     dates values
1     C 4/15/2013     10
2     A 4/13/2013     31
3     A 4/15/2013     31
4     B 4/13/2013     17
5     C 4/13/2013     11
6     B 4/15/2013     34
7     A 4/14/2013    102
8     B 4/14/2013     47
9     C 4/14/2013     29

I need to sort df first on names in increasing order and then further on dates in decreasing order.

[David Winsemius replied:] So far no one has pointed out that these are not really Dates in the R sense and will not sort correctly if any of the proposed methods are applied to sequences that extend beyond 6 months, i.e., once dates from October onward appear. You would be advised to convert to real Date-classed variables.
?strptime
?as.Date

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
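To make David's point concrete, here is a base-R sketch of the conversion he recommends, using the data frame from the question: convert the strings with as.Date(), then a single order() call gives names ascending and dates descending.

```r
df <- data.frame(names = c("C","A","A","B","C","B","A","B","C"),
                 dates = c("4/15/2013","4/13/2013","4/15/2013","4/13/2013",
                           "4/13/2013","4/15/2013","4/14/2013","4/14/2013",
                           "4/14/2013"),
                 values = c(10, 31, 31, 17, 11, 34, 102, 47, 29))

## Real Date objects sort chronologically, also past October:
df$dates <- as.Date(df$dates, format = "%m/%d/%Y")

## names ascending, dates descending (negate the numeric date values):
sorted <- df[order(df$names, -as.numeric(df$dates)), ]
sorted
```

String dates like "10/1/2013" would sort before "4/13/2013" lexically, which is exactly the trap David warns about; Date-classed values avoid it.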
Re: [R] Overlay two stat_ecdf() plots
Hi,

Do you mean ecdf? If yes, just use the add option in plot():

plot(ecdf(rnorm(100, 1, 2)))
plot(ecdf(rnorm(100, 2, 2)), add = TRUE, col = 2)

If not, please specify where ecdf_stat or stat_ecdf comes from, since you indicate they are the same function.

Regards, Petr

-----Original Message-----
From: Robin Mjelle
Subject: [R] Overlay two stat_ecdf() plots

I want to plot two ecdf plots in the same graph. I have two input tables with one column each:

Targets <- read.table("/media/...", sep = "", header = TRUE)
NonTargets <- read.table("/media/...", sep = "", header = TRUE)

head(Targets)
        V1
1 3.160514
2 6.701948
3 4.093844
4 1.992014
5 1.604751
6 2.076802

head(NonTargets)
         V1
1  3.895934
2  1.990506
3 -1.746919
4 -3.451477
5  5.156554
6  1.195109

Targets.m <- melt(Targets)
head(Targets.m)
  variable    value
1       V1 3.160514
2       V1 6.701948
3       V1 4.093844
4       V1 1.992014
5       V1 1.604751
6       V1 2.076802

NonTargets.m <- melt(NonTargets)
head(NonTargets.m)
  variable     value
1       V1  3.895934
2       V1  1.990506
3       V1 -1.746919
4       V1 -3.451477
5       V1  5.156554
6       V1  1.195109

How do I proceed to plot them in one graph using stat_ecdf()?
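Since the question asked about ggplot2's stat_ecdf() specifically, here is a hedged sketch (assuming a ggplot2 version that provides stat_ecdf): stack both samples into one data frame with a grouping column and map colour to the group, rather than overlaying two separate plots.

```r
library(ggplot2)

## Simulated stand-ins for the Targets / NonTargets columns:
set.seed(1)
df <- rbind(data.frame(group = "Targets",    value = rnorm(100, 1, 2)),
            data.frame(group = "NonTargets", value = rnorm(100, 2, 2)))

## One stat_ecdf layer draws two ECDF curves, distinguished by colour:
p <- ggplot(df, aes(x = value, colour = group)) + stat_ecdf()
print(p)
```

With real data, `df` would be built by binding the two read.table() results with a `group` label, so no melt() step is needed.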
[R] Create function from string
Dear list,

I am trying to create a function from a string, and have so far solved it with eval(parse()). This works well also when using the newly created function as an argument to another function. The trouble starts when I want to use it with parLapply. Below is a much simplified example:

fstring = "x+2"
FUN = function(x) eval(parse(text = fstring))
FUN(3)

FUN2 = function(y, func) y + func(y)
FUN2(3, FUN)

# I can also pass FUN as an argument to FUN2 when using foreach and parallel:
library(parallel)
library(foreach)
cl = makeCluster(2, outfile = "")
ylist = list(1:3, 4:6)
result = foreach(i = 1:2) %dopar% { FUN2(ylist[[i]], FUN) }

# But when I change to parLapply (actually parLapplyLB), fstring is not found anymore:
parLapply(cl, as.list(1:4), FUN2, func = FUN)

I assume there is a problem with environments; the question is how to solve this. The cleanest would be to substitute fstring with its content in FUN, but I did not figure out how. substitute() or bquote() do not seem to do the trick, although I might not have tried them in the right way. Any suggestions how to solve this, either how to substitute correctly, or to completely avoid the eval(parse())?

Thanks, Jon

--
Jon Olav Skøien
Joint Research Centre - European Commission
Institute for Environment and Sustainability (IES)
Land Resource Management Unit
Via Fermi 2749, TP 440, I-21027 Ispra (VA), ITALY
jon.sko...@jrc.ec.europa.eu

Disclaimer: Views expressed in this email are those of the individual and do not necessarily represent official views of the European Commission.
Re: [R] Create function from string
Is this what you are looking for?

FUN = eval(bquote(function(x) .(parse(text = fstring)[[1]])))
FUN
function (x)
x + 2
FUN(3)
[1] 5

On Apr 16, 2013, at 09:50 , Jon Olav Skoien wrote:
> [quoted message trimmed]
--
Peter Dalgaard, Professor
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com
Re: [R] Create function from string
Thanks a lot, that seems to do exactly what I need!

Best wishes,
Jon

On 16-Apr-13 10:21, peter dalgaard wrote:
> Is this what you are looking for?
> FUN = eval(bquote(function(x) .(parse(text = fstring)[[1]])))
> [quoted message trimmed]
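For completeness, the parLapply case also works with the original eval(parse()) formulation if fstring is exported to the workers with clusterExport(); Peter's bquote() version avoids that by baking the expression into the function body. A sketch of both, using only the base 'parallel' package:

```r
library(parallel)

fstring <- "x+2"
FUN2 <- function(y, func) y + func(y)

## Variant 1 (Peter's answer): the parsed expression is fixed into FUN's
## body, so the function is self-contained and needs no fstring on workers.
FUN <- eval(bquote(function(x) .(parse(text = fstring)[[1]])))

## Variant 2: keep eval(parse()), but export fstring to each worker so the
## lookup inside FUNp succeeds there too.
FUNp <- function(x) eval(parse(text = fstring))

cl <- makeCluster(2)
res1 <- parLapply(cl, as.list(1:4), FUN2, func = FUN)
clusterExport(cl, "fstring")
res2 <- parLapply(cl, as.list(1:4), FUN2, func = FUNp)
stopCluster(cl)

unlist(res1)  # 4 6 8 10, since FUN2(y) = y + (y + 2)
```

Variant 2 assumes FUNp is defined at top level, so its environment maps to each worker's global environment, where clusterExport() placed fstring.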
Re: [R] HMM Package parameter estimation
I think it's your starting values for the initial state probability distribution, i.e. c(1,1,1)/3, that cause the problem. They seem to drop you into some sort of local maximum/stationary point, a long way from the global maximum. Try, e.g., c(4,2,1)/7; this gives me:

hmmFit$hmm$emissionProbs
      symbols
states         1          2
     1 0.9385018 0.06149819
     2 0.7883591 0.21164092
     3 0.2279287 0.77207131

hmmFit$hmm$transProbs
    to
from         1         2         3
   1 0.6925055 0.1239590 0.1835355
   2 0.2537700 0.5780679 0.1681621
   3 0.2455462 0.1190872 0.6353666

which look to be in reasonable agreement with the true values. Note though that states 2 and 3 have been swapped. This happens.

cheers,
Rolf Turner

On 16/04/13 13:13, Richard Philip wrote:

Hi, I am having difficulties estimating the parameters of a HMM using the HMM package. I have simulated a sequence of observations from a known HMM. When I estimate the parameters of a HMM using these simulated observations, the parameters are not at all close to the known ones. I realise the estimated parameters are not going to be exactly the same as the known/true parameters, but these are nowhere close. Below is the code I used. Any ideas or possible suggestions regarding this issue would be greatly appreciated.

library(HMM)

## DECLARE PARAMETERS OF THE KNOWN MODEL
states = c(1,2,3)
symbols = c(1,2)
startProb = c(0.5,0.25,0.25)
transProb = matrix(c(0.8,0.05,0.15, 0.2,0.6,0.2, 0.2,0.3,0.5), 3, 3, TRUE)
emissionProb = matrix(c(0.9,0.1, 0.2,0.8, 0.7,0.3), 3, 2, TRUE)

# CREATE THE KNOWN MODEL
hmmTrue = initHMM(states, symbols, startProb, transProb, emissionProb)

# SIMULATE 1000 OBSERVATIONS OF THE KNOWN MODEL
observation = simHMM(hmmTrue, 1000)
obs = observation$observation

# ESTIMATE A MODEL USING THE OBSERVATIONS GENERATED FROM THE KNOWN MODEL
hmmInit = initHMM(states, symbols, c(1/3,1/3,1/3))
hmmFit = baumWelch(hmmInit, obs)

# The parameters of hmmTrue and hmmFit are not at all alike; why is this?
Re: [R] ZA unit root test lag order selection
Hi Anonymous,

There are different methods to select lags in unit root tests; the two you mention are not fundamentally wrong and belong to the standard methods used, even if IC-based selection is perhaps now the preferred solution. Note there is some work by Perron and Ng on a refined selection criterion with better properties; unfortunately it hasn't been implemented in R as far as I know.

You are not mentioning the package you use (nor the code?); I guess you use urca? In this case, you could extract AIC/BIC with:

library(urca)
data(nporg)
gnp <- na.omit(nporg[, "gnp.r"])
za.gnp <- ur.za(gnp, model = "both", lag = 2)
summary(za.gnp)
logLik.ur.za <- function(object, ...) logLik(object@testreg) ## necessary as AIC not directly implemented
AIC(za.gnp)

Best,
Matthieu

I was wondering if anyone could help with choosing the optimal lag length for the ZA test. Two lag order selection methods are commonly used in the literature: 1) the ZA paper recommends running the test with the maximum number of lags, then reducing the lag order sequentially until the longest lag is statistically significant; 2) one could also use AIC, SBC, or other criteria to choose the lag order. I am using an annual series with 22 observations. Which of the above lag order selection procedures would be correct to apply?
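Building on Matthieu's logLik trick above, IC-based lag selection can be sketched as a loop over candidate lag orders on the same urca example data. Caveat: the test regression's effective sample shrinks slightly as lags are added, so the AIC comparison across lag orders is only approximate.

```r
library(urca)
data(nporg)
gnp <- na.omit(nporg[, "gnp.r"])

## AIC is not implemented for ur.za directly; delegate to the test regression
## (same trick as in the message above).
logLik.ur.za <- function(object, ...) logLik(object@testreg)

lags <- 1:4
aics <- sapply(lags, function(k) AIC(ur.za(gnp, model = "both", lag = k)))
best.lag <- lags[which.min(aics)]
best.lag  # lag order minimising AIC among the candidates
```

The sequential general-to-specific rule from the ZA paper would instead inspect the t-statistic of the longest augmentation lag in `summary(za@testreg)` and step down until it is significant.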
[R] assistant
Dear Sir/Ma,

I am Adelabu A.A., one of the R users from Nigeria. When I run a coxph command, the warning below is generated; I have tried a few ideas but cannot get past it. Kindly assist:

cox1 <- coxph(Surv(tmonth, status) ~ sex + age + marital + sumassure, X)
Warning message:
In fitter(X, Y, strats, offset, init, control, weights = weights, :
  Ran out of iterations and did not converge

summary(cox1, conf.int = 0.95, exact = TRUE)
Call:
coxph(formula = Surv(tmonth, status) ~ sex + age + marital + sumassure, data = X)

  n= 5958, number of events= 316

                           coef exp(coef)  se(coef)      z Pr(>|z|)
sex                  -1.418e-01 8.678e-01 1.743e-01 -0.814  0.41593
age                   1.492e-03 1.001e+00 1.369e-03  1.090  0.27593
marital               4.283e-01 1.535e+00 2.201e-01  1.946  0.05165 .
sumassure1           -1.553e+01 1.795e-07 3.699e+03 -0.004  0.99665
sumassure10          -7.357e-01 4.792e-01 1.156e+00 -0.636  0.52467
sumassure100         -1.556e+01 1.747e-07 5.556e+02 -0.028  0.97766
sumassure1000800     -1.549e+01 1.874e-07 4.062e+03 -0.004  0.99696
sumassure1002000     -1.564e+01 1.607e-07 2.408e+03 -0.006  0.99482
sumassure1008000     -1.562e+01 1.650e-07 3.225e+03 -0.005  0.99614
sumassure1008        -1.541e+01 2.028e-07 7.148e+03 -0.002  0.99828
sumassure1014673.1   -1.543e+01 1.988e-07 6.218e+03 -0.002  0.99802
sumassure101737.03    1.186e+00 3.275e+00 1.418e+00  0.836  0.40288
sumassure101850.55    1.054e+00 2.870e+00 1.418e+00  0.743  0.45731
sumassure102000       4.671e-02 1.048e+00 1.416e+00  0.033  0.97369
sumassure102         -1.525e+01 2.375e-07 3.578e+03 -0.004  0.99660
sumassure1027251.36  -1.568e+01 1.557e-07 3.699e+03 -0.004  0.99662
sumassure1035360.53  -1.542e+01 2.015e-07 6.961e+03 -0.002  0.99823
sumassure1043436.77  -1.547e+01 1.905e-07 5.366e+03 -0.003  0.99770
sumassure1043438.77  -1.547e+01 1.908e-07 1.981e+03 -0.008  0.99377
sumassure10482402.52 -1.567e+01 1.560e-07 3.699e+03 -0.004  0.99662
sumassure105000      -1.556e+01 1.755e-07 3.493e+03 -0.004  0.99645
sumassure105         -1.562e+01 1.644e-07 3.293e+03 -0.005  0.99622
sumassure1052631.57  -1.555e+01 1.764e-07 4.870e+03 -0.003  0.99745
sumassure1056363.94  -1.498e+01 3.123e-07 7.384e+03 -0.002  0.99838
sumassure1059480     -1.555e+01 1.763e-07 1.589e+03 -0.010  0.99219
sumassure1073559.38  -1.551e+01 1.842e-07 5.238e+03 -0.003  0.99764
sumassure108000       2.147e+00 8.558e+00 1.420e+00  1.512  0.13056
sumassure108         -2.121e+00 1.200e-01 1.226e+00 -1.730  0.08367 .
sumassure1080        -1.532e+01 2.215e-07 2.657e+03 -0.006  0.99540
sumassure1081137.2   -1.534e+01 2.182e-07 4.921e+03 -0.003  0.99751
sumassure108591.75    1.126e+00 3.084e+00 1.418e+00  0.794  0.42705
sumassure110         -1.553e+01 1.803e-07 3.699e+03 -0.004  0.99665
sumassure11121828.15 -1.526e+01 2.351e-07 6.735e+03 -0.002  0.99819
sumassure111417.02    1.042e+00 2.836e+00 1.418e+00  0.735  0.46240
sumassure1116251.83   2.065e+00 7.889e+00 1.421e+00  1.454  0.14595
sumassure1122821.57  -1.567e+01 1.569e-07 3.699e+03 -0.004  0.99662
...

Concordance= 0.781  (se = 0.018 )
Rsquare= 0.119   (max possible= 0.577 )
Likelihood ratio test= 752.4  on 526 df,   p=2.95e-10
Wald test            = 655.2  on 526 df,   p=0.000101
Score (logrank) test = 2262  on 526 df,   p=0

The sumassure is the sum-assured amount of a policy holder in insurance.
Re: [R] HMM Package parameter estimation
It seems that providing other starting values does indeed get the iterations going. However, more worrisome is that the 3-state model does not seem to converge, even when upping the number of iterations. Below I also run a 2-state model on the same data for comparison (I have added set.seed statements to make the output exactly reproducible). For the 2- and 3-state models I get (see code below):

fm2
Convergence info: Log likelihood converged to within tol. (relative change)
'log Lik.' -585.7628 (df=5)
AIC: 1181.526
BIC: 1206.064

fm3
Convergence info: 'maxit' iterations reached in EM without convergence.
'log Lik.' -585.5807 (df=11)
AIC: 1193.161
BIC: 1247.147

hth,
Ingmar

set.seed(11)
# SIMULATE 1000 OBSERVATIONS OF THE KNOWN MODEL
observation = simHMM(hmmTrue, 1000)
obs = observation$observation

# ESTIMATE A MODEL USING THE OBSERVATIONS GENERATED FROM THE KNOWN MODEL
hmmInit = initHMM(states, symbols, c(4,2,1)/7)
hmmFit = baumWelch(hmmInit, obs, maxI = 200)

library(depmixS4)
m2 <- depmix(obs ~ 1, family = multinomial("identity"), ns = 2, nt = 1000)
set.seed(12)
fm2 <- fit(m2)
m3 <- depmix(obs ~ 1, family = multinomial("identity"), ns = 3, nt = 1000)
set.seed(13)
fm3 <- fit(m3)

On Tue, Apr 16, 2013 at 11:53 AM, Rolf Turner rolf.tur...@xtra.co.nz wrote:
> [quoted message trimmed]
Re: [R] assistant
Looks like sumassure is treated as categorical. This sort of thing is usually a data error; it happens if one of the values cannot be converted to numeric: O instead of 0, comma instead of period, etc. Check summary(X), or, to investigate more specifically, things like

x <- X$sumassure
table(x[is.na(as.numeric(as.character(x)))])

(the as.character() matters if sumassure was read in as a factor; as.numeric() on a factor would otherwise return the underlying codes rather than NAs).

-pd

On Apr 16, 2013, at 11:31 , Adelabu Ahmmed wrote:
> [quoted message trimmed]
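A tiny base-R illustration of the diagnostic above, on a made-up vector (the real data are not available here): locate entries that fail numeric conversion, then repair the two error types mentioned.

```r
## Toy stand-in for X$sumassure; "2,5" (comma for period) and "O" (letter
## O for zero) are the kinds of entries described above.
x <- c("1000", "2,5", "O", "750")

## Which entries cannot be converted to numeric?
bad <- x[is.na(suppressWarnings(as.numeric(x)))]
bad  # "2,5" "O"

## Repair and convert:
x.clean <- sub(",", ".", x, fixed = TRUE)  # comma -> decimal point
x.clean[x.clean == "O"] <- "0"             # letter O -> zero
sumassure.num <- as.numeric(x.clean)
sumassure.num  # 1000 2.5 0 750
```

Once every value converts cleanly, sumassure enters the coxph() formula as a single numeric covariate instead of hundreds of factor levels.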
[R] Splitting the Elements of character vector
Dear R forum,

I have a data.frame

df = data.frame(currency_type = c("EURO_o_n", "EURO_o_n", "EURO_1w", "EURO_1w",
                                  "USD_o_n", "USD_o_n", "USD_1w", "USD_1w"),
                rates = c(0.47, 0.475, 0.461, 0.464, 1.21, 1.19, 1.41, 1.43))

df
  currency_type rates
1      EURO_o_n 0.470
2      EURO_o_n 0.475
3       EURO_1w 0.461
4       EURO_1w 0.464
5       USD_o_n 1.210
6       USD_o_n 1.190
7        USD_1w 1.410
8        USD_1w 1.430

I need to split the values appearing under currency_type to obtain the following data.frame, in the original order:

currency tenor rates
EURO     o_n   0.470
EURO     o_n   0.475
EURO     1w    0.461
EURO     1w    0.464
USD      o_n   1.210
USD      o_n   1.190
USD      1w    1.410
USD      1w    1.430

Basically I need to split the currency names and tenors. I tried

strsplit(df$currency_type, "_")
Error in strsplit(df$currency_type, "_") : non-character argument

Kindly guide,
Katherine
Re: [R] Splitting the Elements of character vector
On Tue, Apr 16, 2013 at 8:38 AM, Katherine Gobin katherine_go...@yahoo.com wrote:
> [quoted message trimmed]

Try sub:

with(df, data.frame(
    currency = sub("_.*", "", currency_type),
    tenor = sub("^[^_]*_", "", currency_type),
    rates)
)

--
Statistics Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
Re: [R] Splitting the Elements of character vector
Hi,

Try:

library(stringr)
df = data.frame(currency_type = c("EURO_o_n", "EURO_o_n", "EURO_1w", "EURO_1w", "USD_o_n", "USD_o_n", "USD_1w", "USD_1w"), rates = c(0.47, 0.475, 0.461, 0.464, 1.21, 1.19, 1.41, 1.43), stringsAsFactors = FALSE)
df$currency <- unlist(lapply(str_split(df[,1], "_"), `[`, 1))
df$tenor <- unlist(lapply(str_split(df[,1], "_"), function(x) {paste(x[-1], collapse = "_")}))
df[, c(3, 4, 2)]
#   currency tenor rates
# 1     EURO   o_n 0.470
# 2     EURO   o_n 0.475
# 3     EURO    1w 0.461
# 4     EURO    1w 0.464
# 5      USD   o_n 1.210
# 6      USD   o_n 1.190
# 7      USD    1w 1.410
# 8      USD    1w 1.430

A.K.

----- Original Message -----
From: Katherine Gobin katherine_go...@yahoo.com
To: r-help@r-project.org
Sent: Tuesday, April 16, 2013 8:38 AM
Subject: [R] Splitting the Elements of character vector
[original question snipped]
Re: [R] Splitting the Elements of character vector
Hi,

You can also do this with stringr's word() after replacing the first underscore with a space:

library(stringr)
df2 <- data.frame(currency = word(str_replace(df[,1], "_", " "), 1),
                  tenor = word(str_replace(df[,1], "_", " "), 2),
                  rates = df$rates, stringsAsFactors = FALSE)
df2
#   currency tenor rates
# 1     EURO   o_n 0.470
# 2     EURO   o_n 0.475
# 3     EURO    1w 0.461
# 4     EURO    1w 0.464
# 5      USD   o_n 1.210
# 6      USD   o_n 1.190
# 7      USD    1w 1.410
# 8      USD    1w 1.430

A.K.

----- Original Message -----
[earlier messages in the thread snipped]
Re: [R] converting blank cells to NAs
Hi,

I am not sure about the problem. If your non-numeric vector is like "a,b,,d,e,,f":

library(stringr)
vec1 <- unlist(str_split(readLines(textConnection("a,b,,d,e,,f")), ","))
vec1[vec1 == ""] <- NA
vec1
# [1] "a" "b" NA  "d" "e" NA  "f"

If this doesn't work, please provide an example vector.

A.K.

Thanks for the response. That seems to do the trick as far as replacing the empty cells with NA; however, the problem remains that the vector is not numeric. This was the reason I wanted to replace the empty cells with NAs in the first place. Forcing the vector with as.numeric afterwards doesn't seem to work either; I get nonsensical results.
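Nonsensical results from as.numeric() are the classic symptom of a factor: as.numeric() applied to a factor returns the internal level codes, not the printed values. A hedged sketch of the usual fix, going through character first:

```r
# a numeric-looking vector that was read in as a factor
f <- factor(c("10", "20", "", "40"))

# wrong: as.numeric() on a factor yields the level codes (1, 2, ...)
wrong <- as.numeric(f)

# right: convert to character, blank out empties, then coerce
x <- as.character(f)
x[x == ""] <- NA
right <- as.numeric(x)
right
# [1] 10 20 NA 40
```

If the vector came from read.table/read.csv, setting stringsAsFactors = FALSE (or na.strings = "") at read time avoids the detour entirely.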
[R] R process slow down after an amount of time
Hi R users,

I have noticed that R gets slower if a process with a loop runs for a while. Is that normal? Let's say I have some code which produces an output file after each loop run. After 10, 15 or 20 loop runs, the time between the created files increases strongly. Is there maybe some data filling up memory?

Chris

--
View this message in context: http://r.789695.n4.nabble.com/R-process-slow-down-after-a-amount-of-time-tp4664358.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] R process slow down after an amount of time
On Apr 16, 2013, at 9:52 AM, Chris82 rubenba...@gmx.de wrote:
[original question snipped]

Possibly, but if I were to put money on it, I'd guess there's an ever-expanding object problem:

x <- NULL
for (i in 1:1e6) x <- c(x, rnorm(1))

which is not-so-secretly quadratic and should instead be:

x <- rnorm(1e6)

Perhaps a small reproducible example would help us help you.

Michael
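To make the cost concrete, here is a small illustration of growing a vector inside a loop versus preallocating it; both produce the same values, but the growing version copies the whole vector on every iteration:

```r
n <- 1e4

# growing: c() copies the entire vector each time -> O(n^2) work overall
grow <- function(n) {
  x <- NULL
  for (i in 1:n) x <- c(x, i)
  x
}

# preallocated: write into an existing slot -> O(n) work overall
prealloc <- function(n) {
  x <- numeric(n)
  for (i in 1:n) x[i] <- i
  x
}

identical(as.numeric(grow(n)), prealloc(n))  # same result, very different cost
```

Wrapping each call in system.time() shows the gap widening rapidly as n grows, which matches the "output files arrive ever more slowly" symptom.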
Re: [R] use of simulate.Arima (forecast package)
Hello,

The help page is pretty clear, I think. You have to pass an object of class 'Arima', 'ar' or 'ets' to simulate.Arima. See, for instance, the second example in the help page for ?Arima, and extend it like this:

set.seed(6816)
lines(simulate(air.model, nsim = 48), col = "red")

Hope this helps,

Rui Barradas

Em 15-04-2013 15:13, Stefano Sofia escreveu:

I would like to simulate some SARIMA models, e.g. a SARIMA (1,0,1)(1,0,1)[4] process. I installed the package 'forecast', where the function simulate.Arima should do what I am trying to do. I am not able to understand how it works. Could somebody help me with an example?

thank you
Stefano Sofia

IMPORTANT NOTICE: This e-mail message is intended to be received only by persons entitled to receive the confidential information it may contain. E-mail messages to clients of Regione Marche may contain information that is confidential and legally privileged. Please do not read, copy, forward, or store this message unless you are an intended recipient of it. If you have received this message in error, please forward it to the sender and delete it completely from your computer system.
Re: [R] ZA unit root test lag order selection
Dear Matthieu,

Many thanks for your reply. I was not sure of the best way forward in selecting lag length. Eventually I wrote a function that carries out serial correlation tests and AIC-based lag length selection. I used the urca package. Here is what I came up with in the end:

zamod.A = ur.za(x, model = "intercept", lag = j)
AIC(eval(attributes(zamod.A)$testreg))  # for lag order selection
bgtest(attributes(zamod.A)$testreg, order = 3)$p.value  # to test for serial correlation

The final decision on the best model comes from examining the results of the AIC and BG tests.

Thanks,
Rosh

--
View this message in context: http://r.789695.n4.nabble.com/ZA-unit-root-test-lag-order-selection-tp4664183p4664350.html
Sent from the R help mailing list archive at Nabble.com.
[R] 10% off Intro R training from RStudio: NYC May 13-14, SF May 20-21
Hi all,

At RStudio, we're hosting our Introduction to R Workshop this May in two locations. As an R-help subscriber, we're offering 10% off!

* Intro to data science with R (http://goo.gl/bplg3) May 13-14 New York City
* Intro to data science with R (http://goo.gl/VCUFL) May 20-21 San Francisco Bay Area

What will you learn? Practical skills for visualizing, transforming, and modeling data in R. During this two-day course, you will learn how to explore and understand data as well as how to do basic programming in R. Our courses incorporate a mix of lectures and hands-on learning. Expect to learn about a topic and then immediately put it into practice with a small example. Plenty of help will be available if you get stuck. You can read more about our training philosophy at http://www.rstudio.com/training/philosophy.html

To see prices, precise locations and to register:

* for the NY course: http://rstudio-nyc.eventbrite.com/
* for the SF course: http://rstudio-bay.eventbrite.com/

We have limited discounts for students (66% off) and academics (33% off) - please contact j...@rstudio.com for details. To thank the R-help community for being such a great resource, we'd also like to offer all R-help subscribers a 10% discount. Just enter "rhelpftw" as a promotional code to get 10% off!

Regards,
Hadley

PS. Would you like us to offer these courses (or others!) in your area? Please let us know at http://www.rstudio.com/training/workshops/

--
Chief Scientist, RStudio
http://had.co.nz/
[R] varSelRF help
# this is my data set
library(varSelRF)
data_set <- data.frame(x0 = c(1,1,0,0), x1 = c(1,1,0,0), x2 = c(1,1,0,0), x3 = c(1,1,0,0), x4 = c(1,1,0,0))
# this is my target
target <- c(1,1,0,0)
rf.vs1 <- varSelRF(data_set, as.factor(target), ntree = 500, ntreeIterat = 300, vars.drop.frac = 0.2)
rf.vs1
rf.vs1[[3]]

It is giving me only 2 significant variables, but I am expecting 5 significant variables. Please help. I am a newbie.

Regards,
Thabung
[R] Spatial Ananlysis: zero.policy=TRUE doesn't work for no neighbour regions??
Hello,

I'm new to R and to spatial analysis, and have a problem trying to create a spatial weights matrix.

I use the following code to create the neighbours list:

library(maptools)
library(spdep)
library(rgdal)
location_County <- readShapePoly()
proj4string(location_County) <- CRS("+proj=longlat +ellps=WGS84")
location_nbq <- poly2nb(location_County)
summary(location_nbq)

And get this output:

Neighbour list object:
Number of regions: 3109
Number of nonzero links: 18246
Percentage nonzero weights: 0.1887671
Average number of links: 5.868768
4 regions with no links: 35 689 709 881
Link number distribution:
   0    1    2    3    4    5    6    7    8    9   10   11   13   14
   4   29   40   94  283  616 1045  703  228   51   12    2    1    1
29 least connected regions: 45 49 587 645 844 853 1206 1286 1391 1416 1456 1478 1485 1545 1546 1548 1558 1612 1621 1663 1672 1675 1760 1794 1795 2924 2925 2952 3107 with 1 link
1 most connected region: 1385 with 14 links

As there are some regions without neighbours in my data, I use the following code to create the weights matrix:

W_Matrix <- nb2listw(location_nbq, style = "W", zero.policy = TRUE)
W_Matrix

And get this output:

Fehler in print.listw(list(style = "W", neighbours = list(c(23L, 31L, 42L :
  regions with no neighbours found, use zero.policy=TRUE
(Error in print.listw(...): regions with no neighbours found, use zero.policy=TRUE)

As I use zero.policy=TRUE, I just don't understand what I'm doing wrong... My question is: how can I create a weights matrix allowing for no-neighbour areas?

Thanks,
Michael

--
View this message in context: http://r.789695.n4.nabble.com/Spatial-Ananlysis-zero-policy-TRUE-doesn-t-work-for-no-neighbour-regions-tp4664367.html
Sent from the R help mailing list archive at Nabble.com.
[R] Help needed with data format required for package VIF
Hello,

Could somebody perhaps assist with my dilemma concerning the package VIF? The examples are not very clear (the data is stored internally). I wish to read a .csv file (header = TRUE) and run VIF, but I get nonsensical output. I have downloaded the boston.csv file (from the referring website). How do I run the example using this file format directly (say, using read.table)?

Any help is greatly appreciated.

Regards,
Jacob
Re: [R] matching multiple fields from a matrix
Hi Arun,

This is excellent and elegant. I thought there had to be a relatively simple way to do this. Thank you very much.

Jeremy

From: arun kirshna [via R] [mailto:ml-node+s789695n4664328...@n4.nabble.com]
Sent: Monday, April 15, 2013 10:34 PM
To: Crowley, Jeremy
Subject: Re: matching multiple fields from a matrix

Hi,

Maybe this helps:

dat1 <- read.table(text="
site1 depth1 year1 site2 depth2 year2
10 30 1860 NA NA NA
NA NA NA 50 30 1860
10 20 1850 11 20 1850
11 25 1950 12 25 1960
10 NA 1870 12 30 1960
11 25 1880 15 22 1890
14 22 1890 14 25 1880
", sep="", header=TRUE, stringsAsFactors=FALSE)
res <- merge(dat1[,1:3], dat1[,4:6], by.x=c("depth1","year1"), by.y=c("depth2","year2"))
names(res)[1:2] <- gsub("\\d+", "", names(res))[1:2]
na.omit(res)
#   depth year site1 site2
# 1    20 1850    10    11
# 2    22 1890    14    15
# 3    25 1880    11    14
# 4    30 1860    10    50

A.K.

----- Original Message -----
From: jercrowley
To: r-help@r-project.org
Sent: Monday, April 15, 2013 5:07 PM
Subject: [R] matching multiple fields from a matrix

I have been trying many ways to match 2 separate fields in a matrix. Here is a simplified version of the matrix:

site1 depth1 year1 site2 depth2 year2
   10     30  1860    NA     NA    NA
   NA     NA    NA    50     30  1860

Basically I am trying to identify the sites which have a common year and depth from 2 datasets. What I would like to do is match all of the year1 field to the year2 field and the depth1 field to the depth2 field. Then I would like to output site1, site2, depth, and year. I have been trying if loops, which(), isTRUE(), etc. but I have not come up with anything that works. Any help would be greatly appreciated.

Jeremy

--
View this message in context: http://r.789695.n4.nabble.com/matching-multiple-fields-from-a-matrix-tp4664309.html
Sent from the R help mailing list archive at Nabble.com.
--
View this message in context: http://r.789695.n4.nabble.com/matching-multiple-fields-from-a-matrix-tp4664309p4664376.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] 10% off Intro R training from RStudio: NYC May 13-14, SF May 20-21
Hadley:

I don't think this is appropriate. Think of what it would be like if everyone shilled their R training and consulting wares here.

Bert

Sent from my iPhone -- please excuse typos.

On Apr 16, 2013, at 8:09 AM, Hadley Wickham h.wick...@gmail.com wrote:
[advertisement snipped]
Re: [R] Overlay two stat_ecdf() plots
On Apr 16, 2013, at 12:45 AM, PIKAL Petr wrote:

Hi

Do you mean ecdf? If yes, just use the add option in plot:

plot(ecdf(rnorm(100, 1, 2)))
plot(ecdf(rnorm(100, 2, 2)), add = TRUE, col = 2)

If not, please specify where ecdf_stat or stat_ecdf comes from which, as you indicate, are the same function.

It has the appearance of a ggplot2 function, so I think this student has not yet grasped that there needs to be a call to ggplot to set up the data framework to which `stat_ecdf` will then be added (with the overloaded + operator) as a layer.

Regards
Petr

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-project.org] On Behalf Of Robin Mjelle
Sent: Monday, April 15, 2013 1:10 PM
To: r-help@r-project.org
Subject: [R] Overlay two stat_ecdf() plots

I want to plot two ecdf plots in the same graph. I have two input tables with one column each:

Targets <- read.table("/media/", sep="", header=T)
NonTargets <- read.table("/media/...", sep="", header=T)

head(Targets)
        V1
1 3.160514
2 6.701948
3 4.093844
4 1.992014
5 1.604751
6 2.076802

head(NonTargets)
         V1
1  3.895934
2  1.990506
3 -1.746919
4 -3.451477
5  5.156554
6  1.195109

Targets.m <- melt(Targets)
head(Targets.m)
  variable    value
1       V1 3.160514
2       V1 6.701948
3       V1 4.093844
4       V1 1.992014
5       V1 1.604751
6       V1 2.076802

NonTargets.m <- melt(NonTargets)
head(NonTargets.m)
  variable     value
1       V1  3.895934
2       V1  1.990506
3       V1 -1.746919
4       V1 -3.451477
5       V1  5.156554
6       V1  1.195109

How do I proceed to plot them in one graph using stat_ecdf()?

David Winsemius
Alameda, CA, USA
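For completeness, a sketch of the ggplot2 route the student was presumably after (simulated data in place of the original files; the column names here are illustrative): stack both samples into one long-form data frame with a grouping column, map that group to colour, and add a single stat_ecdf() layer.

```r
library(ggplot2)

# two samples stacked long-form, with a label column distinguishing them
dat <- data.frame(
  value = c(rnorm(100, 1, 2), rnorm(100, 2, 2)),
  group = rep(c("Targets", "NonTargets"), each = 100)
)

# one stat_ecdf() layer draws one ECDF per colour group
p <- ggplot(dat, aes(x = value, colour = group)) + stat_ecdf()
p
```

This is the idiomatic ggplot2 pattern for overlays generally: reshape to long form and map the distinguishing variable to an aesthetic, rather than adding one layer per data frame.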
Re: [R] odfWeave: Some questions about potential formatting options
Hi Milan and Max, Thanks to each of you for your reply to my post. Thus far, I've managed to find answers to some of the questions I asked initially. I am now able to control the justification of the leftmost column in my tables, as well as to add borders to the top and bottom. I also downloaded Milan's revised version of odfWeave at the link below, and found that it does a nice job of controlling column widths. http://nalimilan.perso.neuf.fr/transfert/odfWeave.tar.gz There are some other things I'm still struggling with though. 1. Is it possible to get odfTableCaption and odfFigureCaption to make the titles they produce bold? I understand it might be possible to accomplish this by changing something in the styles but am not sure what. If someone can give me a hint, I can likely do the rest. 2. Is there any way to get odfFigureCaption to put titles at the top of the figure instead of the bottom? I've noticed that odfTableCaption is able to do this but apparently not odfFigureCaption. 3. Is it possible to add special characters to the output? Below is a sample Kaplan-Meier analysis. There's a footnote in there that reads Note: X2(1) = xx.xx, p = .. Is there any way to make the X a lowercase Chi and to superscript the 2? I did quite a bit of digging on this topic. It sounds like it might be difficult, especially if one is using Windows as I am. 
Thanks,
Paul

## Get data ##

## Load packages
require(survival)
require(MASS)

## Sample analysis
attach(gehan)
gehan.surv <- survfit(Surv(time, cens) ~ treat, data = gehan, conf.type = "log-log")
print(gehan.surv)
survTable <- summary(gehan.surv)$table
survTable <- data.frame(Treatment = rownames(survTable), survTable, row.names = NULL)
survTable <- subset(survTable, select = -c(records, n.max))

## odfWeave ##

## Load odfWeave
require(odfWeave)

## Modify StyleDefs
currentDefs <- getStyleDefs()
currentDefs$firstColumn$type <- "Table Column"
currentDefs$firstColumn$columnWidth <- "5 cm"
currentDefs$secondColumn$type <- "Table Column"
currentDefs$secondColumn$columnWidth <- "3 cm"
currentDefs$ArialCenteredBold$fontSize <- "10pt"
currentDefs$ArialNormal$fontSize <- "10pt"
currentDefs$ArialCentered$fontSize <- "10pt"
currentDefs$ArialHighlight$fontSize <- "10pt"
currentDefs$ArialLeftBold <- currentDefs$ArialCenteredBold
currentDefs$ArialLeftBold$textAlign <- "left"
currentDefs$cgroupBorder <- currentDefs$lowerBorder
currentDefs$cgroupBorder$topBorder <- "0.0007in solid #000000"
setStyleDefs(currentDefs)

## Modify ImageDefs
imageDefs <- getImageDefs()
imageDefs$dispWidth <- 5.5
imageDefs$dispHeight <- 5.5
setImageDefs(imageDefs)

## Modify Styles
currentStyles <- getStyles()
currentStyles$figureFrame <- "frameWithBorders"
setStyles(currentStyles)

## Set odt table styles
tableStyles <- tableStyles(survTable, useRowNames = FALSE, header = "")
tableStyles$headerCell[1,] <- "cgroupBorder"
tableStyles$header[,1] <- "ArialLeftBold"
tableStyles$text[,1] <- "ArialNormal"
tableStyles$cell[2,] <- "lowerBorder"

## Weave odt source file
fp <- "N:/Studies/HCRPC1211/Report/odfWeaveTest/"
inFile <- paste(fp, "testWeaveIn.odt", sep = "")
outFile <- paste(fp, "testWeaveOut.odt", sep = "")
odfWeave(inFile, outFile)

## Contents of .odt source file ##

Here is a sample Kaplan-Meier table.
<<testKMTable, echo=FALSE, results=xml>>=
odfTableCaption("A Sample Kaplan-Meier Analysis Table")
odfTable(survTable, useRowNames = FALSE, digits = 3,
         colnames = c("Treatment", "Number", "Events", "Median", "95% LCL", "95% UCL"),
         colStyles = c("firstColumn", "secondColumn", "secondColumn", "secondColumn", "secondColumn", "secondColumn"),
         styles = tableStyles)
odfCat("Note: X2(1) = xx.xx, p = .")
@

Here is a sample Kaplan-Meier graph.

<<testKMFig, echo=FALSE, fig=TRUE>>=
odfFigureCaption("A Sample Kaplan-Meier Analysis Graph", label = "Figure")
plot(gehan.surv, xlab = "Time", ylab = "Survivorship")
@
[R] Strange error with log-normal models
Hi,

I have some data that, when plotted, looks very close to a log-normal distribution. My goal is to build a regression model to test how this variable responds to several independent variables. To do this, I want to use the fitdistr tool from the MASS package to see how well my data fits the actual distribution, and also build a generalized linear model using the glm command.

The summary of my data is:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.0000  0.0000  0.0000  0.8617  0.8332 55.5600

So, no missing values, no negative values. When I try to use the fitdistr command, I get an error that I don't understand:

m <- fitdistr(y, densfun = "lognormal")
Error in fitdistr(y, densfun = "lognormal") :
  need positive values to fit a log-Normal

When I try to build a simple model, I also get an error:

l <- glm(y ~ x, family = gaussian(link = "log"))
Error in eval(expr, envir, enclos) :
  cannot find valid starting values: please specify some

Can anyone offer some suggestions?

Thanks!

--
Noah Silverman, M.S.
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095
[R] the joy of spreadsheets (off-topic)
Given that we occasionally run into problems with comparing Excel results to R results, and other spreadsheet-induced errors, I thought this might be of interest. http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems The punchline: If this error turns out to be an actual mistake Reinhart-Rogoff made, well, all I can hope is that future historians note that one of the core empirical points providing the intellectual foundation for the global move to austerity in the early 2010s was based on someone accidentally not updating a row formula in Excel. Ouch. (Note: I know nothing about the site, the author of the article, or the study in question. I was pointed to it by someone else. But if true: highly problematic.) Sarah -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Strange error with log-normal models
On 16/04/2013 1:19 PM, Noah Silverman wrote:
[original question snipped, up to:]

 When I try to use the fitdistr command, I get an error that I don't understand:

 m <- fitdistr(y, densfun = "lognormal")
 Error in fitdistr(y, densfun = "lognormal") :
   need positive values to fit a log-Normal

You have zeros in your data. The lognormal distribution never takes on the value zero. If they are zero because of rounding (e.g. 0.001 would be recorded as zero), and there aren't too many of them, then replacing the zeros with a small positive value (e.g. half the smallest non-zero value) might make sense.

But your median is zero, so at least half of your observations are zero. You need to come up with a better model than lognormal.

Duncan Murdoch
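Duncan's rounding caveat can be sketched as follows (illustrative data; whether this substitution is statistically defensible depends on why the zeros arose):

```r
library(MASS)  # for fitdistr

set.seed(1)
y <- c(rep(0, 5), rlnorm(95))  # a few rounded-to-zero values among lognormal data

# replace zeros with half the smallest non-zero value, as suggested
eps <- min(y[y > 0]) / 2
y2 <- ifelse(y == 0, eps, y)

fit <- fitdistr(y2, densfun = "lognormal")
fit$estimate  # meanlog and sdlog
```

With a zero-heavy sample like the poster's (median zero), a two-part or zero-inflated model would be the better route, as Duncan notes.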
[R] efficiently diff two data frames
Dear all, What is the quickest and most efficient way to diff two data frames, so as to obtain a vector of indices (or logical) for rows/columns that differ in the two data frames? For example,
Xe <- head(mtcars)
Xf <- head(mtcars)
Xf[2:4, 3:5] <- 55
all.equal(Xe, Xf)
[1] "Component 3: Mean relative difference: 0.6863118"
[2] "Component 4: Mean relative difference: 0.4728435"
[3] "Component 5: Mean relative difference: 14.23546"
I could use all.equal(), but it only returns human-readable info that cannot be easily used programmatically. It also gives no info on the rows. Another way would be to:
require(prob)
setdiff(Xe, Xf)
                mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
But again this doesn't return subsetting indices, nor any info on the columns. Any suggestions on how to approach this? Regards, Liviu
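One possible approach (a sketch, assuming both data frames have identical dimensions and columns that can be coerced to a common matrix type): compare them element-wise and let which(..., arr.ind = TRUE) return the row/column positions that differ.

```r
Xe <- head(mtcars)
Xf <- head(mtcars)
Xf[2:4, 3:5] <- 55

# Element-wise comparison; arr.ind = TRUE gives (row, col) index pairs
diffs <- which(as.matrix(Xe) != as.matrix(Xf), arr.ind = TRUE)
diffs                                   # positions that differ
unique(rownames(diffs))                 # row names that differ
colnames(Xe)[unique(diffs[, "col"])]    # column names that differ
```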
Re: [R] how to change the date into an interval of date?
Hi, Please check your dput(). By using your dput() output, I am getting:
$patient_id
[1] 2 2 2 2 3 3 3 3
$responsed_at
[1] 14755 14797 14835 14883 14755 14789 14826 14857
$number
[1] 1 2 3 4 1 2 3 4
$score
[1] 1 1 2 3 1 5 4 5
$.Names
[1] "patient_id" "responsed_at" "number" "scores"
$row.names
[1] NA -8
$class
[1] "data.frame"
It looks like the image you shared also showed the same output. I am not using RStudio, so I don't know what is wrong.
# the dput should be:
dat1 <- structure(list(patient_id = c(2,2,2,2,3,3,3,3),
    responsed_at = c(14755,14797,14835,14883,14755,14789,14826,14857),
    number = c(1,2,3,4,1,2,3,4), score = c(1,1,2,3,1,5,4,5)),
    .Names = c("patient_id", "responsed_at", "number", "scores"),
    row.names = c(NA, -8L), class = "data.frame")
dat1
#  patient_id responsed_at number scores
#1          2        14755      1      1
#2          2        14797      2      1
#3          2        14835      3      2
#4          2        14883      4      3
#5          3        14755      1      1
#6          3        14789      2      5
#7          3        14826      3      4
#8          3        14857      4      5
library(zoo)
dat1$responsed_at <- as.Date(dat1$responsed_at)
dat1
#  patient_id responsed_at number scores
#1          2   2010-05-26      1      1
#2          2   2010-07-07      2      1
#3          2   2010-08-14      3      2
#4          2   2010-10-01      4      3
#5          3   2010-05-26      1      1
#6          3   2010-06-29      2      5
#7          3   2010-08-05      3      4
#8          3   2010-09-05      4      5
str(dat1)
#'data.frame': 8 obs. of 4 variables:
# $ patient_id  : num 2 2 2 2 3 3 3 3
# $ responsed_at: Date, format: "2010-05-26" "2010-07-07" ...
# $ number      : num 1 2 3 4 1 2 3 4
# $ scores      : num 1 1 2 3 1 5 4 5
A.K.
From: GUANGUAN LUO guanguan...@gmail.com To: arun smartpink...@yahoo.com Sent: Tuesday, April 16, 2013 10:49 AM Subject: Re: how to change the date into an interval of date? hi,
dput(head(data,8))
structure(list(patient_id = c(2,2,2,2,3,3,3,3), responsed_at = c(14755,14797,14835,14883,14755,14789,14826,14857), number = c(1,2,3,4,1,2,3,4), score = c(1,1,2,3,1,5,4,5), .Names = c("patient_id", "responsed_at", "number", "scores"), class = "data.frame"))
like this? I use RStudio; there are 4 windows, and the window of results is the output. 2013/4/16 arun smartpink...@yahoo.com HI, Please dput() your dataset as in my previous reply.
This is an image, and it is twice or thrice the work for me to convert it to a readable form. http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example Also, you didn't answer my question: I didn't understand which one is the window of results and which one is your tables. From: GUANGUAN LUO guanguan...@gmail.com To: smartpink...@yahoo.com Sent: Tuesday, April 16, 2013 10:10 AM Subject: Re: how to change the date into an interval of date?
   patient_id number response_id session_id responsed_at login clinique_basdai.fatigue
1           2      1          77          2        14755  3002                       4
2           2      2        1258         61        14797  3002                       5
3           2      3        2743        307        14835  3002                       5
4           2      4        4499        562        14883  3002                       6
5           2      5        6224        809        14916  3002                       4
6           2      6        7708       1024        14949  3002                       3
7           2      7        9475       1224        14985  3002                       3
8           2      8       11362       1458        15020  3002                       4
9           2      9       13417       1688        15055  3002                       5
10          2     10       15365       1959        15090  3002                       4
11          2     11       17306       2211        15126  3002                       5
12          2     12       19073       2449        15160  3002                       3
13          2     13       20679       2677        15193  3002                       5
14          2     14       22294       2883        15228  3002                       5
15          2     15       24097       3082        15265  3002                       5
16          2     16       25670       3304        15299  3002                       5
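Coming back to the subject-line question — turning a date into an interval of dates — a minimal sketch (assuming, as in the thread, that responsed_at holds day counts since 1970-01-01; the monthly binning is only an illustration, since the poster never specified the interval width):

```r
dat1 <- data.frame(patient_id   = c(2,2,2,2,3,3,3,3),
                   responsed_at = c(14755,14797,14835,14883,14755,14789,14826,14857),
                   number       = c(1,2,3,4,1,2,3,4),
                   scores       = c(1,1,2,3,1,5,4,5))

# numeric day counts -> Date (base R needs an explicit origin)
dat1$responsed_at <- as.Date(dat1$responsed_at, origin = "1970-01-01")

# bin each date into a monthly interval (cut.Date also accepts "week", "quarter", ...)
dat1$interval <- cut(dat1$responsed_at, breaks = "month")
head(dat1)
```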
[R] Path Diagram
Hi All, Apologies if this has been answered somewhere else, but I have been searching for an answer all day and not been able to find one. I am trying to plot a path diagram for a CFA I have run. I have installed Rgraphviz and run the following:
pathDiagram(cfa, min.rank='item1, item2, item3, item4, item5, item6, item7, item8, item9, item10, item11, item12', max.rank='SMP, AAAS', file='documents')
I get the following message and output: Running dot -Tpdf -o documents.pdf documents.dot
digraph cfa {
  rankdir=LR;
  size="8,8";
  node [fontname=Helvetica fontsize=14 shape=box];
  edge [fontname=Helvetica fontsize=10];
  center=1;
  {rank=min item1 item2 item3 item4 item5 item6 item7 item8 item9 item10 item11 item12}
  {rank=max SMP AAAS}
  SMP [shape=ellipse]
  AAAS [shape=ellipse]
  SMP -> item1 [label=smp0];
  SMP -> item3 [label=smp1];
  SMP -> item4 [label=smp2];
  SMP -> item6 [label=smp3];
  SMP -> item8 [label=smp4];
  SMP -> item10 [label=smp5];
  SMP -> item11 [label=smp6];
  AAAS -> item2 [label=aaas0];
  AAAS -> item5 [label=aaas1];
  AAAS -> item7 [label=aaas2];
  AAAS -> item9 [label=aaas3];
  AAAS -> item12 [label=aaas4];
}
How do I get to see the graph? Many thanks, Laura Laura Thomas PhD Student - Sport and Exercise Psychology Department of Sport and Exercise Penglais Campus Aberystwyth University Aberystwyth 01970621947 l...@aber.ac.uk www.aber.ac.uk/en/sport-exercise/
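On the question "How do I get to see the graph?": the line Running dot -Tpdf -o documents.pdf documents.dot shows that pathDiagram() already rendered the diagram to documents.pdf via Graphviz. A minimal sketch for opening it from R (assuming the file was written to the current working directory):

```r
# Open the PDF that Graphviz produced in the system's default viewer.
# "documents.pdf" comes from the file='documents' argument above.
browseURL("documents.pdf")
```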
[R] I don't understand the 'order' function
I thought I've understood the 'order' function, using simple examples like: order(c(5,4,-2)) [1] 3 2 1 However, I arrived to the following example: order(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 8 9 10 7 11 6 5 4 3 2 1 and I was completely perplexed! Shouldn't the output vector be 11 10 9 8 7 6 4 1 2 3 5 ? Do I have a damaged version of R? I became still more astonished when I used the sort function and got the right answer: sort(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 210 210 505 920 1045 1210 1335 1545 2085 2255 2465 since 'sort' documentation claims to be using 'order' to establish the right order. Please help me to understand all this! Thanks, -Sergio. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] testInstalledBasic / testInstalledPackages
On Apr 16, 2013, at 11:44 AM, Trina Patel trinarpa...@gmail.com wrote: Hi, I installed R 3.0.0 on a Windows 2008 Server. When I submitted the following code in R64,
library(tools)
testInstalledBasic(scope = "devel")
I get the following message in the R Console:
library(tools)
testInstalledBasic(scope = "devel")
running tests of consistency of as/is.* creating ‘isas-tests.R’ running code in ‘isas-tests.R’ comparing ‘isas-tests.Rout’ to ‘isas-tests.Rout.save’ ...2550a2551 running tests of random deviate generation -- fails occasionally running code in ‘p-r-random-tests.R’ comparing ‘p-r-random-tests.Rout’ to ‘p-r-random-tests.Rout.save’ ... OK running tests of primitives running code in ‘primitives.R’ running regexp regression tests running code in ‘utf8-regex.R’ running tests to possibly trigger segfaults creating ‘no-segfault.R’ running code in ‘no-segfault.R’ Warning message: running command 'diff -bw C:\Users\TRINA_~1\AppData\Local\Temp\Rtmp2FwZXW\Rdiffa1a88562f12b C:\Users\TRINA_~1\AppData\Local\Temp\Rtmp2FwZXW\Rdiffb1a8848c57620' had status 1
When I compare isas-tests.Rout to isas-tests.Rout.save, as well as the two diff files listed above, it seems that there is one extra empty line in isas-tests.Rout.save. Is there any way to fix this error without modifying the isas-tests.Rout.save file? Next I submitted the following code,
testInstalledPackages(scope = "base")
and got the message below in my R console:
testInstalledPackages(scope = "base")
Testing examples for package ‘base’ Testing examples for package ‘tools’ comparing ‘tools-Ex.Rout’ to ‘tools-Ex.Rout.save’ ... 621c621 [1] 0cce1e42ef3fb133940946534fcf8896 --- [1] eb723b61539feef013de476e68b5c50a
When comparing the files tools-Ex.Rout and tools-Ex.Rout.save, it seems this difference indicates an error in the md5sums for the file C:\Program Files\R\R-3.0.0\COPYING. Does this indicate a problem with my installation?
Looking at the file C:\Program Files\R\R-3.0.0\MD5 leads me to suspect there might be an error in the test itself. Thanks for the help! See: http://cran.r-project.org/doc/manuals/r-release/R-admin.html#Testing-a-Windows-Installation from the R Installation and Administration Manual. Try running:
Sys.setenv(LC_COLLATE = "C", LANGUAGE = "en")
before you run the tests. You might also want to have a look at: https://github.com/marcschwartz/R-IQ-OQ Regards, Marc Schwartz
Re: [R] I don't understand the 'order' function
Hi Julio, On Tue, Apr 16, 2013 at 1:51 PM, Julio Sergio julioser...@gmail.com wrote: I thought I've understood the 'order' function, using simple examples like: order(c(5,4,-2)) [1] 3 2 1 However, I arrived to the following example: order(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 8 9 10 7 11 6 5 4 3 2 1 and I was completely perplexed! Shouldn't the output vector be 11 10 9 8 7 6 4 1 2 3 5 ? Do I have a damaged version of R? Your version of R is fine; your understanding is damaged. :) order() returns the element indices for each position. So in your example, the sorted version of the vector would have element 8 in the first place, element 9 in the second place, and element 1 in the last place. order() is not the same as rank(). See:
x <- c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)
order(x)
x[order(x)]
rank(x) # what you seem to expect
Sarah -- Sarah Goslee http://www.functionaldiversity.org
Re: [R] I don't understand the 'order' function
Hello, Inline. Em 16-04-2013 18:51, Julio Sergio escreveu: I thought I've understood the 'order' function, using simple examples like: order(c(5,4,-2)) [1] 3 2 1 However, I arrived to the following example: order(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 8 9 10 7 11 6 5 4 3 2 1 and I was completely perplexed! Shouldn't the output vector be 11 10 9 8 7 6 4 1 2 3 5 ? No, why should it? Try assigning the output of order and see what happens to the vector.
x <- c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)
(o <- order(x))
x[o] # Allright
Hope this helps, Rui Barradas Do I have a damaged version of R? I became still more astonished when I used the sort function and got the right answer: sort(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 210 210 505 920 1045 1210 1335 1545 2085 2255 2465 since 'sort' documentation claims to be using 'order' to establish the right order. Please help me to understand all this! Thanks, -Sergio.
Re: [R] I don't understand the 'order' function
Hi,
vec1 <- c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)
vec1[order(vec1)]
#[1] 210 210 505 920 1045 1210 1335 1545 2085 2255 2465
order(vec1)
#[1] 8 9 10 7 11 6 5 4 3 2 1
sort(vec1, index.return = TRUE)
#$x
#[1] 210 210 505 920 1045 1210 1335 1545 2085 2255 2465
#$ix
# [1] 8 9 10 7 11 6 5 4 3 2 1
A.K. - Original Message - From: Julio Sergio julioser...@gmail.com To: r-h...@stat.math.ethz.ch Cc: Sent: Tuesday, April 16, 2013 1:51 PM Subject: [R] I don't understand the 'order' function I thought I've understood the 'order' function, using simple examples like: order(c(5,4,-2)) [1] 3 2 1 However, I arrived to the following example: order(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 8 9 10 7 11 6 5 4 3 2 1 and I was completely perplexed! Shouldn't the output vector be 11 10 9 8 7 6 4 1 2 3 5 ? Do I have a damaged version of R? I became still more astonished when I used the sort function and got the right answer: sort(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 210 210 505 920 1045 1210 1335 1545 2085 2255 2465 since 'sort' documentation claims to be using 'order' to establish the right order. Please help me to understand all this! Thanks, -Sergio.
Re: [R] I don't understand the 'order' function
On 16/04/2013 1:51 PM, Julio Sergio wrote: I thought I've understood the 'order' function, using simple examples like: order(c(5,4,-2)) [1] 3 2 1 However, I arrived to the following example: order(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 8 9 10 7 11 6 5 4 3 2 1 and I was completely perplexed! Shouldn't the output vector be 11 10 9 8 7 6 4 1 2 3 5 ? Do I have a damaged version of R? You are probably confusing order() and rank(). What we want is that x[order(x)] is in increasing order. This is the inverse permutation of what rank(x) gives, so (if there are no ties) rank(x)[order(x)] and order(x)[rank(x)] should both give 1:length(x). Duncan I became still more astonished when I used the sort function and got the right answer: sort(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 210 210 505 920 1045 1210 1335 1545 2085 2255 2465 since 'sort' documentation claims to be using 'order' to establish the right order. Please help me to understand all this! Thanks, -Sergio. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
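Duncan's inverse-permutation claim can be checked directly; a sketch using a tie-free variant of the poster's vector (the second 210 changed to 211, since ties make rank() return fractional values):

```r
x <- c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 211, 505, 1045)

identical(x[order(x)], sort(x))          # TRUE: order() gives sorting indices
all(rank(x)[order(x)] == seq_along(x))   # TRUE: rank and order are inverse permutations
all(order(x)[rank(x)] == seq_along(x))   # TRUE
```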
Re: [R] the joy of spreadsheets (off-topic)
When in doubt, assume the spreadsheet is wrong. I suggested this to someone having a problem with R vs Excel results a while ago. When I checked back with him -- there was a spreadsheet error. I think a t-shirt with the motto "Friends don't let friends use spreadsheets" [1] sounds like a good idea. Unfortunately I am not artistic enough to do a design. 1. Slight paraphrase of J. D. Cryer's statement http://homepage.cs.uiowa.edu/~jcryer/JSMTalk2001.pdf John Kane Kingston ON Canada -Original Message- From: sarah.gos...@gmail.com Sent: Tue, 16 Apr 2013 13:25:57 -0400 To: r-help@r-project.org Subject: [R] the joy of spreadsheets (off-topic) Given that we occasionally run into problems with comparing Excel results to R results, and other spreadsheet-induced errors, I thought this might be of interest. http://www.nextnewdeal.net/rortybomb/researchers-finally-replicated-reinhart-rogoff-and-there-are-serious-problems The punchline: "If this error turns out to be an actual mistake Reinhart-Rogoff made, well, all I can hope is that future historians note that one of the core empirical points providing the intellectual foundation for the global move to austerity in the early 2010s was based on someone accidentally not updating a row formula in Excel." Ouch. (Note: I know nothing about the site, the author of the article, or the study in question. I was pointed to it by someone else. But if true: highly problematic.) Sarah -- Sarah Goslee http://www.functionaldiversity.org
Re: [R] 10% off Intro R training from RStudio: NYC May 13-14, SF May 20-21
-Original Message- From: gunter.ber...@gene.com Sent: Tue, 16 Apr 2013 09:43:14 -0700 To: h.wick...@gmail.com Subject: Re: [R] 10% off Intro R training from RStudio: NYC May 13-14, SF May 20-21 Hadley: I don't think this is appropriate. Think of what it would be like if everyone shilled their R training and consulting wares here. They do. John Kane Kingston ON Canada Bert Sent from my iPhone -- please excuse typos. On Apr 16, 2013, at 8:09 AM, Hadley Wickham h.wick...@gmail.com wrote: Hi all, At RStudio, we're hosting our Introduction to R Workshop this May in two locations. As an R-help subscriber, we're offering 10% off! * Intro to data science with R (http://goo.gl/bplg3) May 13-14 New York City * Intro to data science with R (http://goo.gl/VCUFL) May 20-21 San Francisco Bay Area What will you learn? Practical skills for visualizing, transforming, and modeling data in R. During this two-day course, you will learn how to explore and understand data as well as how to do basic programming in R. Our courses incorporate a mix of lectures and hands-on learning. Expect to learn about a topic and then immediately put it into practice with a small example. Plenty of help will be available if you get stuck. You can read more about our training philosophy at http://www.rstudio.com/training/philosophy.html To see prices, precise locations and to register: * for the NY course: http://rstudio-nyc.eventbrite.com/ * for the SF course: http://rstudio-bay.eventbrite.com/ We have limited discounts for students (66% off) and academics (33% off) - please contact j...@rstudio.com for details. To thank the R-help community for being such a great resource, we'd also like to offer all R-help subscribers a 10% discount. Just enter rhelpftw as a promotional code get 10% off! Regards, Hadley PS. Would you like us to offer these courses (or others!) in your area? 
Please let us know at http://www.rstudio.com/training/workshops/ -- Chief Scientist, RStudio http://had.co.nz/
Re: [R] I don't understand the 'order' function
Julio Sergio juliosergio at gmail.com writes: I thought I've understood the 'order' function, using simple examples like: Thanks to you all!... As Sarah said, what was damaged was my understanding ( ;-) )... and as Duncan said, I was confusing 'order' with 'rank', thanks! Now I understand the 'order' function. -Sergio __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 10% off Intro R training from RStudio: NYC May 13-14, SF May 20-21
On Tue, Apr 16, 2013 at 5:43 PM, Bert Gunter gunter.ber...@gene.com wrote: Hadley: I don't think this is appropriate. Think of what it would be like if everyone shilled their R training and consulting wares here. Everyone does, don't they? A search on Nabble shows up regular postings from XLSolutions, Mango used to post (not seen anything in a while) and Revo sneak the odd commercial in Dave Smith's updates. I don't see anything about non-commercial postings being banned from R-help, but they do seem to be against the spirit of R-help. I suspect commercials sneak in under under 'announcements' in the R-help documentation: R-help: The ‘main’ R mailing list, for [...] announcements (not covered by ‘R-announce’ or ‘R-packages’, see above) As with everything R, if it bothers the maintainers, then they'll put a stop to it. We users matter not... Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] I don't understand the 'order' function
[See in-line below] On 16-Apr-2013 17:51:41 Julio Sergio wrote: I thought I've understood the 'order' function, using simple examples like: order(c(5,4,-2)) [1] 3 2 1 However, I arrived to the following example: order(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 8 9 10 7 11 6 5 4 3 2 1 and I was completely perplexed! Shouldn't the output vector be 11 10 9 8 7 6 4 1 2 3 5 ? Do I have a damaged version of R? I think the simplest explanation can be given as:
S <- c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)
cbind(Index = 1:length(S), S, Order = order(S), Sort = sort(S))
      Index    S Order Sort
 [1,]     1 2465     8  210
 [2,]     2 2255     9  210
 [3,]     3 2085    10  505
 [4,]     4 1545     7  920
 [5,]     5 1335    11 1045
 [6,]     6 1210     6 1210
 [7,]     7  920     5 1335
 [8,]     8  210     4 1545
 [9,]     9  210     3 2085
[10,]    10  505     2 2255
[11,]    11 1045     1 2465
showing that the value of 'order' for any one of the numbers is the Index (position) of that number in the original series, placed in the position that number occupies in the sorted series. (With a tie for S[8] = S[9] = 210). For example: which one of S occurs in 5th position in the sorted series? It is the 11th of S (1045). I became still more astonished when I used the sort function and got the right answer: sort(c(2465, 2255, 2085, 1545, 1335, 1210, 920, 210, 210, 505, 1045)) [1] 210 210 505 920 1045 1210 1335 1545 2085 2255 2465 since 'sort' documentation claims to be using 'order' to establish the right order. Indeed, once you have order(S), you know which element of S to put in each position of the sorted order: S[order(S)] [1] 210 210 505 920 1045 1210 1335 1545 2085 2255 2465 Does this help to explain it? Ted. Please help me to understand all this! Thanks, -Sergio.
- E-Mail: (Ted Harding) ted.hard...@wlandres.net Date: 16-Apr-2013 Time: 19:12:21 This message was sent by XFMail
Re: [R] efficiently diff two data frames
Hello, Maybe Petr Savicky's answer in the link https://stat.ethz.ch/pipermail/r-help/2012-February/304830.html can lead you to what you want. I've changed his function a bit in order to return a logical vector over the rows, where differing rows return TRUE.
setdiffDF2 <- function(A, B){
  f <- function(X, Y) !duplicated(rbind(Y, X))[nrow(Y) + 1:nrow(X)]
  ix1 <- f(A, B)
  ix2 <- f(B, A)
  ix1 & ix2
}
ix <- setdiffDF2(Xe, Xf)
Xe[ix, ]
Xf[ix, ]
Note that this gives no information on the columns. Hope this helps, Rui Barradas Em 16-04-2013 18:42, Liviu Andronic escreveu: Dear all, What is the quickest and most efficient way to diff two data frames, so as to obtain a vector of indices (or logical) for rows/columns that differ in the two data frames? For example, Xe <- head(mtcars) Xf <- head(mtcars) Xf[2:4, 3:5] <- 55 all.equal(Xe, Xf) [1] "Component 3: Mean relative difference: 0.6863118" [2] "Component 4: Mean relative difference: 0.4728435" [3] "Component 5: Mean relative difference: 14.23546" I could use all.equal(), but it only returns human-readable info that cannot be easily used programmatically. It also gives no info on the rows. Another way would be to: require(prob) setdiff(Xe, Xf) mpg cyl disp hp drat wt qsec vs am gear carb Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 But again this doesn't return subsetting indices, nor any info on the columns. Any suggestions on how to approach this? Regards, Liviu
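A self-contained check of the setdiffDF2() idea, with the last line written explicitly as ix1 & ix2, on the Xe/Xf example from the original post:

```r
# Rows where two equally shaped data frames differ: a row counts as
# different only if it appears in neither data frame's row set.
setdiffDF2 <- function(A, B) {
  f <- function(X, Y) !duplicated(rbind(Y, X))[nrow(Y) + 1:nrow(X)]
  f(A, B) & f(B, A)
}

Xe <- head(mtcars)
Xf <- head(mtcars)
Xf[2:4, 3:5] <- 55

which(setdiffDF2(Xe, Xf))   # rows 2, 3 and 4 differ
```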
Re: [R] 10% off Intro R training from RStudio: NYC May 13-14, SF May 20-21
Hi Bert: given what Hadley and Rstudio have provided to the R-community, what's the big deal of letting people know about a class. It's the ideal place to send the notice. and yes, as Barry and John said, every other commercial entity does send to the R-list. Mark On Tue, Apr 16, 2013 at 2:11 PM, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote: On Tue, Apr 16, 2013 at 5:43 PM, Bert Gunter gunter.ber...@gene.com wrote: Hadley: I don't think this is appropriate. Think of what it would be like if everyone shilled their R training and consulting wares here. Everyone does, don't they? A search on Nabble shows up regular postings from XLSolutions, Mango used to post (not seen anything in a while) and Revo sneak the odd commercial in Dave Smith's updates. I don't see anything about non-commercial postings being banned from R-help, but they do seem to be against the spirit of R-help. I suspect commercials sneak in under under 'announcements' in the R-help documentation: R-help: The main R mailing list, for [...] announcements (not covered by R-announce or R-packages, see above) As with everything R, if it bothers the maintainers, then they'll put a stop to it. We users matter not... Barry __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] the joy of spreadsheets (off-topic)
What a terrific article. Thanks for sharing! The more we critically examine how research is actually done the more frightened we become. Frank -- Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 10% off Intro R training from RStudio: NYC May 13-14, SF May 20-21
Hi Bert, We are following the mailing list guidelines to the best of our knowledge (e.g. http://r.789695.n4.nabble.com/R-development-master-class-NYC-Dec-12-13-td4037031.html#a4038699). It's our belief (as shared by others) that advertising our courses falls under the general aegis of helping people learn R. Our goal is for RStudio to be a net positive to R the community. We support the R foundation, R user groups, do a lot of teaching for free, and develop a lot of open-source software like the RStudio IDE, shiny, ggplot2 and devtools. Public courses help fuel our development and hence benefit the R community. Hadley On Tue, Apr 16, 2013 at 11:43 AM, Bert Gunter gunter.ber...@gene.com wrote: Hadley: I don't think this is appropriate. Think of what it would be like if everyone shilled their R training and consulting wares here. Bert Sent from my iPhone -- please excuse typos. On Apr 16, 2013, at 8:09 AM, Hadley Wickham h.wick...@gmail.com wrote: Hi all, At RStudio, we're hosting our Introduction to R Workshop this May in two locations. As an R-help subscriber, we're offering 10% off! * Intro to data science with R (http://goo.gl/bplg3) May 13-14 New York City * Intro to data science with R (http://goo.gl/VCUFL) May 20-21 San Francisco Bay Area What will you learn? Practical skills for visualizing, transforming, and modeling data in R. During this two-day course, you will learn how to explore and understand data as well as how to do basic programming in R. Our courses incorporate a mix of lectures and hands-on learning. Expect to learn about a topic and then immediately put it into practice with a small example. Plenty of help will be available if you get stuck. 
You can read more about our training philosophy at http://www.rstudio.com/training/philosophy.html To see prices, precise locations and to register: * for the NY course: http://rstudio-nyc.eventbrite.com/ * for the SF course: http://rstudio-bay.eventbrite.com/ We have limited discounts for students (66% off) and academics (33% off) - please contact j...@rstudio.com for details. To thank the R-help community for being such a great resource, we'd also like to offer all R-help subscribers a 10% discount. Just enter rhelpftw as a promotional code get 10% off! Regards, Hadley PS. Would you like us to offer these courses (or others!) in your area? Please let us know at http://www.rstudio.com/training/workshops/ -- Chief Scientist, RStudio http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Chief Scientist, RStudio http://had.co.nz/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] avoid losing data.frame attributes on cbind()
Dear all, How should I add several variables to a data frame without losing the attributes of the df? Consider the following:

require(Hmisc)
Xa <- iris
label(Xa, self=TRUE) <- "Some df label"
str(Xa)
'data.frame': 150 obs. of 5 variables:
 $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "label")= chr "Some df label"

Xb <- round(iris[,1:2])
names(Xb) <- c("var1", "var2")
Xc <- cbind(Xa, Xb)  # the attribute is now gone
str(Xc)
'data.frame': 150 obs. of 7 variables:
 $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ var1 : num 5 5 5 5 5 5 5 5 4 5 ...
 $ var2 : num 4 3 3 3 4 4 3 3 3 3 ...

In such cases, when I want to plug some variables from the 2nd df into the 1st df, how should I proceed without losing the attributes of the 1st data frame? And, if possible, I'm looking for something nicer than:

for(i in names(Xb)) Xa[, i] <- Xb[, i]

Regards, Liviu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] the joy of spreadsheets (off-topic)
I tend to live in fear that some spreadsheet calculating a drug dose for me will use my telephone number rather than my weight. John Kane Kingston ON Canada -Original Message- From: f.harr...@vanderbilt.edu Sent: Tue, 16 Apr 2013 13:20:46 -0500 To: r-h...@stat.math.ethz.ch Subject: Re: [R] the joy of spreadsheets (off-topic) What a terrific article. Thanks for sharing! The more we critically examine how research is actually done the more frightened we become. Frank -- Frank E Harrell Jr Professor and Chairman School of Medicine Department of Biostatistics Vanderbilt University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Model ranking (AICc, BIC, QIC) with coxme regression
Hi, I'm actually trying to rank a set of candidate models with an information criterion (AICc, QIC, BIC). The problem I have is that I use mixed-effect Cox regression, which is only available with the package {coxme} (see the example below).

# Model 1
spring.cox <- coxme(Surv(start, stop, Real_rand) ~ strata(Paired) + R4 + R3 + R2 + (R3|Individual), spring)

I've already found some explanations in this forum on how to compute QIC for a coxph object (see the following lines, thanks to M. Basille), but it doesn't work on coxme...

QIC <- function(mod, ...) UseMethod("QIC")
QIC.coxph <- function(mod, details = FALSE) {
  trace <- sum(diag(solve(mod$naive.var) %*% mod$var))
  quasi <- mod$loglik[2]
  return(-2*quasi + 2*trace)
}

The only thing that I can't find in the coxme output to use these previous commands is naive.var, which we can obtain in a coxph regression by specifying robust=TRUE in the argument list:

spring.cox <- coxph(Surv(start, stop, Real_rand) ~ strata(Paired) + R4 + R3 + R2, spring, robust=TRUE)

But coxph doesn't allow inclusion of a random term such as (R3|Individual), and that's why I have to use coxme. I found a new update on R-forge to improve {coxme} (r-forge.r-project.org/scm/viewvc.php/pkg/R/dredge.R?view=log&root=mumin), but I did not understand how it all works and I'm not sure it fixes my problem... Is there someone that can help me with that? Rémi Lesmerises, biol. M.Sc., Candidat Ph.D. en Biologie Université du Québec à Rimouski 300, allée des Ursulines Rimouski, Qc., G5L 3A1 Tél.: 1 800 511-3382 #1241 remilesmeri...@yahoo.ca [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
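For ranking by AICc specifically (rather than QIC), a hand-rolled criterion is straightforward, since coxme fits report a log-likelihood. This is only a sketch, under the assumption that logLik() on the fitted coxme object returns the integrated log-likelihood with a "df" attribute; QIC proper still needs the naive variance, which coxme does not expose.

```r
# AICc from a fitted coxme model (sketch; 'fit' is a coxme object and 'n' is
# whatever sample size you consider appropriate for the small-sample term).
aicc_coxme <- function(fit, n) {
  ll <- logLik(fit)              # assumed: integrated log-likelihood
  k  <- attr(ll, "df")           # assumed: parameter count reported by coxme
  -2 * as.numeric(ll) + 2 * k + 2 * k * (k + 1) / (n - k - 1)
}
```

The last term is the usual finite-sample correction AICc = AIC + 2k(k+1)/(n-k-1); models could then be ranked by calling aicc_coxme() on each candidate.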
Re: [R] I don't understand the 'order' function
I think Duncan said that order and rank were inverses (if there are no ties). order() has period 2, so order(order(x)) is also rank(x) if there are no ties. E.g.,

> data.frame(x, o1=order(x), o2=order(order(x)), o3=order(order(order(x))),
+            o4=order(order(order(order(x)))), rank=rank(x))
      x o1 o2 o3 o4 rank
1  2465  8 11  8 11 11.0
2  2255  9 10  9 10 10.0
3  2085 10  9 10  9  9.0
4  1545  7  8  7  8  8.0
5  1335 11  7 11  7  7.0
6  1210  6  6  6  6  6.0
7   920  5  4  5  4  4.0
8   210  4  1  4  1  1.5
9   210  3  2  3  2  1.5
10  505  2  3  2  3  3.0
11 1045  1  5  1  5  5.0

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Julio Sergio Sent: Tuesday, April 16, 2013 11:10 AM To: r-h...@stat.math.ethz.ch Subject: Re: [R] I don't understand the 'order' function Julio Sergio juliosergio at gmail.com writes: I thought I've understood the 'order' function, using simple examples like: Thanks to you all!... As Sarah said, what was damaged was my understanding ( ;-) )... and as Duncan said, I was confusing 'order' with 'rank', thanks! Now I understand the 'order' function. -Sergio __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
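Bill's point can be checked directly on a tiny tie-free vector:

```r
x <- c(30, 10, 20)  # no ties
order(x)            # 2 3 1 : the indices that would sort x
rank(x)             # 3 1 2 : the position each value takes after sorting
order(order(x))     # 3 1 2 : applying order twice recovers rank (tie-free case)
stopifnot(identical(order(order(x)), as.integer(rank(x))))
```

With ties (like the two 210s in the table above) rank() averages to 1.5 while order(order(x)) must still pick whole positions, which is exactly where the two functions diverge.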
Re: [R] need help with R
Of course, but you should carefully read the guidelines (see bottom of post), and it is a good idea to read Reproducibility https://github.com/hadley/devtools/wiki/Reproducibility and http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example for some useful suggestions on how to describe a problem and lay out your code. Welcome. John Kane Kingston ON Canada -Original Message- From: sam.ting...@gmail.com Sent: Mon, 15 Apr 2013 18:09:25 +1200 To: r-help@r-project.org Subject: [R] need help with R hey there can i email questions to this address to get help with using R thanks sam [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] avoid losing data.frame attributes on cbind()
HI, Not sure if this helps:

library(plyr)
res <- mutate(Xa, var1=round(Sepal.Length), var2=round(Sepal.Width))
str(res)
#'data.frame': 150 obs. of 7 variables:
# $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
# $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
# $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
# $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
# $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# $ var1 : num 5 5 5 5 5 5 5 5 4 5 ...
# $ var2 : num 4 3 3 3 4 4 3 3 3 3 ...
# - attr(*, "label")= chr "Some df label"

A.K.

- Original Message - From: Liviu Andronic landronim...@gmail.com To: r-help r-h...@stat.math.ethz.ch Cc: Sent: Tuesday, April 16, 2013 2:24 PM Subject: [R] avoid losing data.frame attributes on cbind()
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
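A more generic alternative (a sketch, not from the thread): cbind() the two data frames as usual, then copy back any data-frame-level attributes from the first one (everything except names, row.names and class, which cbind manages itself).

```r
# Hypothetical helper: cbind that preserves data-frame-level attributes
# such as the "label" attribute that Hmisc's label() sets.
cbind_keep_attrs <- function(a, b) {
  out <- cbind(a, b)
  keep <- setdiff(names(attributes(a)), c("names", "row.names", "class"))
  for (at in keep) attr(out, at) <- attr(a, at)
  out
}

Xa <- iris
attr(Xa, "label") <- "Some df label"   # plain attr(); label(Xa, self=TRUE) sets the same slot
Xb <- data.frame(var1 = round(iris[, 1]), var2 = round(iris[, 2]))
Xc <- cbind_keep_attrs(Xa, Xb)
attr(Xc, "label")                      # the label survives the cbind
```

This avoids the column-by-column for() loop and works for any number of added columns.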
Re: [R] Path Diagram
Dear Laura, This works for me. Is dot on your system path? Best, John --- John Fox Senator McMaster Professor of Social Statistics Department of Sociology McMaster University Hamilton, Ontario, Canada -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-bounces@r- project.org] On Behalf Of Laura Thomas Sent: Tuesday, April 16, 2013 1:17 PM To: r-help@r-project.org Subject: [R] Path Diagram Hi All, Apologies if this has been answered somewhere else, but I have been searching for an answer all day and not been able to find one. I am trying to plot a path diagram for a CFA I have run, I have installed Rgraphviz and run the following:

pathDiagram(cfa, min.rank='item1, item2, item3, item4, item5, item6, item7, item8, item9, item10, item11, item12', max.rank='SMP, AAAS', file='documents')

I get the following message and output:

Running dot -Tpdf -o documents.pdf documents.dot
digraph "cfa" {
  rankdir=LR;
  size="8,8";
  node [fontname="Helvetica" fontsize=14 shape=box];
  edge [fontname="Helvetica" fontsize=10];
  center=1;
  {rank=min "item1" "item2" "item3" "item4" "item5" "item6" "item7" "item8" "item9" "item10" "item11" "item12"}
  {rank=max "SMP" "AAAS"}
  "SMP" [shape=ellipse]
  "AAAS" [shape=ellipse]
  "SMP" -> "item1" [label="smp0"];
  "SMP" -> "item3" [label="smp1"];
  "SMP" -> "item4" [label="smp2"];
  "SMP" -> "item6" [label="smp3"];
  "SMP" -> "item8" [label="smp4"];
  "SMP" -> "item10" [label="smp5"];
  "SMP" -> "item11" [label="smp6"];
  "AAAS" -> "item2" [label="aaas0"];
  "AAAS" -> "item5" [label="aaas1"];
  "AAAS" -> "item7" [label="aaas2"];
  "AAAS" -> "item9" [label="aaas3"];
  "AAAS" -> "item12" [label="aaas4"];
}

How do I get to see the graph? Many thanks, Laura Laura Thomas PhD Student- Sport and Exercise Psychology Department of Sport and Exercise Penglais Campus Aberystywth University Aberystwyth 01970621947 l...@aber.ac.uk www.aber.ac.uk/en/sport-exercise/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code.
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with handling of attributes in xmlToList in XML package
Are my emails getting through? 2013/4/14 santiago gil sg.c...@gmail.com: Hello all, I have a problem with the way attributes are dealt with in the function xmlToList(), and I haven't been able to figure it out for days now. Say I have a document (produced by nmap) like this:

mydoc <- '<host starttime="1365204834" endtime="1365205860"><status state="up" reason="echo-reply" reason_ttl="127"/>
<address addr="XXX.XXX.XXX.XXX" addrtype="ipv4"/>
<ports><port protocol="tcp" portid="135"><state state="open" reason="syn-ack" reason_ttl="127"/><service name="msrpc" product="Microsoft Windows RPC" ostype="Windows" method="probed" conf="10"><cpe>cpe:/o:microsoft:windows</cpe></service></port>
<port protocol="tcp" portid="139"><state state="open" reason="syn-ack" reason_ttl="127"/><service name="netbios-ssn" method="probed" conf="10"/></port>
</ports> <times srtt="647" rttvar="71" to="10"/>
</host>'

I want to store this as a list of lists, so I do:

mytree <- xmlTreeParse(mydoc)
myroot <- xmlRoot(mytree)
mylist <- xmlToList(myroot)

Now my problem is that when I want to fetch the attributes of the services running on each port, the behavior is not consistent:

mylist[["ports"]][[1]][["service"]]$.attrs["name"]
   name
"msrpc"
mylist[["ports"]][[2]][["service"]]$.attrs["name"]
Error in trash_list[["ports"]][[2]][["service"]]$.attrs :
  $ operator is invalid for atomic vectors

I understand that the way they are defined in the document is not the same, but I think there still should be consistent behavior. I've tried many combinations of parameters for xmlTreeParse() but nothing has helped me. I can't find a way to call up the name of the service consistently, regardless of whether the node has children or not. Any tips? All the best, S.G. -- --- http://barabasilab.neu.edu/people/gil/ -- --- http://barabasilab.neu.edu/people/gil/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] 10% off Intro R training from RStudio: NYC May 13-14, SF May 20-21
On Apr 16, 2013, at 12:43, Bert Gunter gunter.ber...@gene.com wrote: Hadley: I don't think this is appropriate. Think of what it would be like if everyone shilled their R training and consulting wares here. Echoing others, this seems an accepted practice on the lists, endorsed at least in one instance by Peter Dalgaard: https://stat.ethz.ch/pipermail/r-help/2011-November/295496.html. Similarly, Dirk has sent brief announcements for Rcpp training on the Rcpp list: http://permalink.gmane.org/gmane.comp.lang.r.rcpp/2334 I believe on R-SIG-HPC as well, but I don't have a link handy. List moderators will -- and have -- stepped in when it gets spammy: http://comments.gmane.org/gmane.comp.lang.r.hpc/1338 Michael [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with handling of attributes in xmlToList in XML package
Yes. This is the third such copy. You can view them all in the Archive, starting with the first one: https://stat.ethz.ch/pipermail/r-help/2013-April/351504.html On Apr 16, 2013, at 11:49 AM, santiago gil wrote: Are my emails getting through? David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with handling of attributes in xmlToList in XML package
Hi, Santiago: Yes, your e-mail has been received. I'm sorry, I can't solve your question. Regards. Eva --- On Tue, 16/4/13, santiago gil sg.c...@gmail.com wrote: From: santiago gil sg.c...@gmail.com Subject: Re: [R] Problem with handling of attributes in xmlToList in XML package To: r-help@r-project.org Date: Tuesday, 16 April 2013, 20:49 Are my emails getting through? [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Strange error with log-normal models
On Wed, Apr 17, 2013 at 5:19 AM, Noah Silverman noahsilver...@ucla.edu wrote: Hi, I have some data, that when plotted looks very close to a log-normal distribution. My goal is to build a regression model to test how this variable responds to several independent variables. [snip] When I try to build a simple model, I also get an error:

l <- glm(y ~ x, family = gaussian(link = "log"))
Error in eval(expr, envir, enclos) :
  cannot find valid starting values: please specify some

Duncan has described the problems with the lognormal. I will just point out that this 'simple model' is not lognormal. It is a model with normal errors and log link, i.e. y ~ N(mu, sigma^2), log(mu) = x beta. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] converting blank cells to NAs
On Apr 16, 2013, at 6:38 AM, arun wrote: Hi, I am not sure about the problem. If your non-numeric vector is like: "a,b,,d,e,,f"

vec1 <- unlist(str_split(readLines(textConnection("a,b,,d,e,,f")), ","))
vec1[vec1 == ""] <- NA
vec1
#[1] "a" "b" NA  "d" "e" NA  "f"

If this doesn't work, please provide an example vector. A.K.

Thanks for the response. That seems to do the trick as far as replacing the empty cells with NA; however, the problem remains that the vector is not numeric. This was the reason I wanted to replace the empty cells with NAs in the first place. Forcing the vector with as.numeric afterwards doesn't seem to work either; I get nonsensical results.

In R there are actually multiple versions of NA, and in the case of character objects the reserved name is `NA_character_`, not NA. You can also use `is.na<-`:

# Method: `is.na<-`
vec <- sample(c(letters[1:5], ""), 20, replace=TRUE)
is.na(vec) <- vec == ""
vec

# Method: assign NA_character_
vec <- sample(c(letters[1:5], ""), 20, replace=TRUE)
vec[vec == ""] <- NA_character_
vec

-- David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
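Putting the two steps together for the numeric case (a small self-contained sketch; the vector is made up): once the blanks are real NAs, as.numeric() converts cleanly instead of producing warnings for the empty strings.

```r
vec <- c("1.5", "", "3", "", "7.25")   # character data with blank cells
vec[vec == ""] <- NA                   # blanks become real NAs first
num <- as.numeric(vec)                 # then the conversion is clean
num
# [1] 1.50   NA 3.00   NA 7.25
stopifnot(identical(num, c(1.5, NA, 3, NA, 7.25)))
```

If as.numeric() still gives "nonsensical results" after this, the column was probably read as a factor; converting via as.numeric(as.character(f)) is the usual fix in that case.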
[R] Singular design matrix in rq
Quantreggers: I'm trying to run rq() on a dataset I posted at: https://docs.google.com/file/d/0B8Kij67bij_ASUpfcmJ4LTFEUUk/edit?usp=sharing (it's a 1500kb csv file named singular.csv) and am getting the following error:

mydata <- read.csv("singular.csv")
fit_spl <- rq(raw_data[,1] ~ bs(raw_data[,i], df=15), tau=1)
Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix

Any ideas what might be causing this or, more importantly, suggestions for how to solve it? I'm just trying to fit a smoothed hull to the top of the data cloud (hence the large df). Thanks! --jonathan -- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
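Two things commonly trigger this error and are worth trying first (a sketch, not a diagnosis of this particular dataset; the column indices and df are illustrative): a spline basis with more degrees of freedom than the distinct x-values can support, and tau = 1 itself, which asks for the extreme upper envelope and is numerically fragile.

```r
library(quantreg)
library(splines)

# Illustrative: y in column 1, x in column 2 of the posted csv.
x <- mydata[, 2]
y <- mydata[, 1]
ok <- is.finite(x) & is.finite(y)          # drop NAs/Inf that break the basis

# Fewer spline df, and a tau just under 1 for a near-upper-hull fit.
fit_spl <- rq(y[ok] ~ bs(x[ok], df = 8), tau = 0.99)
```

If df = 8 works but df = 15 does not, stepping df up until the fit breaks shows how much flexibility the data actually supports.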
Re: [R] Problem with handling of attributes in xmlToList in XML package
Hi, On Apr 16, 2013, at 2:49 PM, santiago gil wrote: 2013/4/14 santiago gil sg.c...@gmail.com: Hello all, I have a problem with the way attributes are dealt with in the function xmlToList(), and I haven't been able to figure it out for days now.

I have not used xmlToList(), but I find what you try below works if you specify useInternalNodes = TRUE in your invocation of xmlTreeParse. Often that is the solution for many issues with xml. Also, I have found it best to write a relatively generic getter style function. So, in the example below I have written a function called getPortAttr - it will get attributes for the child node you name. I used your example as the defaults: "service" is the child to query and "name" is the attribute to retrieve from that child. It's a heck of a lot easier to write a function than building the longish parse strings with lots of [[this]][[and]][[that]] stuff, and it is reusable to boot. Cheers, Ben

library(XML)
mydoc <- '<host starttime="1365204834" endtime="1365205860">
<status state="up" reason="echo-reply" reason_ttl="127"/>
<address addr="XXX.XXX.XXX.XXX" addrtype="ipv4"/>
<ports>
<port protocol="tcp" portid="135">
<state state="open" reason="syn-ack" reason_ttl="127"/>
<service name="msrpc" product="Microsoft Windows RPC" ostype="Windows" method="probed" conf="10">
<cpe>cpe:/o:microsoft:windows</cpe>
</service>
</port>
<port protocol="tcp" portid="139">
<state state="open" reason="syn-ack" reason_ttl="127"/>
<service name="netbios-ssn" method="probed" conf="10"/>
</port>
</ports>
<times srtt="647" rttvar="71" to="10"/>
</host>'

mytree <- xmlTreeParse(mydoc, useInternalNodes = TRUE)
myroot <- xmlRoot(mytree)
myports <- myroot[["ports"]]["port"]

getPortAttr <- function(x, child = "service", attr = "name") {
  kid <- x[[child]]
  att <- xmlAttrs(kid)[[attr]]
  att
}

portNames <- sapply(myports, getPortAttr)
# portNames
#        port        port
#     "msrpc" "netbios-ssn"

portReason <- sapply(myports, getPortAttr, child = "state", attr = "reason")
# portReason
#      port      port
# "syn-ack" "syn-ack"

Ben Tupper Bigelow Laboratory for Ocean Sciences 60 Bigelow Drive, P.O. Box 380 East Boothbay, Maine 04544 http://www.bigelow.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Strange error with log-normal models
@Duncan, You make a very good point. Somehow I overlooked that 0 is not positive. I guess that rules out the log-normal model. My challenge here is finding the right model for this data. Originally it was a nice count of students, relatively easy to model with a zero-inflated Poisson model, and the resulting residuals seemed reasonable. However, I was then instructed to change the count of students to a rate, calculated as students/population (each school has its own population). This is no longer a count variable but a proportion between 0 and 1. This rate (students/population) is no longer Poisson, but is certainly not normal either. So, I'm a bit lost as to the appropriate distribution to represent it. Any thoughts? -- Noah Silverman, M.S. UCLA Department of Statistics 8117 Math Sciences Building Los Angeles, CA 90095 On Apr 16, 2013, at 12:44 PM, Thomas Lumley tlum...@uw.edu wrote: On Wed, Apr 17, 2013 at 5:19 AM, Noah Silverman noahsilver...@ucla.edu wrote: Hi, I have some data, that when plotted looks very close to a log-normal distribution. My goal is to build a regression model to test how this variable responds to several independent variables. [snip] When I try to build a simple model, I also get an error: l <- glm(y ~ x, family = gaussian(link = "log")) Error in eval(expr, envir, enclos) : cannot find valid starting values: please specify some Duncan has described the problems with the lognormal. I will just point out that this 'simple model' is not lognormal. It is a model with normal errors and log link, ie. y ~ N(mu, sigma^2), log(mu) = x beta -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] R process slow down after a amount of time
On 16/04/13 15:52, Chris82 wrote: Hi R users, I have noticed that R is getting slower if a process with a loop runs for a while. Is that normal? Let's say I have code which produces an output file after each loop run. Now, after 10, 15 or 20 loop runs, the time between the created files is strongly increasing. Is there maybe some data that fills up memory? Chris

I tried your idea but I don't find any time difference. Could you be more explicit? Marc

# loop time ##
tm <- rep(Sys.time(), 1000)
k <- 1
for (i in 1:1e7) {
  if (i %% 10000 == 0) {
    tm[k] <- Sys.time()
    k <- k + 1
  }
}
plot(1:999, diff(tm), bty="n", type="l", ylim=c(0, 0.05))

-- __ Marc Girondot, Pr Laboratoire Ecologie, Systématique et Evolution Equipe de Conservation des Populations et des Communautés CNRS, AgroParisTech et Université Paris-Sud 11, UMR 8079 Bâtiment 362 91405 Orsay Cedex, France Tel: 33 1 (0)1.69.15.72.30 Fax: 33 1 (0)1.69.15.73.53 e-mail: marc.giron...@u-psud.fr Web: http://www.ese.u-psud.fr/epc/conservation/Marc.html Skype: girondot __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
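One way to test Chris's suspicion that accumulating data is the culprit (a sketch, not from the thread): log the memory usage reported by gc() at intervals during the loop and look for an upward trend alongside the slowdown.

```r
# Record total MB in use (across both of R's heaps) on each pass;
# a steadily climbing curve suggests objects are accumulating.
mem <- numeric(50)
for (i in seq_along(mem)) {
  # ... one unit of the real per-iteration work would go here ...
  mem[i] <- sum(gc()[, 2])   # column 2 of gc()'s summary matrix is "(Mb)" used
}
plot(mem, type = "l", xlab = "iteration", ylab = "MB in use")
```

If the curve climbs, the usual suspects are growing a vector or data frame inside the loop (cure: preallocate, as Marc's tm example does) or keeping file connections/objects alive across iterations.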
Re: [R] Strange error with log-normal models
Noah, You might want to look at beta regression, using the betareg package on CRAN. There is a JSS paper here that you might find helpful: http://www.jstatsoft.org/v34/i02/paper along with the vignettes for the package: http://cran.r-project.org/web/packages/betareg/vignettes/betareg.pdf http://cran.r-project.org/web/packages/betareg/vignettes/betareg-ext.pdf Regards, Marc Schwartz On Apr 16, 2013, at 3:20 PM, Noah Silverman noahsilver...@ucla.edu wrote: @Duncan, You make a very good point. Somehow I overlooked that 0 is not positive. I guess that rules out the log normal model. My challenge here is finding the right model for this data. Originally it was a nice count of students. Relatively easy to model with a zero inflated Poisson model. The resulting residuals seemed reasonable. However, I was then instructed to change the count of students to a rate which was calculated as students / population (Each school has its own population.)) This is now no longer a count variable, but a proportion between 0 and 1. This rate (students/population) is no longer Poisson, but is certainly not normal either. So, I'm a bit lost as to the appropriate distribution to represent it. Any thoughts? -- Noah Silverman, M.S. UCLA Department of Statistics 8117 Math Sciences Building Los Angeles, CA 90095 On Apr 16, 2013, at 12:44 PM, Thomas Lumley tlum...@uw.edu wrote: On Wed, Apr 17, 2013 at 5:19 AM, Noah Silverman noahsilver...@ucla.edu wrote: Hi, I have some data, that when plotted looks very close to a log-normal distribution. My goal is to build a regression model to test how this variable responds to several independent variables. [snip] When I try to build a simple model, I also get an error: l - glm(y~ x, family=gaussian(link=log)) Error in eval(expr, envir, enclos) : cannot find valid starting values: please specify some Duncan has described the problems with the lognormal. I will just point out that this 'simple model' is not lognormal. 
It is a model with normal errors and a log link, i.e.

    y ~ N(mu, sigma^2),  log(mu) = x beta

-thomas

-- Thomas Lumley Professor of Biostatistics University of Auckland

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
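A minimal sketch of the beta-regression route Marc suggests, on simulated stand-in data (the `betareg` package is assumed installed; `rate`, `x`, and the coefficients here are invented for illustration). Note that betareg() needs the response strictly inside (0,1), so exact zeros — likely in the students/population rates — must be handled first, e.g. by the (y*(n-1) + 0.5)/n shrinkage of Smithson & Verkuilen, or by a zero-inflated approach instead.

```r
# Beta regression sketch for a (0,1) rate outcome; data are simulated
# placeholders, not the poster's school data.
library(betareg)

set.seed(1)
n  <- 200
x  <- rnorm(n)
mu <- plogis(-1 + 0.5 * x)              # mean on the (0,1) scale
rate <- rbeta(n, mu * 20, (1 - mu) * 20)  # precision phi = 20

fit <- betareg(rate ~ x)   # logit link for the mean by default
summary(fit)
```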
Re: [R] avoid losing data.frame attributes on cbind()
Hi, Another method would be:

Xc <- Xa
Xc$var1 <- NA; Xc$var2 <- NA
Xc[] <- append(as.list(Xa), as.list(Xb))
str(Xc)
#'data.frame': 150 obs. of 7 variables:
# $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
# $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
# $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
# $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
# $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# $ var1 : num 5 5 5 5 5 5 5 5 4 5 ...
# $ var2 : num 4 3 3 3 4 4 3 3 3 3 ...
# - attr(*, "label")= chr "Some df label"

A.K.

----- Original Message -----
From: arun smartpink...@yahoo.com
To: Liviu Andronic landronim...@gmail.com
Cc: R help r-help@r-project.org
Sent: Tuesday, April 16, 2013 2:40 PM
Subject: Re: [R] avoid losing data.frame attributes on cbind()

HI, Not sure if this helps:

library(plyr)
res <- mutate(Xa, var1 = round(Sepal.Length), var2 = round(Sepal.Width))
str(res)
#'data.frame': 150 obs. of 7 variables:
# $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
# $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
# $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
# $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
# $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# $ var1 : num 5 5 5 5 5 5 5 5 4 5 ...
# $ var2 : num 4 3 3 3 4 4 3 3 3 3 ...
#- attr(*, "label")= chr "Some df label"

A.K.

----- Original Message -----
From: Liviu Andronic landronim...@gmail.com
To: r-help r-h...@stat.math.ethz.ch
Sent: Tuesday, April 16, 2013 2:24 PM
Subject: [R] avoid losing data.frame attributes on cbind()

Dear all, How should I add several variables to a data frame without losing the attributes of the df? Consider the following:

require(Hmisc)
Xa <- iris
label(Xa, self=T) <- "Some df label"
str(Xa)
'data.frame': 150 obs. of 5 variables:
 $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "label")= chr "Some df label"

Xb <- round(iris[, 1:2])
names(Xb) <- c("var1", "var2")
Xc <- cbind(Xa, Xb)  # the attribute is now gone
str(Xc)
'data.frame': 150 obs. of 7 variables:
 $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
 $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
 $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
 $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
 $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ var1 : num 5 5 5 5 5 5 5 5 4 5 ...
 $ var2 : num 4 3 3 3 4 4 3 3 3 3 ...

In such cases, when I want to plug some variables from the 2nd df into the 1st df, how should I proceed without losing the attributes of the 1st data frame? And, if possible, I'm looking for something nicer than:

for(i in names(Xb)) Xa[, i] <- Xb[, i]

Regards, Liviu

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
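A base-R sketch of yet another option, without Hmisc or plyr: since cbind() rebuilds the data frame and drops its data-frame-level attributes, one can simply copy the attribute back afterwards (the `"label"` attribute here is set with plain attr() rather than Hmisc's label(), to keep the example dependency-free).

```r
# cbind() drops the data frame's own attributes; copy them back afterwards.
Xa <- iris
attr(Xa, "label") <- "Some df label"   # stand-in for Hmisc's label()

Xb <- round(iris[, 1:2])
names(Xb) <- c("var1", "var2")

Xc <- cbind(Xa, Xb)
attr(Xc, "label") <- attr(Xa, "label") # restore what cbind() dropped

attr(Xc, "label")
# [1] "Some df label"
```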
Re: [R] avoid losing data.frame attributes on cbind()
Just to add:

Xc[] <- append(Xa, Xb)  # should also work
str(Xc)
#'data.frame': 150 obs. of 7 variables:
# $ Sepal.Length: num 5.1 4.9 4.7 4.6 5 5.4 4.6 5 4.4 4.9 ...
# $ Sepal.Width : num 3.5 3 3.2 3.1 3.6 3.9 3.4 3.4 2.9 3.1 ...
# $ Petal.Length: num 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 ...
# $ Petal.Width : num 0.2 0.2 0.2 0.2 0.2 0.4 0.3 0.2 0.2 0.1 ...
# $ Species : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1 ...
# $ var1 : num 5 5 5 5 5 5 5 5 4 5 ...
# $ var2 : num 4 3 3 3 4 4 3 3 3 3 ...
# - attr(*, "label")= chr "Some df label"

A.K.

----- Original Message -----
From: arun smartpink...@yahoo.com
To: Liviu Andronic landronim...@gmail.com
Cc: R help r-help@r-project.org
Sent: Tuesday, April 16, 2013 4:45 PM
Subject: Re: [R] avoid losing data.frame attributes on cbind()

[snip]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Strange error with log-normal models
On Apr 16, 2013, at 22:20 , Noah Silverman wrote:

[snip] This rate (students/population) is no longer Poisson, but is certainly not normal either. So, I'm a bit lost as to the appropriate distribution to represent it. Any thoughts?

Off the cuff: Could it be more natural to model as a ZIP with log(pop) as an offset on the log-lambda scale?

-- Peter Dalgaard, Professor, Center for Statistics, Copenhagen Business School, Solbjerg Plads 3, 2000 Frederiksberg, Denmark. Phone: (+45)38153501 Email: pd@cbs.dk Priv: pda...@gmail.com

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
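A sketch of Peter's offset suggestion, assuming the `pscl` package for the zero-inflated Poisson fit: keep the response as a count and put log(population) in as an offset, so the count part of the model is effectively for the rate. The data frame `schooldata` and predictors `x1`, `x2` are hypothetical placeholders, not from the thread.

```r
# ZIP with log(population) as an offset on the log-lambda scale
# ('schooldata', x1, x2 are invented names for illustration).
library(pscl)

fit <- zeroinfl(students ~ x1 + x2 + offset(log(population)) | 1,
                data = schooldata, dist = "poisson")
summary(fit)
```

The part after `|` specifies the zero-inflation model; `| 1` keeps it to an intercept only.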
Re: [R] testInstalledBasic / testInstalledPackages
Hi Marc, Thank you for the links to all the resources, I will be sure to review them in detail. As for running

Sys.setenv(LC_COLLATE = "C", LANGUAGE = "en")

I'm sorry that I forgot to mention that I did set the above environment variables as specified, both within R, as suggested in your email, and also by adding them as system environment variables (as required on our Windows 2008 Server environment).

Thanks again, Trina Patel

On Tue, Apr 16, 2013 at 10:52 AM, Marc Schwartz marc_schwa...@me.com wrote:

On Apr 16, 2013, at 11:44 AM, Trina Patel trinarpa...@gmail.com wrote:

Hi, I installed R 3.0.0 on a Windows 2008 Server. When I submitted the following code in R64,

library(tools)
testInstalledBasic(scope = "devel")

I get the following message in the R Console:

running tests of consistency of as/is.*
  creating ‘isas-tests.R’
  running code in ‘isas-tests.R’
  comparing ‘isas-tests.Rout’ to ‘isas-tests.Rout.save’ ...2550a2551
running tests of random deviate generation -- fails occasionally
  running code in ‘p-r-random-tests.R’
  comparing ‘p-r-random-tests.Rout’ to ‘p-r-random-tests.Rout.save’ ... OK
running tests of primitives
  running code in ‘primitives.R’
running regexp regression tests
  running code in ‘utf8-regex.R’
running tests to possibly trigger segfaults
  creating ‘no-segfault.R’
  running code in ‘no-segfault.R’
Warning message:
running command 'diff -bw C:\Users\TRINA_~1\AppData\Local\Temp\Rtmp2FwZXW\Rdiffa1a88562f12b C:\Users\TRINA_~1\AppData\Local\Temp\Rtmp2FwZXW\Rdiffb1a8848c57620' had status 1

When I compare the isas-tests.Rout to isas-tests.Rout.save, as well as the two diff files listed above, it seems that there is one extra empty line in isas-tests.Rout.save. Is there any way to fix this error without modifying the isas-tests.Rout.save file?

Next I submitted the following code,

testInstalledPackages(scope = "base")

and got the message below in my R console:

Testing examples for package ‘base’
Testing examples for package ‘tools’
  comparing ‘tools-Ex.Rout’ to ‘tools-Ex.Rout.save’ ... 621c621
[1] 0cce1e42ef3fb133940946534fcf8896
---
[1] eb723b61539feef013de476e68b5c50a

When comparing the files tools-Ex.Rout and tools-Ex.Rout.save, it seems this difference indicates an error in the md5sums for the file C:\Program Files\R\R-3.0.0\COPYING. Does this indicate a problem with my installation? Looking at the file C:\Program Files\R\R-3.0.0\MD5 leads me to suspect there might be an error in the test itself. Thanks for the help!

See: http://cran.r-project.org/doc/manuals/r-release/R-admin.html#Testing-a-Windows-Installation from the R Installation and Administration Manual.

Try running:

Sys.setenv(LC_COLLATE = "C", LANGUAGE = "en")

before you run the tests. You might also want to have a look at: https://github.com/marcschwartz/R-IQ-OQ

Regards, Marc Schwartz

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
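The checksum comparison in question is the kind done by tools::md5sum(); a quick self-check that it behaves as expected, using a temporary file whose contents give a well-known MD5:

```r
# tools::md5sum() computes the same kind of checksums the installation
# tests compare; verify it on a file with known contents.
f <- tempfile()
writeLines("test", f)     # file contents are "test\n"
unname(tools::md5sum(f))
# [1] "d8e8fca2dc0f896fd7cb4cb0031ba249"
```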
Re: [R] the joy of spreadsheets (off-topic)
On 04/17/2013 03:25 AM, Sarah Goslee wrote: ... Ouch. (Note: I know nothing about the site, the author of the article, or the study in question. I was pointed to it by someone else. But if true: highly problematic.) Sarah

There seem to be three major problems described here, and only one is marginally related to Excel (and similar spreadsheets). Cherry-picking data is all too common. Almost anyone who reviews papers for publication will have encountered it, and there are excellent books describing examples that have had great influence on public policy. Similarly, applying obscure and sometimes inappropriate statistical methods that produce the desired results when nothing else will appears with depressing frequency.

The final point does relate to Excel and any application that hides what is going on from the casual observer. I will treasure this URL to give to anyone who chastises my moaning when I have to perform some task in Excel. It is not an error in the application (although these certainly exist) but a salutary caution to those who think that if a reasonable-looking number appears in a cell, it must be the correct answer. I have found not one but two such errors in the simple calculation of an age from the date of birth and date of death.

Jim

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Understanding why a GAM can't suppress an intercept
Dear List, I've just tried to specify a GAM without an intercept -- I've got one of the (rare) cases where it is appropriate for E(y) -> 0 as X -> 0. Naively running a GAM with -1 appended to the formula and then calling predict.gam, I see that the model isn't behaving as expected. I don't understand why this would be. Google turns up this old R-help thread: http://r.789695.n4.nabble.com/GAM-without-intercept-td4645786.html

Simon writes:

Smooth terms are constrained to sum to zero over the covariate values. This is an identifiability constraint designed to avoid confounding with the intercept (particularly important if you have more than one smooth). If you remove the intercept from your model altogether (m2) then the smooth will still sum to zero over the covariate values, which in your case will mean that the smooth is quite a long way from the data. When you include the intercept (m1) then the intercept is effectively shifting the constrained curve up towards the data, and you get a nice fit.

Why? I haven't read Simon's book in great detail, though I have read Ruppert et al.'s Semiparametric Regression. I don't see a reason why a penalized spline model shouldn't equal the intercept (or zero) when all of the regressors equal zero. Is anyone able to help with a bit of intuition? Or relevant passages from a good description of why this would be the case? Furthermore, why does the -1 formula specification work if it doesn't work as intended by, for example, lm?

Many thanks, Andrew

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
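A numerical look at the constraint Simon describes (mgcv is a recommended package that ships with R; the data here are simulated). After mgcv's identifiability reparameterisation, each smooth's model-matrix columns are centred over the observed covariate values, so the fitted smooth sums to zero over the data no matter what the coefficients are — which is why dropping the intercept cannot pull the curve toward the data:

```r
# The smooth's model-matrix columns sum to (numerically) zero over the data.
library(mgcv)

set.seed(1)
d <- data.frame(x = runif(100))
d$y <- d$x^2 + rnorm(100, sd = 0.1)

m <- gam(y ~ s(x), data = d)
X <- model.matrix(m)             # column 1 is the intercept
max(abs(colSums(X[, -1])))       # effectively zero for the smooth columns
```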
Re: [R] I don't understand the 'order' function
William Dunlap wdunlap at tibco.com writes: I think Duncan said that order and rank were inverses (if there are no ties). order() has period 2 so order(order(x)) is also rank(x) if there are no ties. E.g., Thanks William! This is very interesting. So, applying order two times I can have a rank index for each element. Thanks, -Sergio. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
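A short worked example of Bill's point, with no ties: order(x) gives the positions that sort x, and applying order() to that permutation inverts it, which is exactly the rank vector.

```r
x <- c(10, 3, 7)
order(x)         # 2 3 1  (positions that sort x)
order(order(x))  # 3 1 2  (inverse permutation = the ranks)
rank(x)          # 3 1 2
```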
Re: [R] Understanding why a GAM can't have an intercept
please delete this thread -- wrong title

On 04/16/2013 02:35 PM, Andrew Crane-Droesch wrote: [snip]

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Singular design matrix in rq
Have you looked at the result of bs(raw_data[,i], df=15)? If there are not many unique values in the input there will be a lot of NaN's in the output (because there are repeated knots) and those NaN's will cause rq() to give that message. E.g.,

d <- data.frame(y=sin(1:100), x4=rep(1:4,each=25), x50=rep(1:50,each=2))
rq(data=d, y ~ bs(x4, df=15), tau=.8)  # using x50 works
Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix
with(d, bs(x4, df=15))
       1 2 3 4 5 6 7 8 9 10 11  12  13  14  15
  [1,] 0 0 1 0 0 0 0 0 0  0  0   0   0   0   0
  [2,] 0 0 1 0 0 0 0 0 0  0  0   0   0   0   0
  [3,] 0 0 1 0 0 0 0 0 0  0  0   0   0   0   0
...
 [98,] 0 0 0 0 0 0 0 0 0  0  0 NaN NaN NaN NaN
 [99,] 0 0 0 0 0 0 0 0 0  0  0 NaN NaN NaN NaN
[100,] 0 0 0 0 0 0 0 0 0  0  0 NaN NaN NaN NaN
attr(,"degree")
[1] 3
attr(,"knots")
7.692308% 15.38462% 23.07692% 30.76923% 38.46154%
        1         1         1         2         2
46.15385% 53.84615% 61.53846% 69.23077% 76.92308%
        2         3         3         3         4
84.61538% 92.30769%
        4         4
attr(,"Boundary.knots")
[1] 1 4
attr(,"intercept")
[1] FALSE
attr(,"class")
[1] "bs" "basis" "matrix"

Bill Dunlap Spotfire, TIBCO Software wdunlap tibco.com

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Jonathan Greenberg
Sent: Tuesday, April 16, 2013 12:58 PM
To: r-help; Roger Koenker
Subject: [R] Singular design matrix in rq

Quantreggers: I'm trying to run rq() on a dataset I posted at: https://docs.google.com/file/d/0B8Kij67bij_ASUpfcmJ4LTFEUUk/edit?usp=sharing (it's a 1500kb csv file named singular.csv) and am getting the following error:

mydata <- read.csv("singular.csv")
fit_spl <- rq(raw_data[,1] ~ bs(raw_data[,i], df=15), tau=1)
Error in rq.fit.br(x, y, tau = tau, ...) : Singular design matrix

Any ideas what might be causing this or, more importantly, suggestions for how to solve this? I'm just trying to fit a smoothed hull to the top of the data cloud (hence the large df). Thanks!

--jonathan

-- Jonathan A. Greenberg, PhD Assistant Professor Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Department of Geography and Geographic Information Science University of Illinois at Urbana-Champaign 607 South Mathews Avenue, MC 150 Urbana, IL 61801 Phone: 217-300-1924 http://www.geog.illinois.edu/~jgrn/ AIM: jgrn307, MSN: jgrn...@hotmail.com, Gchat: jgrn307, Skype: jgrn3007

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
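A quick pre-check along the lines Bill describes, before asking bs() for a large df: the number of unique covariate values bounds how many knots make sense, and a df within that range gives a clean basis (splines ships with base R).

```r
# Few unique x values cannot support many quantile-based knots.
library(splines)

x4 <- rep(1:4, each = 25)
length(unique(x4))   # only 4 unique values -- far fewer than df = 15 needs

B <- bs(x4, df = 3)  # a df the data can support gives a clean basis
any(is.nan(B))
# [1] FALSE
```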
[R] plot 2 y axis
Hi, I want to plot two variables on the same graph but with two y axes, just like what you can do in Excel. I searched online and it seems you cannot achieve that in ggplot. So is there any way I can do it nicely in base plot? Suppose my data looks like this:

Weight Height Date
0.1    0.3    1
0.2    0.4    2
0.3    0.8    3
0.6    1      4

I want to have Date as the x axis, Weight as the left y axis and Height as the right y axis. Thanks.

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Problem with handling of attributes in xmlToList in XML package
I apologize for the multiple posting then, it's just that I received those emails saying that my post was awaiting approval and more than four days went by without news. Sorry for the lack of patience.

Thank you very much, Ben. Indeed that's how I've been doing it so far, but I have accrued too many reasons not to work with the XML object any more and move all my coding to a list formulation. I wonder what you mean with "[...] but I find what you try below works if you specify useInternalNodes = TRUE in your invocation of xmlTreeParse". Actually, the output error that I included happens when I use useInternalNodes=TRUE (my bad). If I use useInternalNodes=FALSE I get

mylist[["ports"]][[2]][["service"]]$.attrs["name"]
NULL

The useInternalNodes clause has proven fatally dangerous for me before. If I parse a tree with useInternalNodes=TRUE, save the workspace, close R and reopen it, load the workspace and try to read the tree, it will completely crash my computer, which has already cost me too many lost days of work. On the other hand, useInternalNodes=FALSE will result in any xml operation being ridiculously slow. So the intention was to move everything to a more R-friendly object like a list. Any tips?

Best, Santiago

2013/4/16 Ben Tupper btup...@bigelow.org:

Hi, On Apr 16, 2013, at 2:49 PM, santiago gil wrote: [snip]

I have not used xmlToList(), but I find what you try below works if you specify useInternalNodes = TRUE in your invocation of xmlTreeParse. Often that is the solution for many issues with xml. Also, I have found it best to write a relatively generic getter style function. So, in the example below I have written a function called getPortAttr - it will get attributes for the child node you name. I used your example as the defaults: "service" is the child to query and "name" is the attribute to retrieve from that child. It's a heck of a lot easier to write a function than building the longish parse strings with lots of [[this]][[and]][[that]] stuff, and it is reusable to boot.

Cheers, Ben

library(XML)
mydoc <- '<host starttime="1365204834" endtime="1365205860">
  <status state="up" reason="echo-reply" reason_ttl="127"/>
  <address addr="XXX.XXX.XXX.XXX" addrtype="ipv4"/>
  <ports>
    <port protocol="tcp" portid="135">
      <state state="open" reason="syn-ack" reason_ttl="127"/>
      <service name="msrpc" product="Microsoft Windows RPC" ostype="Windows" method="probed" conf="10">
        <cpe>cpe:/o:microsoft:windows</cpe>
      </service>
    </port>
    <port protocol="tcp" portid="139">
      <state state="open" reason="syn-ack" reason_ttl="127"/>
      <service name="netbios-ssn" method="probed" conf="10"/>
    </port>
  </ports>
  <times srtt="647" rttvar="71" to="10"/>
</host>'

mytree <- xmlTreeParse(mydoc, useInternalNodes = TRUE)
myroot <- xmlRoot(mytree)
myports <- myroot[["ports"]]["port"]

getPortAttr <- function(x, child = "service", attr = "name") {
  kid <- x[[child]]
  att <- xmlAttrs(kid)[[attr]]
  att
}

portNames <- sapply(myports, getPortAttr)
# portNames
#        port          port
#     "msrpc" "netbios-ssn"
portReason <- sapply(myports, getPortAttr, child = "state", attr = "reason")
# portReason
#      port      port
# "syn-ack" "syn-ack"

Say I have a document (produced by nmap) like the mydoc above. I want to store this as a list of lists, so I do:

mytree <- xmlTreeParse(mydoc)
myroot <- xmlRoot(mytree)
mylist <- xmlToList(myroot)

Now my problem is that when I want to fetch the attributes of the services running on each port, the behavior is not consistent:

mylist[["ports"]][[1]][["service"]]$.attrs["name"]
   name
"msrpc"
mylist[["ports"]][[2]][["service"]]$.attrs["name"]
Error in trash_list[["ports"]][[2]][["service"]]$.attrs :
  $ operator is invalid for atomic vectors

I understand that the way they are defined in the document is not the same, but I think there still should be a consistent behavior. I've tried many combinations of parameters for xmlTreeParse() but nothing has helped me. I can't find a way to call up the name of the service consistently regardless of whether the node has children or not. Any tips?

All the best, S.G.

-- --- http://barabasilab.neu.edu/people/gil/ --
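One more option worth noting alongside Ben's getter function (assumes the XML package, with `mydoc` as defined in the thread): once the document is parsed to internal nodes, XPath gives uniform access to attributes whether or not a node has children, which sidesteps the `$.attrs` inconsistency entirely.

```r
# XPath-based attribute lookup: same call works for nodes with and
# without children.
library(XML)

doc <- xmlParse(mydoc)  # parses to internal nodes
xpathSApply(doc, "//service", xmlGetAttr, "name")
# [1] "msrpc"       "netbios-ssn"
```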
Re: [R] plot 2 y axis
On 04/17/2013 08:35 AM, Ye Lin wrote:

Hi, I want to plot two variables on the same graph but with two y axes, just like what you can do in Excel. I searched online and it seems you cannot achieve that in ggplot. So is there any way I can do it nicely in base plot? Suppose my data looks like this:

Weight Height Date
0.1    0.3    1
0.2    0.4    2
0.3    0.8    3
0.6    1      4

I want to have Date as the x axis, Weight as the left y axis and Height as the right y axis.

Hi Ye Lin, Try this (yldat is your data above as a data frame):

library(plotrix)
twoord.plot(yldat$Date, yldat$Height, yldat$Weight,
            lylim = c(0, 1.04), rylim = c(0, 0.61),
            xtickpos = 1:4, xticklab = 1:4)

That should get you started.

Jim

__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
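For completeness, the same plot can be done in base graphics with no packages, by overlaying a second plot with par(new = TRUE) and drawing the second scale with axis(4):

```r
# Two y axes in base graphics: overlay a second plot, right-hand axis.
d <- data.frame(Date   = 1:4,
                Weight = c(0.1, 0.2, 0.3, 0.6),
                Height = c(0.3, 0.4, 0.8, 1.0))

par(mar = c(5, 4, 4, 4) + 0.1)    # leave room for the right-hand axis
plot(d$Date, d$Weight, type = "b", xlab = "Date", ylab = "Weight")
par(new = TRUE)                   # overlay the second series
plot(d$Date, d$Height, type = "b", lty = 2,
     axes = FALSE, xlab = "", ylab = "")
axis(4)                           # right-hand axis for Height
mtext("Height", side = 4, line = 3)
```

The usual caveat applies: two-axis plots are easy to misread, since the relative scaling of the axes is arbitrary.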
Re: [R] Strange error with log-normal models
peter dalgaard pdalgd at gmail.com writes: On Apr 16, 2013, at 22:20 , Noah Silverman wrote: My challenge here is finding the right model for this data. Originally it was a nice count of students. Relatively easy to model with a zero inflated Poisson model. The resulting residuals seemed reasonable. [snip] Off the cuff: Could it be more natural to model as a ZIP with log(pop) as an offset on the log-lambda scale? I agree. This was cross-posted to StackOverflow (broken URL: http://stackoverflow.com/questions/16046726/ regression-for-a-rate-variable-in-r ), where I made that suggestion. I don't know that cross-posting to r-help lists and StackOverflow is anywhere expressly forbidden (cross-posting *among* r lists is ruled out in the Posting Guide), but I'd prefer people didn't (because of this kind of wasted/duplicated effort). __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Q-Q Plot for comparing two unequal data sets
Hello All, Would anyone be able to help me understand how R computes a quantile-quantile plot for comparing two data samples with unequal sample sizes? Normally, the procedure should be to rearrange the larger data sample into n equally-spaced parts using interpolation, where n is the sample size of the smaller sample, and then plot the matching data pairs. I tried using different plotting position formulas for the interpolation but cannot reproduce what R is plotting. Thanks in advance. Regards Janh
[R] failed to download vegan
Hello, This is Elaine. I am using R 3.0 to download package vegan but failed. The warning message is: package vegan successfully unpacked and MD5 sums checked Warning: unable to move temporary installation C:\Users\elaine\Documents\R\win-library\3.0\file16c82da53b1b\vegan to C:\Users\elaine\Documents\R\win-library\3.0\vegan I cannot find the folder \file16c82da53b1b\ below C:\Users\elaine\Documents\R\win-library\3.0 Please kindly help how to download or move the vegan to the folder it should be in. Thank you very much Elaine
Re: [R] Q-Q Plot for comparing two unequal data sets
On Apr 16, 2013, at 20:12, Janh Anni annij...@gmail.com wrote: Hello All, Would anyone be able to help me understand how R computes a quantile-quantile plot for comparing two data samples with unequal sample sizes? Normally, the procedure should be to rearrange the larger data sample into n equally-spaced parts using interpolation, where n is the sample size of the smaller sample, and then plot the matching data pairs. I tried using different plotting position formulas for the interpolation but cannot reproduce what R is plotting. Thanks in advance. If you type qqplot at the prompt you'll be given the code and can review it for yourself. It's also available online for your viewing pleasure. http://svn.r-project.org/R/trunk/src/library/stats/R/qqplot.R It seems the key is the approx (linear interpolation) function, but you can work out the details. Michael Regards Janh
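For reference, the core of what qqplot() does with unequal sample sizes can be sketched in a few lines (a simplification of the qqplot.R source linked above):

```r
# Simplified sketch of qqplot()'s handling of unequal sample sizes
set.seed(1)
x <- rnorm(100)  # larger sample
y <- rnorm(25)   # smaller sample
sx <- sort(x)
sy <- sort(y)
# The larger sorted sample is linearly interpolated down to the
# length of the smaller one before plotting:
sx <- approx(seq_along(sx), sx, n = length(sy))$y
plot(sx, sy)     # essentially the point set qqplot(x, y) draws
```

So the "plotting positions" are simply the indices 1..n of the larger sorted sample, interpolated onto n equally spaced points with approx().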
Re: [R] failed to download vegan
Hello All, I manually moved vegan.zip to C:\Users\elaine\Documents\R\win-library\3.0\vegan and then unzipped the file there. After that, require(vegan) worked. Elaine On Wed, Apr 17, 2013 at 8:56 AM, Elaine Kuo elaine.kuo...@gmail.com wrote: Hello, This is Elaine. I am using R 3.0 to download package vegan but failed. The warning message is: package vegan successfully unpacked and MD5 sums checked Warning: unable to move temporary installation C:\Users\elaine\Documents\R\win-library\3.0\file16c82da53b1b\vegan to C:\Users\elaine\Documents\R\win-library\3.0\vegan I cannot find the folder \file16c82da53b1b\ below C:\Users\elaine\Documents\R\win-library\3.0 Please kindly help how to download or move the vegan to the folder it should be in. Thank you very much Elaine
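For anyone hitting the same "unable to move temporary installation" warning on Windows (often caused by antivirus software locking the temporary folder), a sketch of retrying the install with an explicit library path is below; the path is Elaine's from this thread and would need adjusting:

```r
# Sketch: point install.packages() at the personal library explicitly
# (close other R sessions first; antivirus locks are a common cause)
libdir <- "C:/Users/elaine/Documents/R/win-library/3.0"
install.packages("vegan", lib = libdir)
library(vegan, lib.loc = libdir)
```

Manually unzipping the binary package, as Elaine did, works too, but a clean install.packages() run also registers the package metadata properly.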
Re: [R] Problem with handling of attributes in xmlToList in XML package
Hi, On Apr 16, 2013, at 6:39 PM, santiago gil wrote: Thank you very much, Ben. Indeed that's how I've been doing it so far, but I have accrued too many reasons not to work with the XML object any more and move all my coding to a list formulation. I wonder what you mean with [...] but I find what you try below works if you specify useInternalNodes = TRUE in your invocation of xmlTreeParse. Actually, the output error that I included happens when I use useInternalNodes=TRUE (my bad). My bad right back at you. It doesn't work here now (and didn't before, I guess). I can't explain why xmlToList splits the two nodes so differently. That's another good reason for me to shy away from it. If I use useInternalNodes=FALSE I get

mylist[["ports"]][[2]][["service"]]$.attrs["name"]
NULL

The useInternalNodes clause has proven fatally dangerous for me before. If I parse a tree with useInternalNodes=TRUE, save the workspace, close R and reopen it, load the workspace and try to read the tree, it will completely crash my computer, which has already cost me too many lost days of work. On the other hand, useInternalNodes=FALSE will result in any xml operation being ridiculously slow. So the intention was to move everything to a more R-friendly object like a list. My experience with the XML package seems to be quite different from yours regarding useInternalNodes = TRUE/FALSE. I get satisfactory and stable performance with useInternalNodes = TRUE, so your experience is very puzzling to me. I never save workspaces - heck, I'm not sure what XML does with the external pointers in that case. Can you save an address and expect to get the same address later? Instead I save the xml-formed data using saveXML, which dumps to a nicely formed text file. I guess I'm not much help! You might want to contact the maintainer of XML with a small example, such as the one you posted. He has been very responsive and helpful to me in the past.
Cheers, Ben Best, Santiago 2013/4/16 Ben Tupper btup...@bigelow.org: Hi, On Apr 16, 2013, at 2:49 PM, santiago gil wrote: 2013/4/14 santiago gil sg.c...@gmail.com: Hello all, I have a problem with the way attributes are dealt with in the function xmlToList(), and I haven't been able to figure it out for days now. I have not used xmlToList(), but I find what you try below works if you specify useInternalNodes = TRUE in your invocation of xmlTreeParse. Often that is the solution for many issues with xml. Also, I have found it best to write a relatively generic getter-style function. So, in the example below I have written a function called getPortAttr - it will get attributes for the child node you name. I used your example as the defaults: "service" is the child to query and "name" is the attribute to retrieve from that child. It's a heck of a lot easier to write a function than building the longish parse strings with lots of [[this]][[and]][[that]] stuff, and it is reusable to boot. Cheers, Ben

library(XML)
mydoc <- '<host starttime="1365204834" endtime="1365205860">
<status state="up" reason="echo-reply" reason_ttl="127"/>
<address addr="XXX.XXX.XXX.XXX" addrtype="ipv4"/>
<ports>
<port protocol="tcp" portid="135">
<state state="open" reason="syn-ack" reason_ttl="127"/>
<service name="msrpc" product="Microsoft Windows RPC" ostype="Windows" method="probed" conf="10"><cpe>cpe:/o:microsoft:windows</cpe>
</service>
</port>
<port protocol="tcp" portid="139">
<state state="open" reason="syn-ack" reason_ttl="127"/>
<service name="netbios-ssn" method="probed" conf="10"/>
</port>
</ports>
<times srtt="647" rttvar="71" to="10"/>
</host>'

mytree <- xmlTreeParse(mydoc, useInternalNodes = TRUE)
myroot <- xmlRoot(mytree)
myports <- myroot[["ports"]]["port"]
getPortAttr <- function(x, child = "service", attr = "name") {
  kid <- x[[child]]
  att <- xmlAttrs(kid)[[attr]]
  att
}
portNames <- sapply(myports, getPortAttr)
# portNames
#        port          port
#     "msrpc" "netbios-ssn"
portReason <- sapply(myports, getPortAttr, child = "state", attr = "reason")
# portReason
#      port      port
# "syn-ack" "syn-ack"

Say I have a document
(produced by nmap) like this:

mydoc <- '<host starttime="1365204834" endtime="1365205860">
<status state="up" reason="echo-reply" reason_ttl="127"/>
<address addr="XXX.XXX.XXX.XXX" addrtype="ipv4"/>
<ports>
<port protocol="tcp" portid="135">
<state state="open" reason="syn-ack" reason_ttl="127"/>
<service name="msrpc" product="Microsoft Windows RPC" ostype="Windows" method="probed" conf="10"><cpe>cpe:/o:microsoft:windows</cpe></service>
</port>
<port protocol="tcp" portid="139">
<state state="open" reason="syn-ack" reason_ttl="127"/>
<service name="netbios-ssn" method="probed" conf="10"/>
</port>
</ports>
<times srtt="647" rttvar="71" to="10"/>
</host>'

I want to store this as a list of lists, so I do:

mytree <- xmlTreeParse(mydoc)
myroot <- xmlRoot(mytree)
mylist <- xmlToList(myroot)

Now my problem is that when I want to fetch the attributes of the services running on each port, the behavior is not
Re: [R] Q-Q Plot for comparing two unequal data sets
Hello Michael, Thanks for that information. Regards Janh On Tue, Apr 16, 2013 at 9:13 PM, Michael Weylandt michael.weyla...@gmail.com wrote: On Apr 16, 2013, at 20:12, Janh Anni annij...@gmail.com wrote: Hello All, Would anyone be able to help me understand how R computes a quantile-quantile plot for comparing two data samples with unequal sample sizes? Normally, the procedure should be to rearrange the larger data sample into n equally-spaced parts using interpolation, where n is the sample size of the smaller sample, and then plot the matching data pairs. I tried using different plotting position formulas for the interpolation but cannot reproduce what R is plotting. Thanks in advance. If you type qqplot at the prompt you'll be given the code and can review it for yourself. It's also available online for your viewing pleasure. http://svn.r-project.org/R/trunk/src/library/stats/R/qqplot.R It seems the key is the approx (linear interpolation) function, but you can work out the details. Michael Regards Janh
Re: [R] R question
Hi Philippos, Try this:

dat1 <- read.csv("Validation_data_set3.csv", sep = ",", stringsAsFactors = FALSE)  # converted to csv
str(dat1)
#'data.frame': 12573 obs. of 17 variables:
# $ Removed.AGC : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST : chr "46.1658" "41.2566" "14.0931" ...
# $ Removed.Kurtosis : num NA NA NA NA 5.38 ...
# $ Removed.Skewness : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.QC17999 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.QC16200 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC : chr "46.1658" "41.2566" "14.0931" ...
# $ Removed.Kurtosis.Skewness : num NA NA NA NA 5.38 ...
# $ Removed.AGC.QC16200 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999.3.stdevs : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999.less.than.1 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC.QC17999 : chr "46.1658" "41.2566" "14.0931" ...
# $ Removed.SST.AGC.QC16200 : chr "46.1658" "41.2566" "14.0931" ...
# $ Removed.SST.AGC.Kurtosis.Skewness : chr ...
# $ Removed.SST.AGC.Kurtosis.Skewness.QC17999: chr ...
# $ Removed.SST.AGC.Kurtosis.Skewness.QC16200: chr ...

# Found these characters in columns that are not numeric
do.call(rbind, lapply(dat1, function(x) {x1 <- x[is.character(x)]; x1[grepl("\\#", x1)]}))
#                                           [,1]      [,2]      [,3]
# Removed.SST                               #DIV/0!   #DIV/0!   #DIV/0!
# Removed.SST.AGC                           #DIV/0!   #DIV/0!   #DIV/0!
# Removed.SST.AGC.QC17999                   #DIV/0!   #DIV/0!   #DIV/0!
# Removed.SST.AGC.QC16200                   #DIV/0!   #DIV/0!   #DIV/0!
# Removed.SST.AGC.Kurtosis.Skewness         #DIV/0!   #DIV/0!   #DIV/0!
# Removed.SST.AGC.Kurtosis.Skewness.QC17999 #DIV/0!   #DIV/0!   #DIV/0!
# Removed.SST.AGC.Kurtosis.Skewness.QC16200 #DIV/0!   #DIV/0!   #DIV/0!
#                                           [,4]
# Removed.SST                               #DIV/0!
# Removed.SST.AGC                           #DIV/0!
# Removed.SST.AGC.QC17999                   #DIV/0!
# Removed.SST.AGC.QC16200                   #DIV/0!
# Removed.SST.AGC.Kurtosis.Skewness         #DIV/0!
# Removed.SST.AGC.Kurtosis.Skewness.QC17999 #DIV/0!
# Removed.SST.AGC.Kurtosis.Skewness.QC16200 #DIV/0!
dat2 <- as.data.frame(sapply(dat1, function(x) {
  x[is.character(x)][grep("\\#", x[is.character(x)])] <- NA
  x1 <- as.numeric(x)
}))
str(dat2)
#'data.frame': 12573 obs. of 17 variables:
# $ Removed.AGC : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST : num NA 46.17 41.26 14.09 5.38 ...
# $ Removed.Kurtosis : num NA NA NA NA 5.38 ...
# $ Removed.Skewness : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.QC17999 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.QC16200 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC : num NA 46.17 41.26 14.09 5.38 ...
# $ Removed.Kurtosis.Skewness : num NA NA NA NA 5.38 ...
# $ Removed.AGC.QC16200 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999.3.stdevs : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.AGC.QC17999.less.than.1 : num 65.67 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC.QC17999 : num NA 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC.QC16200 : num NA 46.17 41.26 14.09 5.38 ...
# $ Removed.SST.AGC.Kurtosis.Skewness : num NA NA NA NA 5.38 ...
# $ Removed.SST.AGC.Kurtosis.Skewness.QC17999: num NA NA NA NA 5.38 ...
# $ Removed.SST.AGC.Kurtosis.Skewness.QC16200: num NA NA NA NA 5.38 ...
head(dat2, 3)
#   Removed.AGC Removed.SST Removed.Kurtosis Removed.Skewness Removed.QC17999
# 1     65.6738          NA               NA          65.6738         65.6738
# 2     46.1658     46.1658               NA          46.1658         46.1658
# 3     41.2566     41.2566               NA          41.2566         41.2566
#   Removed.QC16200 Removed.SST.AGC Removed.Kurtosis.Skewness Removed.AGC.QC16200
# 1         65.6738              NA                        NA             65.6738
# 2         46.1658         46.1658                        NA             46.1658
# 3         41.2566         41.2566                        NA             41.2566
#   Removed.AGC.QC17999 Removed.AGC.QC17999.3.stdevs
# 1             65.6738                      65.6738
# 2             46.1658                      46.1658
# 3             41.2566
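An alternative to this post-hoc replacement is to tell read.csv() up front which strings mean missing. A sketch, assuming the same file as above and that "#DIV/0!" is the only offending Excel artifact:

```r
# Sketch: map Excel's "#DIV/0!" to NA at read time so the affected
# columns are parsed as numeric directly, skipping the cleanup step
dat2 <- read.csv("Validation_data_set3.csv",
                 na.strings = c("NA", "#DIV/0!"),
                 stringsAsFactors = FALSE)
str(dat2)  # the formerly chr columns should now come in as num with NAs
```

If other Excel error strings (e.g. "#N/A", "#VALUE!") are present, they would need to be added to na.strings as well.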
[R] Unsubscribe please
Verstuurd vanaf mijn iPad Bert Verleysen 00 32 (0)477 874 272 www.beverconsult.be
[R] Change the default resolution for plotting figures?
Hi, I want to save a plot in the windows device as png, and the default resolution is 72 dpi. Is it possible to increase the default resolution to, for example, 300 dpi? I have thought of using png(..., res=300), but the problem is that the figure produced this way looks different from the one shown in the windows device. One notable difference is that some ticks on the x axis are missing. Therefore I would rather produce the figure in a window device and then save it as a png. Unfortunately the device window has no option to change the resolution. Little information can be found so far. Any ideas are appreciated! Best, Jing
Re: [R] normalizePath
Maybe something is wrong with your R_LIBS (it should be R_LIBS=dir/R-3.0.0/lib64/).
Re: [R] Change the default resolution for plotting figures?
I have been using the following so far without having any problems:

dev.copy(png, "sample.png", width = 8, height = 10, units = "in", res = 500)
dev.off()

On Tue, Apr 16, 2013 at 6:32 PM, jt...@mappi.helsinki.fi wrote: Hi, I want to save a plot in the windows device as png and the default resolution is 72 dpi. Is it possible to increase the default resolution to for example 300 dpi? I have thought of using function png(..., res=300), but the problem is that the figure produced this way looks different than the one shown in the windows device. One notable difference is the missing of some ticks in the x axis. Therefore I would rather produce the figure in a window device and then save it as a png. Unfortunately in the device window there is no such option to change the resolution. Little information can be found so far. Any ideas are appreciated! Best, Jing
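On the original question of ticks disappearing at res=300: png() interprets width and height in pixels by default, so raising res shrinks the figure's physical size and R drops ticks to fit. A sketch that fixes the physical size in inches while raising only the resolution (filename is illustrative):

```r
# Sketch: keep the figure the same physical size as a typical 7x7 inch
# screen device, so tick placement should match the on-screen plot,
# and raise only the pixel density
png("myplot.png", width = 7, height = 7, units = "in", res = 300)
plot(1:10)
dev.off()
```

This writes the plot directly to file; dev.copy() as above is the route if you want to snapshot an existing screen device instead.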
Re: [R] Unsubscribe please
Hi, Do it yourself: https://stat.ethz.ch/mailman/listinfo/r-help Hint: bottom of the page (To unsubscribe from R-help). Regards, Pascal On 04/17/2013 06:33 AM, Bert Verleysen (beverconsult) wrote: Verstuurd vanaf mijn iPad Bert Verleysen 00 32 (0)477 874 272 www.beverconsult.be
Re: [R] Merge
Hi Farnoosh, You can use either ?merge() or ?join():

DataA <- read.table(text = "ID v1
1 10
2 1
3 22
4 15
5 3
6 6
7 8", sep = "", header = TRUE)

DataB <- read.table(text = "ID v2
2 yes
5 no
7 yes", sep = "", header = TRUE, stringsAsFactors = FALSE)

merge(DataA, DataB, by = "ID", all.x = TRUE)
#   ID v1   v2
# 1  1 10 <NA>
# 2  2  1  yes
# 3  3 22 <NA>
# 4  4 15 <NA>
# 5  5  3   no
# 6  6  6 <NA>
# 7  7  8  yes

library(plyr)
join(DataA, DataB, by = "ID", type = "left")
#   ID v1   v2
# 1  1 10 <NA>
# 2  2  1  yes
# 3  3 22 <NA>
# 4  4 15 <NA>
# 5  5  3   no
# 6  6  6 <NA>
# 7  7  8  yes

A.K. From: farnoosh sheikhi farnoosh...@yahoo.com To: smartpink...@yahoo.com smartpink...@yahoo.com Sent: Wednesday, April 17, 2013 12:52 AM Subject: Merge Hi Arun, I want to merge a data set with another data frame with 2 columns and keep the sample size of DataA.

DataA      DataB      DataCombine
ID v1      ID v2      ID v1 v2
1  10      2  yes     1  10 NA
2  1       5  no      2  1  yes
3  22      7  yes     3  22 NA
4  15                 4  15 NA
5  3                  5  3  no
6  6                  6  6  NA
7  8                  7  8  yes

Thanks a lot for your help and time.
[R] Transformation of a variable in a dataframe
Hi, I have a dataframe with two variables, A and B. I transform the two variables and name them C and D, and save them in a dataframe dfcd. However, I wonder why I can't call them by dfcd$C and dfcd$D? Thanks, Miao

A <- c(1,2,3)
B <- c(4,6,7)
dfab <- data.frame(A, B)
C <- dfab["A"]*2
D <- dfab["B"]*3
dfcd <- data.frame(C, D)
dfcd
  A  B
1 2 12
2 4 18
3 6 21
dfcd$C
NULL
dfcd$A
[1] 2 4 6
Re: [R] Transformation of a variable in a dataframe
Hi, Because a column name already exists for C and D:

colnames(C)
[1] "A"
colnames(D)
[1] "B"

One possibility:

A <- c(1,2,3)
B <- c(4,6,7)
dfab <- data.frame(A, B)
C <- dfab$A*2
D <- dfab$B*3
dfcd <- data.frame(C, D)
dfcd
  C  D
1 2 12
2 4 18
3 6 21
dfcd$C
[1] 2 4 6

HTH, Pascal On 04/17/2013 02:33 PM, jpm miao wrote: Hi, I have a dataframe with two variables A, B. I transform the two variables and name them as C, D and save them in a dataframe dfcd. However, I wonder why can't I call them by dfcd$C and dfcd$D? Thanks, Miao A=c(1,2,3) B=c(4,6,7) dfab<-data.frame(A,B) C=dfab["A"]*2 D=dfab["B"]*3 dfcd<-data.frame(C,D)
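A variant that keeps the single-bracket extraction from the original post but renames the columns afterwards (one more way to get the same result) might look like:

```r
A <- c(1, 2, 3)
B <- c(4, 6, 7)
dfab <- data.frame(A, B)
# Single-bracket extraction returns one-column data frames that keep
# the old names ("A", "B"), so rename the assembled frame explicitly:
dfcd <- setNames(data.frame(dfab["A"] * 2, dfab["B"] * 3), c("C", "D"))
dfcd$C  # [1] 2 4 6
```

The underlying point is that dfab["A"] is a data frame carrying its column name along, while dfab$A is a plain vector with no name, which is why Pascal's $-based version names the columns C and D as intended.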