Re: [R-es] desviacion estandard

2015-11-08 Thread Rubén Fernández-Casal
The standard deviation does not depend on the scale. If you include values
that are repeated or that have little variability, that is to be expected,
even if they sit at one of the extremes...

Regards, Rubén.
On 7/11/2015 9:43, "Albert Montolio" wrote:

> Hi all,
>
> I have a theoretical question. I have the evolution of a variable as a
> function of time. There are 145 values. The first 13 are 0, and the rest
> are increasing. I compute the standard deviation in R, both with all 145
> samples (including the zeros) and with the 132 samples (excluding the zeros).
>
> I get a larger standard deviation when the zeros are excluded. How can
> that be? It makes no sense to me.
>
> I attach the calculations in Excel. In principle, if I remove the minimum
> of the series, the data should be more tightly packed, shouldn't they?
>
> --
>
>
> *Albert Montolio Aguado*
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>



Re: [R] Help scatterplot3d

2015-11-08 Thread Jim Lemon
Hi Julian,
As I don't have access to "datos", I had to make it up. The following does
what I expected.

library(scatterplot3d)
#datos<-read.csv("C:\\prueba.csv",sep=",",header=TRUE)
#str(datos)
datos<-data.frame(Bx=runif(40),e=runif(40),t=runif(40))
scatterplot3d(datos)
s3d<- scatterplot3d(datos, type = "h", color = "blue", angle = 55,
 scale.y = 0.7, pch = 16, main = "title")
my.lm <- lm(datos$Bx ~ datos$e + datos$t)
s3d$plane3d(my.lm)

Jim


On Sat, Nov 7, 2015 at 11:18 AM, John Kane  wrote:

>
> Please do not post in HTML. Your post is gibberish.
>
> Also please have a look at
>
> http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> and/or http://adv-r.had.co.nz/Reproducibility.html  for some suggestions
> on asking questions in R-help.
> John Kane
> Kingston ON Canada
>
>
> > -----Original Message-----
> > From: r-help@r-project.org
> > Sent: Fri, 6 Nov 2015 14:20:22 +0000 (UTC)
> > To: r-help@r-project.org
> > Subject: [R] Help scatterplot3d
> >
> > Hi, I'm running this script:
> >
> library(scatterplot3d)datos<-read.csv("C:\\prueba.csv",sep=",",header=TRUE)str(datos)scatterplot3d(datos)
> > s3d<- scatterplot3d(datos, type = "h", color = "blue", angle = 55,
> > scale.y = 0.7, pch = 16, main = "title”)
> > my.lm <- lm(datos$Bx ~ datos$e + datos$t) s3d$plane3d(my.lm)
> > I need to plot the experimental data ("datos") and the regression plane
> > given by "my.lm" in the same figure. The script plots "datos" but it
> > doesn't add the plot of the regression plane. Sometimes I get a message
> > like "s3d object not found".
> > Thanks a lot.
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

Re: [R-es] datos dependientes o independientes

2015-11-08 Thread Carlos J. Gil Bellosta
Hi, how are you?

No, they are not independent and, strictly speaking, you could not use
Student's t-test. That said, I have never seen anyone get fired for using
it when its underlying assumptions do not hold.

I once saw a short article that dealt with exactly your problem, but I
have not been able to find it again. You can try things like

http://www.lancs.ac.uk/~killick/Pub/KillickEckley2011.pdf

which are somewhat more general.
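
To make the dependence point concrete, here is a minimal sketch on simulated
data (the real series is not available here, so every number below is made
up, and the AR(1) structure is only an assumption). It runs the naive
two-sample t-test and then checks the lag-1 autocorrelation of the residuals
around each period mean; a clearly positive value indicates that the months
are not independent draws.

```r
set.seed(42)

# Simulated monthly series, Jan 1996 to Dec 2008 (156 months), with AR(1)
# noise and a higher level from 2001 onwards. All values are hypothetical.
n1 <- 60   # months in 1996-2000
n2 <- 96   # months in 2001-2008
noise <- as.numeric(arima.sim(model = list(ar = 0.7), n = n1 + n2))
level <- c(rep(10, n1), rep(12, n2))
y <- level + noise
period <- factor(rep(c("1996-2000", "2001-2008"), times = c(n1, n2)))

# Naive comparison that treats the months as independent observations
naive <- t.test(y ~ period)

# Residuals around each period mean; strong lag-1 autocorrelation signals
# that the independence assumption behind the t-test is violated
res <- y - ave(y, period)
lag1 <- acf(res, plot = FALSE)$acf[2]
lag1
```

With autocorrelated data the p-value in `naive` tends to be too optimistic,
which is why methods that model the serial dependence are preferable.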

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

On 8 November 2015 at 12:22, Albert Montolio wrote:
> Hi all,
>
> I have a time series of the volume of internet business in Spain from
> January 1996 to December 2008. I want to know whether the mean of the
> period 1996-2000 and the mean of the period 2001-2008 are equal. To do so
> I want to run a hypothesis test in R.
>
> My question is: are the data dependent or independent? I think they are
> independent, since the months in the first and second periods are
> different. Am I right?
>
> If so, which analysis should I run in R? One-way ANOVA, multi-factor
> ANOVA? A paired t-test, I don't think so...
>
> Many thanks.
>
>
>
> --
>
>
> *Albert Montolio Aguado*
>


[R] LDA select number of topics

2015-11-08 Thread srecko joksimovic
Hi all,

I recently came across this great post by Nikita Murzintcev:
http://rpubs.com/nikita-moor/107657. If I understood it correctly, according
to Griffiths (2004) I should select 11 topics? However, the other metrics
seem to suggest quite different numbers of topics.

Eleven topics is about the right number, and it works better in my case, but
how do I know which metric to rely on? That is, if I want to report this in
a paper, can I simply say that I relied on Griffiths (2004), without
explaining why I did not use Arun (2010), for example?

Thanks,



Re: [R] Prefix

2015-11-08 Thread David Winsemius

> On Nov 8, 2015, at 4:05 PM, Val  wrote:
> 
> HI all,
> 
> DF <- read.table(textConnection(" X1    X2    X3    TIME
> Alex1 0     0     1960
> Alexa 0     0     1920
> Abbot 0     0     0
> Smith Alex1 Alexa 2012
> Carla Alex1 0     1996
> Jacky Smith Abbot 2013
> Jack  0     Jacky 2014
> Almo  Jack  Carla 2015   "),header = TRUE)

I would suggest using stringsAsFactors=FALSE
> 
> 
> I want to add the time  variable as prefix to the first column  (X1)
> and I did it as follow,
> 
> DF$X4 <- as.character(paste(DF$TIME,DF$X1 ,sep="_"))
> DF
> 
> All names in column two (X2) and three (X3) are in column one. So I just
> want to bring that prefix to columns two and three as well, but I could
> not do that.
> 
> Here is the final output that I would like to have.
> 
> X1          X2          X3
> 1960_Alex1  0           0
> 1920_Alexa  0           0
> 0_Abbot     0           0
> 2012_Smith  1960_Alex1  1920_Alexa
> 1996_Carla  1960_Alex1  0
> 2013_Jacky  2012_Smith  0_Abbot
> 2014_Jack   0           2013_Jacky
> 2015_Almo   2014_Jack   1996_Carla

If you follow my suggestion above, then these two lines produce vectors that 
may be of some use:

> paste(DF$TIME[match(DF$X2,DF$X1)], DF$X2, sep="_")
[1] "NA_0"       "NA_0"       "NA_0"       "1960_Alex1" "1960_Alex1"
[6] "2012_Smith" "NA_0"       "2014_Jack" 
> paste(DF$TIME[match(DF$X3,DF$X1)], DF$X3, sep="_")
[1] "NA_0"       "NA_0"       "NA_0"       "1920_Alexa" "NA_0"      
[6] "0_Abbot"    "2013_Jacky" "1996_Carla"
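
One possible way to get from those vectors to the requested table is sketched
below. It assumes the data are read with stringsAsFactors = FALSE as
suggested above; add_prefix() is a hypothetical helper introduced only for
this illustration, and entries with no match in X1 (the "0" placeholders)
are left unchanged instead of becoming "NA_0".

```r
DF <- read.table(text = "X1 X2 X3 TIME
Alex1 0 0 1960
Alexa 0 0 1920
Abbot 0 0 0
Smith Alex1 Alexa 2012
Carla Alex1 0 1996
Jacky Smith Abbot 2013
Jack 0 Jacky 2014
Almo Jack Carla 2015", header = TRUE, stringsAsFactors = FALSE)

# Hypothetical helper: look each name up in `key` and prefix it with the
# matching `time`; names without a match are returned as they are.
add_prefix <- function(x, key, time) {
  idx <- match(x, key)
  ifelse(is.na(idx), x, paste(time[idx], x, sep = "_"))
}

out <- data.frame(
  X1 = add_prefix(DF$X1, DF$X1, DF$TIME),
  X2 = add_prefix(DF$X2, DF$X1, DF$TIME),
  X3 = add_prefix(DF$X3, DF$X1, DF$TIME),
  stringsAsFactors = FALSE
)
out
```

Keeping the lookup in one small function means the same rule is applied to
all three columns, so X1 itself is just the special case where every name
matches.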


> 
> 
> Your help is appreciated in advance
> 

David Winsemius
Alameda, CA, USA


Re: [R] Alternatives for explicit for() loops

2015-11-08 Thread jim holtman
You need to take a close look at the function incomb that you are
creating.  I see what appears to be a constant value ("*(
gamma((1/beta)+1))*((alpha)^(-(1/beta)))") being computed that you might
only have to compute once, before the function.  You are also referencing
many variables (m, LED, j, ...) that are not passed in but live in the
global environment; it is probably better to pass them in to the
function.  I am not sure what 'pbapply' is doing for you, since I see it
is new relative to the code that you first sent out.

It would be good if you told us what the function is trying to do; you are
showing us how you want to do it, not what you want to do.  Are there other
ways of doing it?  If speed is your problem, then consider the "Rcpp"
package and write the function in C++, which might be faster, but again,
take a look at what you are doing to see if there are other ways.  I don't
have time to dig into the code, since there is a lack of comments, to
understand why you are using, e.g., 'choose', 'prod', etc.  There are
probably a lot of ways of speeding up the code, if you could tell us what
you want to accomplish.
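
As an illustration of the first two points only, here is a small sketch. It
is not a rewrite of the posted code: term_recompute() and term_hoisted() are
hypothetical names, and the alternating sign is just a stand-in for the part
of the expression that really changes per iteration.

```r
alpha <- 0.2
beta  <- 2

# Version 1: the constant factor is recomputed on every call.
term_recompute <- function(k, alpha, beta) {
  (-1)^k * gamma(1 / beta + 1) * alpha^(-1 / beta)
}

# Version 2: the factor is computed once, outside the function, and passed
# in as an explicit argument instead of being looked up elsewhere.
const <- gamma(1 / beta + 1) * alpha^(-1 / beta)
term_hoisted <- function(k, const) {
  (-1)^k * const
}

k  <- 0:20
v1 <- vapply(k, term_recompute, numeric(1), alpha = alpha, beta = beta)
v2 <- term_hoisted(k, const)   # vectorised over k, so no loop is needed
all.equal(v1, v2)
```

Passing everything through arguments also makes it possible to define the
function once, outside any loop, and to test it in isolation.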


Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.


[R] Prefix

2015-11-08 Thread Val
HI all,

DF <- read.table(textConnection(" X1    X2    X3    TIME
Alex1 0     0     1960
Alexa 0     0     1920
Abbot 0     0     0
Smith Alex1 Alexa 2012
Carla Alex1 0     1996
Jacky Smith Abbot 2013
Jack  0     Jacky 2014
Almo  Jack  Carla 2015   "),header = TRUE)


I want to add the time  variable as prefix to the first column  (X1)
and I did it as follow,

DF$X4 <- as.character(paste(DF$TIME,DF$X1 ,sep="_"))
DF

All names in column two (X2) and three (X3) are in column one. So I just
want to bring that prefix to columns two and three as well, but I could
not do that.

Here is the final output that I would like to have.

X1          X2          X3
1960_Alex1  0           0
1920_Alexa  0           0
0_Abbot     0           0
2012_Smith  1960_Alex1  1920_Alexa
1996_Carla  1960_Alex1  0
2013_Jacky  2012_Smith  0_Abbot
2014_Jack   0           2013_Jacky
2015_Almo   2014_Jack   1996_Carla


Your help is appreciated in advance



Re: [R] Alternatives for explicit for() loops

2015-11-08 Thread Boris Steipe
While I fully agree with Jim's comments, you may also need to understand the 
notion of time complexity in algorithm analysis. All the mentioned speed-ups 
are basically linear, in the sense that they accelerate a single step of your 
algorithm. However if your algorithm has combinatorial complexity, any amount 
of linear speed-up may only increase the tractable problem size by a trivial 
amount. I had proposed earlier that you write a small set of test examples and 
properly analyze the complexity - how does run time scale with your problem 
size. That gives you a rational basis to analyze 
 - how the improvements in your code affect the run time
 - how the improvements in your code affect the tractable problem size.
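
A rough sketch of such a scaling analysis, before timing anything: it only
counts the rows of the matrix built by the f1() function posted in this
thread, which is choose(n - 1, m - 1). The intermediate (n, m) pairs are
arbitrary test sizes chosen for illustration, and the row count is used
here as a proxy for work; each row is expanded further by expand.grid(), so
the real cost grows faster still.

```r
# Number of rows of the matrix returned by f1(n, m) in the posted code
n_rows <- function(n, m) choose(n - 1, m - 1)

cases <- data.frame(n = c(8, 12, 16, 20, 25),
                    m = c(4,  6,  9, 12, 15))
cases$rows <- n_rows(cases$n, cases$m)
cases$ratio_to_smallest <- cases$rows / cases$rows[1]
cases
```

Going from n=8, m=4 to n=25, m=15 takes the row count from 35 to 1961256,
so a constant-factor speed-up per row cannot make up the difference.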

If this shows you that the problem won't go away, then some thoughts about 
optimization for computationally hard problems will be in order. These may 
include substituting a heuristic for an exact algorithm, or approximating 
solutions by stochastic sampling in large spaces.


B.
PS: and for real now: format your code, add comments, don't post in HTML, 
create an MWE ...





Re: [R] ggplot2 different Y axis scales

2015-11-08 Thread David Doyle
Thank you to Dennis and Jeff,

The scales = "free_y" argument did exactly what I needed.  Just in case
someone else has the same problem, the code is below.

Take Care
David

p <- ggplot(data = SS, aes(x=Year, y=Sulfate, col=Detections)) +
  geom_point(aes(shape=Detections))  +

  ##sets the colors
  scale_colour_manual(values=c("black","red")) +

  #location of the legend
  theme(legend.position=c("none")) +

  #sets the line color, type and size
  geom_line(colour="black", linetype="dotted", size=0.5) +
  ylab("Sulfate (mg/L)") +
  ##Graph title
  ggtitle("Figure 6-30
  Sandstone Sulfate Time Series")

## draws the graph with one facet per Well ID, each with its own y-scale
p + facet_grid(Well ~ ., scales = "free_y")
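
Regarding Dennis's remark, quoted below, that the per-facet ranges cannot be
set directly: one workaround that is sometimes used is to add invisible
dummy points with geom_blank(), so that each facet's free scale is trained
on the limits you want. The sketch below uses simulated data rather than the
real SS.csv, and the `limits` data frame is a hypothetical helper.

```r
library(ggplot2)
set.seed(1)

# Simulated stand-in for the sulfate data (values are made up)
SS <- data.frame(
  Well    = rep(c("MW04", "MW06"), each = 10),
  Year    = rep(2006:2015, times = 2),
  Sulfate = c(runif(10, 4, 3000), runif(10, 13, 34))
)

# Dummy rows holding the desired lower and upper y-limit for each well;
# geom_blank() draws nothing but still takes part in scale training.
limits <- data.frame(
  Well    = rep(c("MW04", "MW06"), each = 2),
  Year    = 2006,
  Sulfate = c(0, 3000, 0, 40)
)

p <- ggplot(SS, aes(x = Year, y = Sulfate)) +
  geom_point() +
  geom_blank(data = limits) +
  facet_grid(Well ~ ., scales = "free_y")
p
```

The axes may still extend slightly beyond the requested limits because of
the default scale expansion.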


On Sat, Nov 7, 2015 at 7:48 AM, Dennis Murphy  wrote:

> As Jeff mentioned, you can use scales = "free_y" to allow different
> y-scales for the response in each facet, but you do not have the
> ability to control the ranges of the y-scales in each facet. That is
> controlled by the training process for scales in ggplot2. Generally
> speaking, it should be pretty close to what you want, but may not be
> ideal.
>
> Dennis
>
> On Fri, Nov 6, 2015 at 2:04 PM, David Doyle 
> wrote:
> > Hello Everyone,
> >
> > I'm using the following code to plot sulfate concentrations vs. time for
> > several groundwater wells at one time.  Normally I need the scales to all
> > be the same, but in the case of sulfate I need to use a different scale
> > for each well.  This is because some of my wells have very high / wide
> > ranges (MW04 ranges from 4 - 3,000) whereas some have very small ranges
> > (MW06 ranges from 13 - 34).
> >
> > Is there a way that I can have ggplot2 automatically scale each well, or
> > a way I could enter a scale range?  For example, I would like MW04 to
> > have a Y axis scale from 0 - 3,000 and MW06 to have a Y axis scale from
> > 0 - 40.
> > I am using
> > RStudio version 0.99.484
> > R i386 3.2.2
> > ggplot2 ver 1.0.1
> > in a Windows 7 environment.
> >
> > Thank you for your time
> > David Doyle
> >
> >
> > library(ggplot2)
> > SS <- read.csv("http://doylesdartden.com/Stats/SS.csv", sep=",")
> >
> > #Sets which are detections and nondetects
> > SS$Detections <- ifelse(SS$D_Sulfate==1, "Detected", "NonDetect")
> > png(file="Sulfate.png",width=2400,height=3000,res=300)
> > #does the plot
> > p <- ggplot(data = SS, aes(x=Year, y=Sulfate, col=Detections)) +
> >   geom_point(aes(shape=Detections))  +
> >
> >   ##sets the colors
> >   scale_colour_manual(values=c("black","red")) +
> >
> >   #location of the legend
> >   theme(legend.position=c("none")) +
> >
> >   #sets the line color, type and size
> >   geom_line(colour="black", linetype="dotted", size=0.5) +
> >   ylab("Sulfate (mg/L)") +
> >   ##Graph title
> >   ggtitle("Figure 6-30
> >   Sandstone Sulfate Time Series")
> >
> > ## does the graph using the Well IDs as the different wells.
> > p + facet_grid(Well ~ .)
> > dev.off()
> >


[R] NULL dev.lis()

2015-11-08 Thread Thomas Adams
All,

I have previously built R from source many times, generally without
problems. However, on my new Ubuntu 15.04 Linux system with R 3.2.2, when I
run the command dev.list() I get:

> dev.list()
NULL

At the completion of running ./configure, I have

R is now configured for x86_64-pc-linux-gnu

  Source directory:          .
  Installation directory:    /usr/local

  C compiler:                gcc -std=gnu99  -g -O2
  Fortran 77 compiler:       gfortran  -g -O2

  C++ compiler:              g++  -g -O2
  C++ 11 compiler:           g++  -std=c++11 -g -O2
  Fortran 90/95 compiler:    gfortran -g -O2
  Obj-C compiler:

  Interfaces supported:      X11
  External libraries:        readline, zlib, lzma, PCRE, curl
  Additional capabilities:   PNG, JPEG, TIFF, NLS, cairo, ICU
  Options enabled:           shared BLAS, R profiling

  Capabilities skipped:
  Options not enabled:       memory profiling

  Recommended packages:      yes

This issue is causing me problems with spplot, which I have posted on
r-sig-geo. R and the display of all other graphics seems to be fine,
otherwise. My previous installations of R would yield:

> dev.list()
X11cairo
   2

And I had no problems with spplot. Any thoughts?

Regards,
Tom



Re: [R] Alternatives for explicit for() loops

2015-11-08 Thread Maram SAlem
Thanks all for replying.

In fact I've used the Rprof() function and found out that the incomb()
function (in my code above) takes about 80% of the time, but I couldn't
figure out which part of the function is causing the delay. So I thought
that this may be due to the for() loops.
I MUST run this code for rather large values of n and m, so is there any
way to do that without having to wait for more than three days to get
an output? N.B. I'll have to repeat these runs maybe 70 or 80 times, and
this means a HUGE amount of time.
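
One way to narrow down which part of a function is slow is to move its
pieces into small named helpers, so that Rprof() can attribute time to each
of them. The sketch below shows only the mechanics: slow_part(), fast_part()
and whole() are hypothetical stand-ins, not pieces of the code in this
thread.

```r
# Hypothetical stand-ins for the pieces of a slow function
slow_part <- function(secs = 0.5) {
  t0 <- proc.time()[["elapsed"]]
  x <- 0
  # busy loop that keeps the CPU occupied for about `secs` seconds
  while (proc.time()[["elapsed"]] - t0 < secs) x <- x + sqrt(2)
  x
}
fast_part <- function() sum(1:10)
whole <- function() {
  fast_part()
  slow_part()
}

prof_file <- tempfile()
Rprof(prof_file, interval = 0.01)   # start sampling the call stack
invisible(whole())
Rprof(NULL)                         # stop profiling

prof <- summaryRprof(prof_file)
head(prof$by.total)                 # time per function, callees included
```

In the `by.total` table each helper gets its own row, which shows where the
time goes inside the outer function.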

I'd appreciate any sort of help.
Thanks in advance.

Maram Salem

On 6 November 2015 at 16:54, jim holtman  wrote:

> If you have code that is running for a long time, then take a small case
> that only runs for 5-10 minutes and turn on the RProfiler so that you can
> see where you are spending your time.  In most cases, it is probably not
> the 'for' loops that are causing the problem, but some function/calculation
> you are doing within the loop that is consuming the time, and until you
> determine what section of code that is, it is hard to tell exactly what the
> problem is, much less the solution.
>
>
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.
>
> On Wed, Nov 4, 2015 at 9:09 AM, Maram SAlem 
> wrote:
>
>> Hi Jim,
>>
>> Thanks a lot for replying.
>>
>> In fact I'm trying to run a simulation study that enables me to calculate
>> the Bayes risk of a sampling plan selected from progressively type-II
>> censored Weibull model. One of the steps involves evaluating the expected
>> test time, which is a rather complicated formula that involves nested
>> multiple summations where the counters of the summation signs are
>> dependent, that's why I thought of I should create the incomb() function
>> inside the loop, or may be I didn't figure out how to relate its arguments
>> to the ones inside the loop had I created it outside it.  I'm trying to
>> create a matrix of all the possible combinations involved in the summations
>> and then use the apply() function on each row of that matrix. The problem
>> is that the code I wrote works perfectly well for rather small values of
>> the sample size,n, and the censoring number, m (for example, n=8,m=4),but
>> when n and m are increased (say, n=25,m=15) the code keeps on running for
>> days with no output. That's why I thought I should try to avoid explicit
>> loops as much as possible, so I did my best in this regard but still the
>> code takes too long to execute,(more than three days), thus, i believe
>> there must be something wrong.
>>
>> Here's the full code:
>>
>> library(pbapply)
>> f1 <- function(n, m) {
>>   stopifnot(n > m)
>>   r0 <- t(diff(combn(n-1, m-1)) - 1L)
>>   r1 <- rep(seq(from=0, len=n-m+1),
>>             choose(seq(to=m-2, by=-1, len=n-m+1), m-2))
>>   cbind(r0[, ncol(r0):1, drop=FALSE], r1, deparse.level=0)
>> }
>> simpfun <- function(x, n, m, p, alpha, beta) {
>>   a <- factorial(n-m) / (prod(factorial(x)) * factorial((n-m) - sum(x)))
>>   b <- (m-1):1
>>   c <- a * (p^sum(x)) * ((1-p)^(((m-1)*(n-m)) - sum(x %*% as.matrix(b))))
>>   d <- n - cumsum(x) - (1:(m-1))
>>   e <- n * prod(d) * c
>>   LD <- list()
>>   for (i in 1:(m-1)) {
>>     LD[[i]] <- seq(0, x[i], 1)
>>   }
>>   LD[[m]] <- seq(0, (n-m-sum(x)), 1)
>>   LED <- expand.grid(LD)
>>   LED <- as.matrix(LED)
>>   store1 <- numeric(nrow(LED))
>>   for (j in 1:length(store1)) {
>>     incomb <- function(x, alpha, beta) {
>>       g <- ((-1)^(sum(LED[j,]))) * (gamma((1/beta)+1)) * ((alpha)^(-(1/beta)))
>>       h <- choose(x, LED[j,-m])
>>       ik <- prod(h) * choose((n-m-sum(x)), LED[j,m])
>>       lm <- cumsum(LED[j,-m]) + (1:(m-1))
>>       plm <- prod(lm)
>>       gil <- g*ik/(plm)
>>       hlm <- numeric(sum(LED[j,]) + (m-1))
>>       dsa <- length(hlm)
>>       for (i in 1:dsa) {
>>         ppp <- sum(LED[j,]) + (m-1)
>>         hlm[i] <- (choose(ppp,i)) * ((-1)^(i)) * ((i+1)^((-1)*((1/beta)+1)))
>>       }
>>       shl <- gil*(sum(hlm)+1)
>>       return(shl)
>>     }
>>     store1[j] <- incomb(x, alpha=0.2, beta=2)
>>   }
>>   val1 <- sum(store1)*e
>>   return(val1)
>> }
>>
>> # 's' is not defined in this post; presumably it is the matrix
>> # returned by f1(n, m)
>> va <- pbapply(s, 1, simpfun, n=6, m=4, p=0.3, alpha=0.2, beta=2)
>> EXP <- sum(va)
>>
>>
>> Any help would be greatly appreciated.
>> Thanks a lot  for your time.
>>
>> Best Regards,
>> Maram Salem
>>
>>
>> On 2 November 2015 at 00:27, jim holtman  wrote:
>>
>>> Why are you recreating the incomb function within the loop instead of
>>> defining it outside the loop?  Also you are referencing several variables
>>> that are global (e.g., m & j); you should be passing these in as parameters
>>> to the function.
>>>
>>>
>>> Jim Holtman
>>> Data Munger Guru
>>>
>>> What is the problem that you are trying to solve?
>>> Tell me 

[R-es] datos dependientes o independientes

2015-11-08 Thread Albert Montolio
Hi all,

I have a time series of the volume of internet business in Spain from
January 1996 to December 2008. I want to know whether the mean of the
period 1996-2000 and the mean of the period 2001-2008 are equal. To do so
I want to run a hypothesis test in R.

My question is: are the data dependent or independent? I think they are
independent, since the months in the first and second periods are
different. Am I right?

If so, which analysis should I run in R? One-way ANOVA, multi-factor
ANOVA? A paired t-test, I don't think so...

Many thanks.



-- 


*Albert Montolio Aguado*



Re: [R-es] desviacion estandard

2015-11-08 Thread Carlos J. Gil Bellosta
Hi, how are you?

What is happening to you is not that strange:

set.seed(1234)
muestra <- abs(rnorm(100))
sd(muestra)
#[1] 0.5811866

muestra.ceros <- c(muestra, rep(0, 1e5))
sd(muestra.ceros)
#[1] 0.03196273

In a sample of positive numbers, adding a zero (especially if it is far
from the mean) raises the variance. If I add another one, it probably
does too. But when I add a huge number of zeros, the variance tends to
zero.

If I then remove them, I am left with the original sample: the variance
grows even though the sample is "more tightly packed"!
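
The rise-then-fall pattern can be checked directly by varying how many zeros
are appended. This is only a sketch on the same simulated sample; the list
of counts in n_zeros is an arbitrary choice made for illustration.

```r
set.seed(1234)
muestra <- abs(rnorm(100))

# Standard deviation after appending k zeros, for several values of k
n_zeros <- c(0, 1, 10, 100, 1000, 100000)
sds <- sapply(n_zeros, function(k) sd(c(muestra, rep(0, k))))
data.frame(n_zeros, sd = sds)
```

The first row is the original sample; comparing the other rows with it
shows where adding zeros stops inflating the spread and starts swamping it.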

Regards,

Carlos J. Gil Bellosta
http://www.datanalytics.com

P.S.: The standard deviation depends linearly on the scale.
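
A quick check of that statement on a made-up vector: multiplying the data by
a constant multiplies the standard deviation by the same constant, whereas
adding a constant leaves it unchanged.

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)   # arbitrary example values

sd_original <- sd(x)
sd_rescaled <- sd(10 * x)        # change of units, e.g. cm to mm
sd_shifted  <- sd(x + 100)       # change of origin only

all.equal(sd_rescaled, 10 * sd_original)
all.equal(sd_shifted, sd_original)
```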

On 8 November 2015 at 12:16, Rubén Fernández-Casal wrote:
> The standard deviation does not depend on the scale. If you include values
> that are repeated or that have little variability, that is to be expected,
> even if they sit at one of the extremes...
>
> Regards, Rubén.
> On 7/11/2015 9:43, "Albert Montolio" wrote:
>
>> Hi all,
>>
>> I have a theoretical question. I have the evolution of a variable as a
>> function of time. There are 145 values. The first 13 are 0, and the rest
>> are increasing. I compute the standard deviation in R, both with all 145
>> samples (including the zeros) and with the 132 samples (excluding the zeros).
>>
>> I get a larger standard deviation when the zeros are excluded. How can
>> that be? It makes no sense to me.
>>
>> I attach the calculations in Excel. In principle, if I remove the minimum
>> of the series, the data should be more tightly packed, shouldn't they?
>>
>> --
>>
>>
>> *Albert Montolio Aguado*
>>


Re: [R] NULL dev.lis()

2015-11-08 Thread Pascal Oettli via R-help
Dear Tom,

Running R 3.2.2 on Ubuntu 15.04, if I run dev.list(), I get NULL. And
I guess it is the expected behavior, as per the help page, it "returns
the numbers of all open devices, except device 1, the null device".
So, if I run

x11()
dev.list()

I get

X11cairo
   2
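
The same behaviour can be reproduced without an X11 display. The sketch
below uses pdf(file = NULL), which opens a device without writing a file;
the variable names are only for illustration.

```r
graphics.off()          # make sure no device is open
before <- dev.list()    # NULL: only the null device (device 1) exists

pdf(file = NULL)        # open a device that writes no file
after <- dev.list()     # named integer vector with one open device
invisible(dev.off())    # close it again

before
after
```

So a NULL result simply means that no plot has opened a device yet in the
current session.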

HTH,
Pascal




-- 
Pascal Oettli
Project Scientist
JAMSTEC
Yokohama, Japan
