Re: [R] MonteCarlo sampling on profile log likelihood - package extRemes

2016-02-12 Thread Jue Lin-Ye
>
> Date: Wed, 10 Feb 2016 15:19:21 +
> From: Lila Collet 
> To: r-help@R-project.org
> Subject: [R] MonteCarlo sampling on profile log likelihood - package
> extRemes
>
>
> Hi
>
>
> I am using the package extRemes to assess 100-year return period
> runoffs with the GEV and GP distribution functions and the associated
> 95% confidence intervals.
>
> I use the MLE method for that.
>
>
> Now I would like to sample a few thousands values of return levels on
> the profile likelihood between the 95% confidence interval boundaries.
>
> I saw that the function "profliker" allows one to sample log-likelihood
> values along the profile likelihood.
>
> Is there any way to sample return levels or get the return levels
> corresponding to these log-likelihood values?
>
>
> Thanks for any help.
>
> Kind regards,
>
> LCollet
>
>
A thread with the subject "Generate random numbers under constrain", archived
in the r-help archive (https://stat.ethz.ch/pipermail/r-help/) around
Thu, 27 Nov 2014 19:20:41 +0100, might help you.
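
Also, a hedged sketch of pairing return levels with their profile
log-likelihood values in extRemes (argument names as I recall them from
?profliker -- do check the help page; 'runoff' and the CI bounds lo/hi are
placeholders for your data and your 95% interval):

library(extRemes)
fit <- fevd(runoff, type = "GEV")      # MLE fit of the GEV, as in the analysis
rl  <- seq(lo, hi, length.out = 5000)  # grid of return levels inside the CI
ll  <- profliker(fit, type = "return.level", return.period = 100,
                 xrange = c(lo, hi), nint = length(rl), plot = FALSE)
prof <- data.frame(return.level = rl, loglik = ll)  # level <-> log-likelihood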

-- 
Jue Lin-Ye

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R-es] Help: Parsimony Analysis of Endemicity

2016-02-12 Thread ANTONIO ZAMORA LOPEZ
Hello everyone,

I am looking for information on how to carry out a Parsimony Analysis of
Endemicity (PAE) with R. After gathering material on the subject, I have not
managed to work out which package to use or the steps to follow.

If any of you have performed this analysis, I would be very grateful for any
information about it.

Thanks in advance. Regards!

Antonio Zamora 

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R-es] Inverting dcast

2016-02-12 Thread Ruben Bermad
Many thanks to everyone who has taken an interest in my problem. Your
solutions have been a great help and work perfectly.

Subject: Re: [R-es] Inverting dcast
From: luisf...@yahoo.com
Date: Thu, 11 Feb 2016 18:41:33 +0100
CC: ruben...@hotmail.com; r-help-es@r-project.org
To: c...@qualityexcellence.es

Hi,

As always, I like to see different ways of doing the same thing and, where
possible, more efficient ones. If I have understood correctly, another way to
do the same is the following: using modular arithmetic, the matrix M can be
viewed as a vector of concatenated columns:

M <- matrix(c(5,NA,NA,NA,6,NA,7,NA,8),3,3)
linear.indices <- which(!is.na(M)) - 1
M3 <- cbind(columna = linear.indices %/% nrow(M) + 1,
            fila    = linear.indices %%  nrow(M) + 1,
            valor   = M[linear.indices + 1])

If the matrix M is large, I believe (and correct me if I am wrong) that this
solution is faster and more memory-efficient, and it needs no extra package.
As for keeping the same names, you can use colnames(M) and index it with the
first column of M3.

Regards,
Luisfo
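
For completeness, a base-R sketch along the same lines: which(..., arr.ind =
TRUE) returns the (row, column) pairs of the non-NA cells directly, and also
answers the naming question raised below when M has dimnames:

idx <- which(!is.na(M), arr.ind = TRUE)           # columns named "row", "col"
M3 <- data.frame(fila = idx[, "row"], columna = idx[, "col"], valor = M[idx])
# with dimnames on M, the original labels can be recovered:
# M3$fila <- rownames(M)[idx[, "row"]]; M3$columna <- colnames(M)[idx[, "col"]]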
On 11 Feb 2016, at 17:30, Carlos Ortega  wrote:

Hello,

Yes, the thing is that with data.tables you have to change the names with the
"setnames()" function, and "M2" is a data.table.

Regards,
Carlos Ortega
www.qualityexcellence.es

On 11 February 2016 at 17:26, Ruben Bermad 
wrote:


It seems to work and it is fast. Just one question: is it possible to have
the fila and columna variables carry the names that were in the matrix?
Following your code, I tried giving M2 the row.names of M:
row.names(M2) <- row.names(M)
but it does not work.
If I call row.names(M2) the new names appear, but then both in the display
and in the M3 step the consecutive names are used.
Do you know a way to put the original names back in?
Many thanks!
Date: Thu, 11 Feb 2016 15:08:01 +0100
From: onu...@unex.es
To: javier.ruben.marcu...@gmail.com
CC: ruben...@hotmail.com; r-help-es@r-project.org
Subject: Re: [R-es] Inverting dcast

With data.table everything can go very fast.
require(data.table)
M=matrix(c(5,NA,NA,NA,6,NA,7,NA,8),3,3)
M
     [,1] [,2] [,3]
[1,]    5   NA    7
[2,]   NA    6   NA
[3,]   NA   NA    8
M2=data.table(M)
M2
   V1 V2 V3
1:  5 NA  7
2: NA  6 NA
3: NA NA  8
M3=melt(M2,variable.name = "columna")
M3
   columna value
1:      V1     5
2:      V1    NA
3:      V1    NA
4:      V2    NA
5:      V2     6
6:      V2    NA
7:      V3     7
8:      V3    NA
9:      V3     8
M3[,.(fila=which(!is.na(value)),value=na.omit(value)),by=columna]
   columna fila value
1:      V1    1     5
2:      V2    2     6
3:      V3    1     7
4:      V3    3     8


Best regards, Olivier

- Original Message -
From: "Javier Marcuzzi" 
To: "Ruben Bermad" , r-help-es@r-project.org
Sent: Thursday, 11 February 2016 12:45:02
Subject: Re: [R-es] Inverting dcast

Dear Ruben Bermad,

Do you want something like a sparse matrix?

Javier Rubén Marcuzzi

From: Ruben Bermad
Sent: Thursday, 11 February 2016 9:40
To: r-help-es@r-project.org
Subject: [R-es] Inverting dcast

Hello everyone,
I wanted to ask whether you know of a way to invert the dcast function. I
want to transform a matrix into a data frame of three columns listing only
the cases where the row-column combination is different from NA.
I had thought of writing a loop that picks out the value for every
combination of row and column, but the problem is that with a 53000x5000
matrix it takes far too long, and I have many NA values that are of no use
to me.
Does anyone know how I could invert the dcast without going through every
combination?
Many thanks in advance. Kind regards,
Ruben


  [[alternative HTML version deleted]]



  [[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es




-- 
Saludos,
Carlos Ortega
www.qualityexcellence.es

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es
  
[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

[R] why is 9 after 10?

2016-02-12 Thread Federico Calboli
Hi All,

I have some data, one of the columns is a bunch of numbers from 6 to 41.

table(my.data[,2])

returns

  10   11   12   13   14   15   16   17   18   19   20   21   22   23   24   25 
  26   27   28   29   30   31   32   33   34   35   36   37 
1761 1782 1897 1749 1907 1797 1734 1810 1913 1988 1914 1822 1951 1973 1951 1947 
2067 1967 1812 2119 1999 2086 2133 2081 2165 2365 2330 2340 
  38   39   40   41    6    7    8    9
2681 2905 3399 3941 1648 1690 1727 1668 

whereas the reasonable expectation is that the numbers from 6 to 9 would come 
before 10 to 41.

How do I sort this incredibly silly behaviour so that my table follows a 
reasonable expectation that 9 comes before 10 (and so on and so forth)?

BW

F

--
Federico Calboli
Ecological Genetics Research Unit
Department of Biosciences
PO Box 65 (Biocenter 3, Viikinkaari 1)
FIN-00014 University of Helsinki
Finland

federico.calb...@helsinki.fi

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with truncated normal distribution

2016-02-12 Thread Bert Gunter
Define "truncated." (It is often confused with "censored".)  As
stated, it seems to me that you already have the answer. Do you have
data? -- i.e. what do you mean by "parameters" ?

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Feb 12, 2016 at 5:11 AM, Pamela Foggia  wrote:
> Hello,
> Do you know how to obtain the parameters of a theoretical normal
> distribution knowing the parameters of the same truncated normal
> distribution? Is there in R any function that can do it?
>
> Thanks in advance
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] confirm family==binomial and link==logistic

2016-02-12 Thread Bert Gunter
Not an answer ...

But note that your several stopifnot() statements can be combined into
1. See ?stopifnot .
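
For instance, a minimal sketch of the combined call:

toyfunction <- function(glmObject) {
  # stopifnot() accepts any number of expressions; all must evaluate to TRUE
  stopifnot(inherits(glmObject, "glm"),
            glmObject$family$family == "binomial",
            glmObject$family$link == "logit")
  glmObject
}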

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Feb 12, 2016 at 8:33 AM, Jacob Wegelin  wrote:
> To check that a regression object comes from logistic regression, I employ
> the following two lines:
>
> stopifnot(glmObject$family$family=="binomial")
>
> stopifnot(glmObject$family$link=="logit")
>
> For instance:
>
> toyfunction<-function(glmObject) {
> stopifnot(inherits(glmObject, "glm"))
> stopifnot(glmObject$family$family=="binomial")
> stopifnot(glmObject$family$link=="logit")
> cat("okay, I guess\n")
> glmObject
> }
>
> mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
>
> someobject<- glm(admit~gre+gpa, data=mydata)
>
> toyfunction(someobject)
>
> someobject<- glm(admit~gre+gpa, data=mydata, family="binomial")
>
> toyfunction(someobject)
>
> But Doug Bates once stated that it's preferable to use extractor functions
> (and perhaps other ready-made functions?) rather than "deconstructing" an
> object (his term), as I do here.
>
> Accordingly, is there a smarter way to perform the check that I perform
> inside toyfunction?
>
> Thanks for any insight
>
> Jacob A. Wegelin
> Assistant Professor
> C. Kenneth and Dianne Wright Center for Clinical and Translational Research
> Department of Biostatistics
> Virginia Commonwealth University
> 830 E. Main St., Seventh Floor
> P. O. Box 980032
> Richmond VA 23298-0032
> U.S.A. CTSA grant: UL1TR58
> E-mail: jacobwege...@fastmail.fm URL: http://www.people.vcu.edu/~jwegelin
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] lapply on list of lists

2016-02-12 Thread Stefano Sofia
Dear R list users,
I have three lists of data frames, respectively temp_list, wind_list and 
snow_list.
The elements of these three lists are

temp_list$station1, temp_list$station2 and temp_list$station3 with columns date 
and temp;
wind_list$station1, wind_list$station2 and wind_list$station3 with columns 
date, wind_vel and wind_dir;
snow_list$station1, snow_list$station2 and snow_list$station3 with columns date 
and hs

where date has been transformed to character.
I need to merge temp_list$station1, wind_list$station1 and snow_list$station1, 
and same thing for station2 and station3.

If I create a list
list_all <- list(temp_list$station1, wind_list$station1, snow_list$station1)

then
Reduce(function(x, y) merge(x, y, by=c("date"), all=TRUE), list_all)

will do it. But then I have to create the other two lists and apply Reduce
again.

I would like to create a list of lists and use lapply twice in order to make
this process completely automatic.
I tried

list_all <- list(temp_list, wind_list, snow_list)
names(list_all) <- c("temp", "wind", "snow")
lapply(names(list_all), function(val){list_all$val, lapply(c("station1", 
"station2", "station3"), function(val){Reduce(function(x, y) merge(x$val, 
y$val, by=c("date"), all=TRUE), list_all)}), list_all})

but it gives me a syntax error and I am struggling to make it work.
Could somebody help me to create the correct loop?

Thank you for your help
Stefano
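
(One way to write that loop -- a sketch, assuming the station names
station1..station3 are common to all three lists; a single lapply over the
station names is enough, no nested lapply is needed:)

stations <- c("station1", "station2", "station3")
merged <- lapply(stations, function(st) {
  # collect the three per-station data frames, then merge them on date
  Reduce(function(x, y) merge(x, y, by = "date", all = TRUE),
         list(temp_list[[st]], wind_list[[st]], snow_list[[st]]))
})
names(merged) <- stations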



IMPORTANT NOTICE: This e-mail message is intended to be received only by 
persons entitled to receive the confidential information it may contain. E-mail 
messages to clients of Regione Marche may contain information that is 
confidential and legally privileged. Please do not read, copy, forward, or 
store this message unless you are an intended recipient of it. If you have 
received this message in error, please forward it to the sender and delete it 
completely from your computer system.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Why two curves and numerical integration look so different?

2016-02-12 Thread C W
On a side note, is it ok to do?

> which(max(p_x))
and use that instead of numerical integration to get E[X]?

I will try both and report back!  Thank you expeRts
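
(A note on the distinction, as a sketch -- x, p_x and dx below are
hypothetical grid objects: the evaluation points, the unnormalized density on
them, and the grid spacing:)

mode_x <- x[which.max(p_x)]   # which(max(p_x)) is presumably meant as
                              # which.max(); it locates the mode, not E[X]
mean_x <- sum(x * p_x * dx) / sum(p_x * dx)   # E[X] via a normalized Riemann sum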

On Fri, Feb 12, 2016 at 11:29 AM, C W  wrote:

> Hi Peter,
>
> Great, let me try that and get back to you on my findings in a few hours!
>  :)
>
> On Fri, Feb 12, 2016 at 11:09 AM, peter dalgaard  wrote:
>
>> I don't see where FAQ 7.31 comes in either (for once!...)
>>
>> However, either the density is unnormalized and the integral is not 1, or
>> the integral is 1 and it is normalized. The one in the picture clearly does
>> not integrate to one. You can fit a rectangle of size 0.1 by 1e191 under
>> the curve so the integral should be > 1e190 .
>>
>> As depicted, I don't see why a plain integral from .5 to 1.5 shouldn't
>> work.
>>
>> -pd
>>
>> On 12 Feb 2016, at 16:57 , C W  wrote:
>>
>> > Hi Bert,
>> >
>> > Yay fantasyland!
>> >
>> > In all seriousness, You are referring to this?
>> >
>> https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
>> >
>> > In particular, you mean this: .Machine$double.eps ^ 0.5?
>> >
>> > Thanks!
>> >
>> > On Fri, Feb 12, 2016 at 10:53 AM, Bert Gunter 
>> > wrote:
>> >
>> >> You are working in fantasyland. Your density is nonsense.
>> >>
>> >> Please see FAQ 7.31 for links to computer precision of numeric
>> >> calculations.
>> >>
>> >>
>> >> Cheers,
>> >> Bert
>> >> Bert Gunter
>> >>
>> >> "The trouble with having an open mind is that people keep coming along
>> >> and sticking things into it."
>> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> >>
>> >>
>> >> On Fri, Feb 12, 2016 at 7:36 AM, C W  wrote:
>> >>> Hi David,
>> >>>
>> >>> This is the Gaussian looking distribution I am trying to integrate.
>> >>>
>> >>
>> https://drive.google.com/file/d/0B2xN0-A6iTB4NThIZ2tYdGxHc00/view?usp=sharing
>> >>>
>> >>> Notice the unnormalized density goes up as high as 2.5*10^191.
>> >>>
>> >>> I tried to create small intervals like
>>  seq(0.5, 1.3, by = 10^(-8))
>> >>>
>> >>> but that doesn't seem to be good enough, as we know, it should
>> integrate
>> >> to
>> >>> 1.
>> >>>
>> >>> On Thu, Feb 11, 2016 at 3:32 PM, David Winsemius <
>> dwinsem...@comcast.net
>> >>>
>> >>> wrote:
>> >>>
>> 
>> > On Feb 11, 2016, at 11:30 AM, C W  wrote:
>> >
>> > Hi David,
>> >
>> > My real function is actually a multivariate normal, the simple toy
>> 1-d
>>  normal won't work.
>> >
>> > But, you gave me an idea about restricting the bounds, and focus
>>  integrating on that.  I will get back to you if I need any further
>>  assistance.
>> 
>>  You'll probably need to restrict your bounds even more severely than
>> I
>> >> did
>>  in the 1-d case (using 10 SD's on either side of the mean) . You
>> might
>> >> need
>>  adequate representation of points near the center of your
>> >> hyper-rectangles.
>>  At least that's my armchair notion since I expect the densities tail
>> off
>>  rapidly in the corners. You can shoehorn multivariate integration
>> around
>>  the `integrate` function but it's messy and inefficient. There are
>> other
>>  packages that would be better choices. There's an entire section on
>>  numerical differentiation and integration in CRAN Task View:
>> Numerical
>>  Mathematics.
>> 
>>  --
>>  David.
>> 
>> 
>> >
>> > Thank you so much!
>> >
>> > On Thu, Feb 11, 2016 at 2:06 PM, David Winsemius <
>> >> dwinsem...@comcast.net>
>>  wrote:
>> >
>> >> On Feb 11, 2016, at 9:20 AM, C W  wrote:
>> >>
>> >> I want to do numerical integration w.r.t. mu: P(mu) × N(mu,
>> 0.1)
>> >>
>> >> Because the variance is small, it results in density like:
>> >> 7.978846e+94
>> >>
>> >> Is there any good suggestion for this?
>> >
>> > So what's the difficulty? It's rather like the Dirac function.
>> >
>> >> integrate( function(x) dnorm(x, sd=0.1), -.0001,0.0001)
>> > 1 with absolute error < 7.4e-05
>> >
>> >
>> > --
>> > David.
>> >
>> >>
>> >> Thanks so much!
>> >>
>> >>
>> >> On Thu, Feb 11, 2016 at 9:14 AM, C W  wrote:
>> >>
>> >>> Wow, thank you, that was very clear.  Let me give it some more
>> runs
>>  and
>> >>> investigate this.
>> >>>
>> >>> On Thu, Feb 11, 2016 at 12:31 AM, William Dunlap <
>> >> wdun...@tibco.com>
>> >>> wrote:
>> >>>
>>  Most of the mass of that distribution is within 3e-100 of 2.
>>  You have to be pretty lucky to have a point in sequence
>>  land there.  (You will get at most one point there because
>>  the difference between 2 and its nearest neightbors is on
>>  the order of 1e-16.)
>> 

Re: [R] confirm family==binomial and link==logistic

2016-02-12 Thread Marc Schwartz

> On Feb 12, 2016, at 10:33 AM, Jacob Wegelin  wrote:
> 
> To check that a regression object comes from logistic regression, I employ 
> the following two lines:
> 
>   stopifnot(glmObject$family$family=="binomial")
> 
>   stopifnot(glmObject$family$link=="logit")
> 
> For instance:
> 
> toyfunction<-function(glmObject) {
>   stopifnot(inherits(glmObject, "glm"))
>   stopifnot(glmObject$family$family=="binomial")
>   stopifnot(glmObject$family$link=="logit")
>   cat("okay, I guess\n")
>   glmObject
> }
> 
> mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")
> 
> someobject<- glm(admit~gre+gpa, data=mydata)
> 
> toyfunction(someobject)
> 
> someobject<- glm(admit~gre+gpa, data=mydata, family="binomial")
> 
> toyfunction(someobject)
> 
> But Doug Bates once stated that it's preferable to use extractor functions 
> (and perhaps other ready-made functions?) rather than "deconstructing" an 
> object (his term), as I do here.
> 
> Accordingly, is there a smarter way to perform the check that I perform 
> inside toyfunction?
> 
> Thanks for any insight
> 
> Jacob A. Wegelin


Hi Jacob,

The rationale behind Doug's comment is that if there is a pre-existing 
extractor function, you are at less risk due to future possible changes in the 
underlying object structure, than if you try to access an object element 
directly. 

If the underlying object structure should change in the future, you never know 
what result you might get by accessing the element directly, if you get one at 
all. 

If you use the extractor function, that would be (or should be) modified to 
reflect the change in the underlying object.

For your examples above, ?family should work, but returns more than just the 
'family' and 'link' components, which print.family() displays:

someobject<- glm(admit~gre+gpa, data=mydata)
> family(someobject)

Family: gaussian 
Link function: identity 


someobject<- glm(admit~gre+gpa, data=mydata, family="binomial")
> family(someobject)

Family: binomial 
Link function: logit 


So, you could feasibly use:

  family(someobject)$family
  family(someobject)$link

and so forth, to perform your checks. If you look at the output of 
str(family(someobject)), you will see the other elements contained.
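
So a sketch of the original check written against the extractor only:

check_logistic <- function(obj) {
  fam <- family(obj)   # extractor; insulated from changes to the object layout
  stopifnot(identical(fam$family, "binomial"),
            identical(fam$link, "logit"))
  invisible(obj)
}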

Regards,

Marc Schwartz

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why two curves and numerical integration look so different?

2016-02-12 Thread C W
Hi Bert,

Yay fantasyland!

In all seriousness, You are referring to this?
https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f

In particular, you mean this: .Machine$double.eps ^ 0.5?

Thanks!

On Fri, Feb 12, 2016 at 10:53 AM, Bert Gunter 
wrote:

> You are working in fantasyland. Your density is nonsense.
>
> Please see FAQ 7.31 for links to computer precision of numeric
> calculations.
>
>
> Cheers,
> Bert
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Fri, Feb 12, 2016 at 7:36 AM, C W  wrote:
> > Hi David,
> >
> > This is the Gaussian looking distribution I am trying to integrate.
> >
> https://drive.google.com/file/d/0B2xN0-A6iTB4NThIZ2tYdGxHc00/view?usp=sharing
> >
> > Notice the unnormalized density goes up as high as 2.5*10^191.
> >
> > I tried to create small intervals like
> >> seq(0.5, 1.3, by = 10^(-8))
> >
> > but that doesn't seem to be good enough, as we know, it should integrate
> to
> > 1.
> >
> > On Thu, Feb 11, 2016 at 3:32 PM, David Winsemius  >
> > wrote:
> >
> >>
> >> > On Feb 11, 2016, at 11:30 AM, C W  wrote:
> >> >
> >> > Hi David,
> >> >
> >> > My real function is actually a multivariate normal, the simple toy 1-d
> >> normal won't work.
> >> >
> >> > But, you gave me an idea about restricting the bounds, and focus
> >> integrating on that.  I will get back to you if I need any further
> >> assistance.
> >>
> >> You'll probably need to restrict your bounds even more severely than I
> did
> >> in the 1-d case (using 10 SD's on either side of the mean) . You might
> need
> >> adequate representation of points near the center of your
> hyper-rectangles.
> >> At least that's my armchair notion since I expect the densities tail off
> >> rapidly in the corners. You can shoehorn multivariate integration around
> >> the `integrate` function but it's messy and inefficient. There are other
> >> packages that would be better choices. There's an entire section on
> >> numerical differentiation and integration in CRAN Task View: Numerical
> >> Mathematics.
> >>
> >> --
> >> David.
> >>
> >>
> >> >
> >> > Thank you so much!
> >> >
> >> > On Thu, Feb 11, 2016 at 2:06 PM, David Winsemius <
> dwinsem...@comcast.net>
> >> wrote:
> >> >
> >> > > On Feb 11, 2016, at 9:20 AM, C W  wrote:
> >> > >
> >> > > I want to do numerical integration w.r.t. mu: P(mu) × N(mu, 0.1)
> >> > >
> >> > > Because the variance is small, it results in density like:
> 7.978846e+94
> >> > >
> >> > > Is there any good suggestion for this?
> >> >
> >> > So what's the difficulty? It's rather like the Dirac function.
> >> >
> >> > > integrate( function(x) dnorm(x, sd=0.1), -.0001,0.0001)
> >> > 1 with absolute error < 7.4e-05
> >> >
> >> >
> >> > --
> >> > David.
> >> >
> >> > >
> >> > > Thanks so much!
> >> > >
> >> > >
> >> > > On Thu, Feb 11, 2016 at 9:14 AM, C W  wrote:
> >> > >
> >> > >> Wow, thank you, that was very clear.  Let me give it some more runs
> >> and
> >> > >> investigate this.
> >> > >>
> >> > >> On Thu, Feb 11, 2016 at 12:31 AM, William Dunlap <
> wdun...@tibco.com>
> >> > >> wrote:
> >> > >>
> >> > >>> Most of the mass of that distribution is within 3e-100 of 2.
> >> > >>> You have to be pretty lucky to have a point in sequence
> >> > >>> land there.  (You will get at most one point there because
>> > >>> the difference between 2 and its nearest neighbors is on
> >> > >>> the order of 1e-16.)
> >> > >>>
> >> > >>> seq(-2,4,len=101), as used by default in curve, does include 2
> >> > >>> but seq(-3,4,len=101) and seq(-2,4,len=100) do not so
> >> > >>> curve(..., -3, 4, 101) and curve(..., -2, 4, 100) will not show
> the
> >> bump.
>> > >>> The same principle holds for numerical integration.
> >> > >>>
> >> > >>>
> >> > >>> Bill Dunlap
> >> > >>> TIBCO Software
> >> > >>> wdunlap tibco.com
> >> > >>>
> >> > >>> On Wed, Feb 10, 2016 at 6:37 PM, C W  wrote:
> >> > >>>
> >> >  Dear R,
> >> > 
> >> >  I am graphing the following normal density curve.  Why does it
> look
> >> so
> >> >  different?
> >> > 
> >> >  # the curves
> >> >  x <- seq(-2, 4, by=0.1)
> >> >  curve(dnorm(x, 2, 10^(-100)), -4, 4)  #right answer
> >> >  curve(dnorm(x, 2, 10^(-100)), -3, 4)  #changed -4 to -3, I get
> wrong
> >> >  answer
> >> > 
>> >  Why is the second curve flat?  I just changed it from -4 to -3.
> >> There is
> >> >  no density in that region.
> >> > 
> >> > 
> >> >  Also, I am doing numerical integration.  Why are they so
> different?
> >> > 
> >> > > x <- seq(-2, 4, by=0.1)
> >> > > sum(x*dnorm(x, 2, 10^(-100)))*0.1
> >> >  [1] 7.978846e+94
> >> > > x <- seq(-1, 4, 

[R] Help with the Twitter Analysis

2016-02-12 Thread SHIVI BHATIA
Dear Team, 

 

Kindly refer to the error below while generating a Twitter Analysis for my
firm:

 

# Warning message:

# In doRppAPICall("search/tweets", n, params = params, retryOnRateLimit =
retryOnRateLimit,  :

 

As I checked on forums such as StackOverflow and other R-related literature,
this warning occurs when fewer tweets are available than were requested. On
further investigation I learned that, due to Twitter API restrictions, we
cannot fetch older tweets, i.e. any tweet more than 6-7 days old. This is a
very big hurdle for building a sentiment analysis for the company, as the
available time window is so short.

 

Please advise whether there is a workaround, or the best possible
alternative.

 

Thanks, Shivi

Mb: 9891002021

 

 

This e-mail is confidential. It may also be legally privileged. If you are not 
the addressee you may not copy, forward, disclose or use any part of it. If you 
have received this message in error, please delete it and all copies from your 
system and notify the sender immediately by return e-mail. Internet 
communications cannot be guaranteed to be timely, secure, error or virus-free. 
The sender does not accept liability for any errors or omissions.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] confirm family==binomial and link==logistic

2016-02-12 Thread Jacob Wegelin

To check that a regression object comes from logistic regression, I employ the 
following two lines:

stopifnot(glmObject$family$family=="binomial")

stopifnot(glmObject$family$link=="logit")

For instance:

toyfunction<-function(glmObject) {
stopifnot(inherits(glmObject, "glm"))
stopifnot(glmObject$family$family=="binomial")
stopifnot(glmObject$family$link=="logit")
cat("okay, I guess\n")
glmObject
}

mydata <- read.csv("http://www.ats.ucla.edu/stat/data/binary.csv")

someobject<- glm(admit~gre+gpa, data=mydata)

toyfunction(someobject)

someobject<- glm(admit~gre+gpa, data=mydata, family="binomial")

toyfunction(someobject)

But Doug Bates once stated that it's preferable to use extractor functions (and perhaps 
other ready-made functions?) rather than "deconstructing" an object (his term), 
as I do here.

Accordingly, is there a smarter way to perform the check that I perform inside 
toyfunction?

Thanks for any insight

Jacob A. Wegelin
Assistant Professor
C. Kenneth and Dianne Wright Center for Clinical and Translational Research
Department of Biostatistics
Virginia Commonwealth University
830 E. Main St., Seventh Floor
P. O. Box 980032
Richmond VA 23298-0032
U.S.A. 
CTSA grant: UL1TR58
E-mail: jacobwege...@fastmail.fm 
URL: http://www.people.vcu.edu/~jwegelin


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with the Twitter Analysis

2016-02-12 Thread Boris Steipe
Since this is due to a throttle in Twitter's API, I would assume that attempts 
to bypass this will violate Twitter's TOS. You'll either have to purchase the 
data you need, or build your own in-house data set, going forward.
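
A hedged sketch of the in-house route with the twitteR package (assuming the
session is already authenticated via setup_twitter_oauth(); run it on a
schedule, e.g. a daily cron job, so the local archive grows past the 6-7 day
window):

library(twitteR)
batch <- searchTwitter("your firm", n = 1000, retryOnRateLimit = 120)
df <- twListToDF(batch)              # flatten the list of tweets to a data frame
new <- !file.exists("tweets.csv")    # write a header only on the first run
write.table(df, "tweets.csv", append = !new, sep = ",",
            row.names = FALSE, col.names = new)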


B.

On Feb 12, 2016, at 8:18 AM, SHIVI BHATIA  wrote:

> Dear Team, 
> 
> 
> 
> Kindly refer to the error below while generating a Twitter Analysis for my
> firm:
> 
> 
> 
> # Warning message:
> 
> # In doRppAPICall("search/tweets", n, params = params, retryOnRateLimit =
> retryOnRateLimit,  :
> 
> 
> 
> As I checked on forums such as StackOverflow and other R-related literature,
> this warning occurs when fewer tweets are available than were requested. On
> further investigation I learned that, due to Twitter API restrictions, we
> cannot fetch older tweets, i.e. any tweet more than 6-7 days old. This is a
> very big hurdle for building a sentiment analysis for the company, as the
> available time window is so short.
> 
> 
> 
> Please advise whether there is a workaround, or the best possible
> alternative.
> 
> 
> 
> Thanks, Shivi
> 
> Mb: 9891002021
> 
> 
> 
> 
> 
> This e-mail is confidential. It may also be legally privileged. If you are 
> not the addressee you may not copy, forward, disclose or use any part of it. 
> If you have received this message in error, please delete it and all copies 
> from your system and notify the sender immediately by return e-mail. Internet 
> communications cannot be guaranteed to be timely, secure, error or 
> virus-free. The sender does not accept liability for any errors or omissions.
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why is 9 after 10?

2016-02-12 Thread Fox, John
Dear Federico,

Might my.data[, 2] contain character data, which therefore would be sorted in 
this manner? For example:

> x <- sample(6:37, 1000, replace=TRUE)
> table(x)
x
 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
32 33 34 35 36 37 
29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23 32 
35 39 31 40 35 29 
> y <- as.character(x)
> table(y)
y
10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 
36 37  6  7  8  9 
41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23 32 35 39 31 40 
35 29 29 30 35 29

I hope this helps,
 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario
Canada L8S 4M4
Web: socserv.mcmaster.ca/jfox




> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Federico
> Calboli
> Sent: February 12, 2016 10:13 AM
> To: R Help 
> Subject: [R] why is 9 after 10?
> 
> Hi All,
> 
> I have some data, one of the columns is a bunch of numbers from 6 to 41.
> 
> table(my.data[,2])
> 
> returns
> 
>   10   11   12   13   14   15   16   17   18   19   20   21   22   23   24   
> 25   26   27   28   29
> 30   31   32   33   34   35   36   37
> 1761 1782 1897 1749 1907 1797 1734 1810 1913 1988 1914 1822 1951 1973 1951
> 1947 2067 1967 1812 2119 1999 2086 2133 2081 2165 2365 2330 2340
>   38   39   40   41    6    7    8    9
> 2681 2905 3399 3941 1648 1690 1727 1668
> 
> whereas the reasonable expectation is that the numbers from 6 to 9 would
> come before 10 to 41.
> 
> How do I sort this incredibly silly behaviour so that my table follows a
> reasonable expectation that 9 comes before 10 (and so on and so forth)?
> 
> BW
> 
> F
> 
> --
> Federico Calboli
> Ecological Genetics Research Unit
> Department of Biosciences
> PO Box 65 (Biocenter 3, Viikinkaari 1)
> FIN-00014 University of Helsinki
> Finland
> 
> federico.calb...@helsinki.fi
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why is 9 after 10?

2016-02-12 Thread Federico Calboli
Dear John,

that is fortunately not the case; I just managed to figure out that the
problem was that, in the data-reshaping pipeline, the numeric column was
transformed into a factor.
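
(For the record, the usual fix in that situation -- a sketch, assuming the
column ended up as a factor or character vector of numeric labels -- is to
convert via character before numeric, so the labels rather than the internal
level codes are recovered:)

my.data[, 2] <- as.numeric(as.character(my.data[, 2]))
table(my.data[, 2])   # 6:9 now tabulate before 10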

Many thanks for your time.

BW

F



> On 12 Feb 2016, at 17:22, Fox, John  wrote:
> 
> Dear Federico,
> 
> Might my.data[, 2] contain character data, which therefore would be sorted in 
> this manner? For example:
> 
>> x <- sample(6:37, 1000, replace=TRUE)
>> table(x)
> x
> 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
> 32 33 34 35 36 37 
> 29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23 32 
> 35 39 31 40 35 29 
>> y <- as.character(x)
>> table(y)
> y
> 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 
> 36 37  6  7  8  9 
> 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23 32 35 39 31 40 
> 35 29 29 30 35 29
> 
> I hope this helps,
> John
> 
> -
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
> 
> 
> 
> 
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Federico
>> Calboli
>> Sent: February 12, 2016 10:13 AM
>> To: R Help 
>> Subject: [R] why is 9 after 10?
>> 
>> Hi All,
>> 
>> I have some data, one of the columns is a bunch of numbers from 6 to 41.
>> 
>> table(my.data[,2])
>> 
>> returns
>> 
>>  10   11   12   13   14   15   16   17   18   19   20   21   22   23   24   
>> 25   26   27   28   29
>> 30   31   32   33   34   35   36   37
>> 1761 1782 1897 1749 1907 1797 1734 1810 1913 1988 1914 1822 1951 1973 1951
>> 1947 2067 1967 1812 2119 1999 2086 2133 2081 2165 2365 2330 2340
>>  38   39   40   41    6    7    8    9
>> 2681 2905 3399 3941 1648 1690 1727 1668
>> 
>> whereas the reasonable expectation is that the numbers from 6 to 9 would
>> come before 10 to 41.
>> 
>> How do I sort this incredibly silly behaviour so that my table follows a
>> reasonable expectation that 9 comes before 10 (and so on and so forth)?
>> 
>> BW
>> 
>> F
>> 
>> --
>> Federico Calboli
>> Ecological Genetics Research Unit
>> Department of Biosciences
>> PO Box 65 (Biocenter 3, Viikinkaari 1)
>> FIN-00014 University of Helsinki
>> Finland
>> 
>> federico.calb...@helsinki.fi
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.

--
Federico Calboli
Ecological Genetics Research Unit
Department of Biosciences
PO Box 65 (Biocenter 3, Viikinkaari 1)
FIN-00014 University of Helsinki
Finland

federico.calb...@helsinki.fi

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with truncated normal distribution

2016-02-12 Thread Pamela Foggia
Hello,
Do you know how to obtain the parameters of a theoretical normal
distribution knowing the parameters of the same truncated normal
distribution? Is there in R any function that can do it?

Thanks in advance
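
(If by "parameters" you mean you have data from the truncated distribution --
Bert's question above -- one hedged sketch is direct maximum likelihood with
the 'truncnorm' package, assuming the truncation bounds a and b are known and
x holds the truncated sample:)

library(truncnorm)
negloglik <- function(par, x, a, b) {
  # par[1], par[2]: mean and sd of the *untruncated* parent normal
  -sum(log(dtruncnorm(x, a = a, b = b, mean = par[1], sd = par[2])))
}
fit <- optim(c(mean(x), sd(x)), negloglik, x = x, a = 0, b = Inf)
fit$par   # estimated mean and sd of the parent normal
# (a fuller version would constrain sd > 0, e.g. by optimizing log(sd))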

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why two curves and numerical integration look so different?

2016-02-12 Thread C W
Hi David,

This is the Gaussian looking distribution I am trying to integrate.
https://drive.google.com/file/d/0B2xN0-A6iTB4NThIZ2tYdGxHc00/view?usp=sharing

Notice the unnormalized density goes up as high as 2.5*10^191.

I tried to create small intervals like
> seq(0.5, 1.3, by = 10^(-8))

but that doesn't seem to be good enough, as we know, it should integrate to
1.

On Thu, Feb 11, 2016 at 3:32 PM, David Winsemius 
wrote:

>
> > On Feb 11, 2016, at 11:30 AM, C W  wrote:
> >
> > Hi David,
> >
> > My real function is actually a multivariate normal, the simple toy 1-d
> normal won't work.
> >
> > But, you gave me an idea about restricting the bounds, and focus
> integrating on that.  I will get back to you if I need any further
> assistance.
>
> You'll probably need to restrict your bounds even more severely than I did
> in the 1-d case (using 10 SD's on either side of the mean) . You might need
> adequate representation of points near the center of your hyper-rectangles.
> At least that's my armchair notion since I expect the densities tail off
> rapidly in the corners. You can shoehorn multivariate integration around
> the `integrate` function but it's messy and inefficient. There are other
> packages that would be better choices. There's an entire section on
> numerical differentiation and integration in CRAN Task View: Numerical
> Mathematics.
>
> --
> David.
>
>
> >
> > Thank you so much!
> >
> > On Thu, Feb 11, 2016 at 2:06 PM, David Winsemius 
> wrote:
> >
> > > On Feb 11, 2016, at 9:20 AM, C W  wrote:
> > >
> > > I want to do numerical integration w.r.t. mu: P(mu) × N(mu, 0.1)
> > >
> > > Because the variance is small, it results in density like: 7.978846e+94
> > >
> > > Is there any good suggestion for this?
> >
> > So what's the difficulty? It's rather like the Dirac function.
> >
> > > integrate( function(x) dnorm(x, sd=0.1), -.0001,0.0001)
> > 1 with absolute error < 7.4e-05
> >
> >
> > --
> > David.
> >
> > >
> > > Thanks so much!
> > >
> > >
> > > On Thu, Feb 11, 2016 at 9:14 AM, C W  wrote:
> > >
> > >> Wow, thank you, that was very clear.  Let me give it some more runs
> and
> > >> investigate this.
> > >>
> > >> On Thu, Feb 11, 2016 at 12:31 AM, William Dunlap 
> > >> wrote:
> > >>
> > >>> Most of the mass of that distribution is within 3e-100 of 2.
> > >>> You have to be pretty lucky to have a point in sequence
> > >>> land there.  (You will get at most one point there because
> > >>> the difference between 2 and its nearest neighbors is on
> > >>> the order of 1e-16.)
> > >>>
> > >>> seq(-2,4,len=101), as used by default in curve, does include 2
> > >>> but seq(-3,4,len=101) and seq(-2,4,len=100) do not so
> > >>> curve(..., -3, 4, 101) and curve(..., -2, 4, 100) will not show the
> bump.
> > >>> The same principle holds for numerical integration.
> > >>>
> > >>>
> > >>> Bill Dunlap
> > >>> TIBCO Software
> > >>> wdunlap tibco.com
> > >>>
> > >>> On Wed, Feb 10, 2016 at 6:37 PM, C W  wrote:
> > >>>
> >  Dear R,
> > 
> >  I am graphing the following normal density curve.  Why does it look
> so
> >  different?
> > 
> >  # the curves
> >  x <- seq(-2, 4, by=0.1)
> >  curve(dnorm(x, 2, 10^(-100)), -4, 4)  #right answer
> >  curve(dnorm(x, 2, 10^(-100)), -3, 4)  #changed -4 to -3, I get wrong
> >  answer
> > 
> >  Why is the second curve flat?  I just changed it from -4 to -3.
> There is
> >  no density in that region.
> > 
> > 
> >  Also, I am doing numerical integration.  Why are they so different?
> > 
> > > x <- seq(-2, 4, by=0.1)
> > > sum(x*dnorm(x, 2, 10^(-100)))*0.1
> >  [1] 7.978846e+94
> > > x <- seq(-1, 4, by=0.1) #changed -2 to -1
> > > sum(x*dnorm(x, 2, 10^(-100)))*0.1
> >  [1] 0
> > 
> >  What is going on here?  What am I doing wrong?
> > 
> >  Thanks so much!
> > 
> > [[alternative HTML version deleted]]
> > 
> >  __
> >  R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >  https://stat.ethz.ch/mailman/listinfo/r-help
> >  PLEASE do read the posting guide
> >  http://www.R-project.org/posting-guide.html
> >  and provide commented, minimal, self-contained, reproducible code.
> > 
> > >>>
> > >>>
> > >>
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > David Winsemius
> > Alameda, CA, USA
> >
> >
>
> David Winsemius
> Alameda, CA, USA
>
>


Re: [R] why is 9 after 10?

2016-02-12 Thread Fox, John
Dear Federico,

> -Original Message-
> From: Federico Calboli [mailto:federico.calb...@helsinki.fi]
> Sent: February 12, 2016 10:27 AM
> To: Fox, John 
> Cc: R Help 
> Subject: Re: [R] why is 9 after 10?
> 
> Dear John,
> 
> that is fortunatey not the case, I just managed to figure out that the problem
> was that in the data reshaping pipeline the numeric column was transformed
> into a factor.

But that shouldn't have this effect, I think:

> z <- as.factor(x)
> table(z)
z
 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 
32 33 34 35 36 37 
29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23 32 
35 39 31 40 35 29

> levels(z)
 [1] "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" 
"21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31"
[27] "32" "33" "34" "35" "36" "37"

Best,
 John

> 
> Many thanks for your time.
> 
> BW
> 
> F
> 
> 
> 
> > On 12 Feb 2016, at 17:22, Fox, John  wrote:
> >
> > Dear Federico,
> >
> > Might my.data[, 2] contain character data, which therefore would be
> sorted in this manner? For example:
> >
> >> x <- sample(6:37, 1000, replace=TRUE)
> >> table(x)
> > x
> > 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
> > 30 31 32 33 34 35 36 37
> > 29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31
> > 34 23 32 35 39 31 40 35 29
> >> y <- as.character(x)
> >> table(y)
> > y
> > 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
> > 33 34 35 36 37  6  7  8  9
> > 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23 32 35
> > 39 31 40 35 29 29 30 35 29
> >
> > I hope this helps,
> > John
> >
> > -
> > John Fox, Professor
> > McMaster University
> > Hamilton, Ontario
> > Canada L8S 4M4
> > Web: socserv.mcmaster.ca/jfox
> >
> >
> >
> >
> >> -Original Message-
> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> >> Federico Calboli
> >> Sent: February 12, 2016 10:13 AM
> >> To: R Help 
> >> Subject: [R] why is 9 after 10?
> >>
> >> Hi All,
> >>
> >> I have some data, one of the columns is a bunch of numbers from 6 to 41.
> >>
> >> table(my.data[,2])
> >>
> >> returns
> >>
> >>  10   11   12   13   14   15   16   17   18   19   20   21   22   23   24  
> >>  25   26   27   28
> 29
> >> 30   31   32   33   34   35   36   37
> >> 1761 1782 1897 1749 1907 1797 1734 1810 1913 1988 1914 1822 1951 1973
> >> 1951
> >> 1947 2067 1967 1812 2119 1999 2086 2133 2081 2165 2365 2330 2340
> >>  38   39   40   41    6    7    8    9
> >> 2681 2905 3399 3941 1648 1690 1727 1668
> >>
> >> whereas the reasonable expectation is that the numbers from 6 to 9
> >> would come before 10 to 41.
> >>
> >> How do I sort this incredibly silly behaviour so that my table
> >> follows a reasonable expectation that 9 comes before 10 (and so on and
> so forth)?
> >>
> >> BW
> >>
> >> F
> >>
> >> --
> >> Federico Calboli
> >> Ecological Genetics Research Unit
> >> Department of Biosciences
> >> PO Box 65 (Biocenter 3, Viikinkaari 1)
> >> FIN-00014 University of Helsinki
> >> Finland
> >>
> >> federico.calb...@helsinki.fi
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide http://www.R-project.org/posting-
> >> guide.html and provide commented, minimal, self-contained,
> >> reproducible code.
> 
> --
> Federico Calboli
> Ecological Genetics Research Unit
> Department of Biosciences
> PO Box 65 (Biocenter 3, Viikinkaari 1)
> FIN-00014 University of Helsinki
> Finland
> 
> federico.calb...@helsinki.fi

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Why two curves and numerical integration look so different?

2016-02-12 Thread Bert Gunter
You are working in fantasyland. Your density is nonsense.

Please see FAQ 7.31 for links to computer precision of numeric calculations.


Cheers,
Bert
Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Feb 12, 2016 at 7:36 AM, C W  wrote:
> Hi David,
>
> This is the Gaussian looking distribution I am trying to integrate.
> https://drive.google.com/file/d/0B2xN0-A6iTB4NThIZ2tYdGxHc00/view?usp=sharing
>
> > Notice the unnormalized density goes up as high as 2.5*10^191.
>
> I tried to create small intervals like
>> seq(0.5, 1.3, by = 10^(-8))
>
> but that doesn't seem to be good enough, as we know, it should integrate to
> 1.
>
> On Thu, Feb 11, 2016 at 3:32 PM, David Winsemius 
> wrote:
>
>>
>> > On Feb 11, 2016, at 11:30 AM, C W  wrote:
>> >
>> > Hi David,
>> >
>> > My real function is actually a multivariate normal, the simple toy 1-d
>> normal won't work.
>> >
>> > But, you gave me an idea about restricting the bounds, and focus
>> integrating on that.  I will get back to you if I need any further
>> assistance.
>>
>> You'll probably need to restrict your bounds even more severely than I did
>> in the 1-d case (using 10 SD's on either side of the mean) . You might need
>> adequate representation of points near the center of your hyper-rectangles.
>> At least that's my armchair notion since I expect the densities tail off
>> rapidly in the corners. You can shoehorn multivariate integration around
>> the `integrate` function but it's messy and inefficient. There are other
>> packages that would be better choices. There's an entire section on
>> numerical differentiation and integration in CRAN Task View: Numerical
>> Mathematics.
>>
>> --
>> David.
>>
>>
>> >
>> > Thank you so much!
>> >
>> > On Thu, Feb 11, 2016 at 2:06 PM, David Winsemius 
>> wrote:
>> >
>> > > On Feb 11, 2016, at 9:20 AM, C W  wrote:
>> > >
>> > > I want to do numerical integration w.r.t. mu: P(mu) × N(mu, 0.1)
>> > >
>> > > Because the variance is small, it results in density like: 7.978846e+94
>> > >
>> > > Is there any good suggestion for this?
>> >
>> > So what's the difficulty? It's rather like the Dirac function.
>> >
>> > > integrate( function(x) dnorm(x, sd=0.1), -.0001,0.0001)
>> > 1 with absolute error < 7.4e-05
>> >
>> >
>> > --
>> > David.
>> >
>> > >
>> > > Thanks so much!
>> > >
>> > >
>> > > On Thu, Feb 11, 2016 at 9:14 AM, C W  wrote:
>> > >
>> > >> Wow, thank you, that was very clear.  Let me give it some more runs
>> and
>> > >> investigate this.
>> > >>
>> > >> On Thu, Feb 11, 2016 at 12:31 AM, William Dunlap 
>> > >> wrote:
>> > >>
>> > >>> Most of the mass of that distribution is within 3e-100 of 2.
>> > >>> You have to be pretty lucky to have a point in sequence
>> > >>> land there.  (You will get at most one point there because
>> > >>> the difference between 2 and its nearest neightbors is on
>> > >>> the order of 1e-16.)
>> > >>>
>> > >>> seq(-2,4,len=101), as used by default in curve, does include 2
>> > >>> but seq(-3,4,len=101) and seq(-2,4,len=100) do not so
>> > >>> curve(..., -3, 4, 101) and curve(..., -2, 4, 100) will not show the
>> bump.
>> > >>> The same principle holds for numerical integration.
>> > >>>
>> > >>>
>> > >>> Bill Dunlap
>> > >>> TIBCO Software
>> > >>> wdunlap tibco.com
>> > >>>
>> > >>> On Wed, Feb 10, 2016 at 6:37 PM, C W  wrote:
>> > >>>
>> >  Dear R,
>> > 
>> >  I am graphing the following normal density curve.  Why does it look
>> so
>> >  different?
>> > 
>> >  # the curves
>> >  x <- seq(-2, 4, by=0.1)
>> >  curve(dnorm(x, 2, 10^(-100)), -4, 4)  #right answer
>> >  curve(dnorm(x, 2, 10^(-100)), -3, 4)  #changed -4 to -3, I get wrong
>> >  answer
>> > 
>> >  Why is the second curve flat?  I just changed it from -4 to -3.
>> There is
>> >  no density in that region.
>> > 
>> > 
>> >  Also, I am doing numerical integration.  Why are they so different?
>> > 
>> > > x <- seq(-2, 4, by=0.1)
>> > > sum(x*dnorm(x, 2, 10^(-100)))*0.1
>> >  [1] 7.978846e+94
>> > > x <- seq(-1, 4, by=0.1) #changed -2 to -1
>> > > sum(x*dnorm(x, 2, 10^(-100)))*0.1
>> >  [1] 0
>> > 
>> >  What is going on here?  What am I doing wrong?
>> > 
>> >  Thanks so much!
>> > 
>> > [[alternative HTML version deleted]]
>> > 
>> >  __
>> >  R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >  https://stat.ethz.ch/mailman/listinfo/r-help
>> >  PLEASE do read the posting guide
>> >  http://www.R-project.org/posting-guide.html
>> >  and provide commented, minimal, self-contained, reproducible 

Re: [R] Why two curves and numerical integration look so different?

2016-02-12 Thread peter dalgaard
I don't see where FAQ 7.31 comes in either (for once!...)

However, either the density is unnormalized and the integral is not 1, or the 
integral is 1 and it is normalized. The one in the picture clearly does not 
integrate to one. You can fit a rectangle of size 0.1 by 1e191 under the curve 
so the integral should be > 1e190 .

As depicted, I don't see why a plain integral from .5 to 1.5 shouldn't work. 
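
(A quick stand-in check with an assumed normalized density on the same scale
-- mean 1 and sd 0.1 are made-up values, not the poster's:)

integrate(function(x) dnorm(x, mean = 1, sd = 0.1), 0.5, 1.5)
# prints: 1 with absolute error < ... (the plain interval poses no problem)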

-pd

On 12 Feb 2016, at 16:57 , C W  wrote:

> Hi Bert,
> 
> Yay fantasyland!
> 
> In all seriousness, You are referring to this?
> https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
> 
> In particular, you mean this: .Machine$double.eps ^ 0.5?
> 
> Thanks!
> 
> On Fri, Feb 12, 2016 at 10:53 AM, Bert Gunter 
> wrote:
> 
>> You are working in fantasyland. Your density is nonsense.
>> 
>> Please see FAQ 7.31 for links to computer precision of numeric
>> calculations.
>> 
>> 
>> Cheers,
>> Bert
>> Bert Gunter
>> 
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>> 
>> 
>> On Fri, Feb 12, 2016 at 7:36 AM, C W  wrote:
>>> Hi David,
>>> 
>>> This is the Gaussian looking distribution I am trying to integrate.
>>> 
>> https://drive.google.com/file/d/0B2xN0-A6iTB4NThIZ2tYdGxHc00/view?usp=sharing
>>> 
>>> Notice the unnormalized density goes up as high as 2.5*10^191.
>>> 
>>> I tried to create small intervals like
 seq(0.5, 1.3, by = 10^(-8))
>>> 
>>> but that doesn't seem to be good enough, as we know, it should integrate
>> to
>>> 1.
>>> 
>>> On Thu, Feb 11, 2016 at 3:32 PM, David Winsemius >> 
>>> wrote:
>>> 
 
> On Feb 11, 2016, at 11:30 AM, C W  wrote:
> 
> Hi David,
> 
> My real function is actually a multivariate normal, the simple toy 1-d
 normal won't work.
> 
> But, you gave me an idea about restricting the bounds, and focus
 integrating on that.  I will get back to you if I need any further
 assistance.
 
 You'll probably need to restrict your bounds even more severely than I
>> did
 in the 1-d case (using 10 SD's on either side of the mean) . You might
>> need
 adequate representation of points near the center of your
>> hyper-rectangles.
 At least that's my armchair notion since I expect the densities tail off
 rapidly in the corners. You can shoehorn multivariate integration around
 the `integrate` function but it's messy and inefficient. There are other
 packages that would be better choices. There's an entire section on
 numerical differentiation and integration in CRAN Task View: Numerical
 Mathematics.
 
 --
 David.
 
 
> 
> Thank you so much!
> 
> On Thu, Feb 11, 2016 at 2:06 PM, David Winsemius <
>> dwinsem...@comcast.net>
 wrote:
> 
>> On Feb 11, 2016, at 9:20 AM, C W  wrote:
>> 
>> I want to do numerical integration w.r.t. mu: P(mu) × N(mu, 0.1)
>> 
>> Because the variance is small, it results in density like:
>> 7.978846e+94
>> 
>> Is there any good suggestion for this?
> 
> So what's the difficulty? It's rather like the Dirac function.
> 
>> integrate( function(x) dnorm(x, sd=0.1), -.0001,0.0001)
> 1 with absolute error < 7.4e-05
> 
> 
> --
> David.
> 
>> 
>> Thanks so much!
>> 
>> 
>> On Thu, Feb 11, 2016 at 9:14 AM, C W  wrote:
>> 
>>> Wow, thank you, that was very clear.  Let me give it some more runs
 and
>>> investigate this.
>>> 
>>> On Thu, Feb 11, 2016 at 12:31 AM, William Dunlap <
>> wdun...@tibco.com>
>>> wrote:
>>> 
 Most of the mass of that distribution is within 3e-100 of 2.
 You have to be pretty lucky to have a point in sequence
 land there.  (You will get at most one point there because
 the difference between 2 and its nearest neighbors is on
 the order of 1e-16.)
 
 seq(-2,4,len=101), as used by default in curve, does include 2
 but seq(-3,4,len=101) and seq(-2,4,len=100) do not so
 curve(..., -3, 4, 101) and curve(..., -2, 4, 100) will not show
>> the
 bump.
 The same principle holds for numerical integration.
 
 
 Bill Dunlap
 TIBCO Software
 wdunlap tibco.com
 
 On Wed, Feb 10, 2016 at 6:37 PM, C W  wrote:
 
> Dear R,
> 
> I am graphing the following normal density curve.  Why does it
>> look
 so
> different?
> 
> # the curves
> x <- seq(-2, 4, by=0.1)
> curve(dnorm(x, 2, 10^(-100)), -4, 4)  #right answer
> curve(dnorm(x, 2, 10^(-100)), -3, 4)  #changed -4 

Re: [R] Why two curves and numerical integration look so different?

2016-02-12 Thread C W
Hi Peter,

Great, let me try that and get back to you on my findings in a few hours!
 :)

On Fri, Feb 12, 2016 at 11:09 AM, peter dalgaard  wrote:

> I don't see where FAQ 7.31 comes in either (for once!...)
>
> However, either the density is unnormalized and the integral is not 1, or
> the integral is 1 and it is normalized. The one in the picture clearly does
> not integrate to one. You can fit a rectangle of size 0.1 by 1e191 under
> the curve so the integral should be > 1e190 .
>
> As depicted, I don't see why a plain integral from .5 to 1.5 shouldn't
> work.
>
> -pd
>
> On 12 Feb 2016, at 16:57 , C W  wrote:
>
> > Hi Bert,
> >
> > Yay fantasyland!
> >
> > In all seriousness, You are referring to this?
> >
> https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
> >
> > In particular, you mean this: .Machine$double.eps ^ 0.5?
> >
> > Thanks!
> >
> > On Fri, Feb 12, 2016 at 10:53 AM, Bert Gunter 
> > wrote:
> >
> >> You are working in fantasyland. Your density is nonsense.
> >>
> >> Please see FAQ 7.31 for links to computer precision of numeric
> >> calculations.
> >>
> >>
> >> Cheers,
> >> Bert
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Fri, Feb 12, 2016 at 7:36 AM, C W  wrote:
> >>> Hi David,
> >>>
> >>> This is the Gaussian looking distribution I am trying to integrate.
> >>>
> >>
> https://drive.google.com/file/d/0B2xN0-A6iTB4NThIZ2tYdGxHc00/view?usp=sharing
> >>>
> >>> Notice the unnormalized density goes up as high as 2.5*10^191.
> >>>
> >>> I tried to create small intervals like
>  seq(0.5, 1.3, by = 10^(-8))
> >>>
> >>> but that doesn't seem to be good enough, as we know, it should
> integrate
> >> to
> >>> 1.
> >>>
> >>> On Thu, Feb 11, 2016 at 3:32 PM, David Winsemius <dwinsem...@comcast.net> wrote:
> >>>
> 
> > On Feb 11, 2016, at 11:30 AM, C W  wrote:
> >
> > Hi David,
> >
> > My real function is actually a multivariate normal, the simple toy
> 1-d
>  normal won't work.
> >
> > But, you gave me an idea about restricting the bounds, and focus
>  integrating on that.  I will get back to you if I need any further
>  assistance.
> 
>  You'll probably need to restrict your bounds even more severely than I
>  did in the 1-d case (using 10 SD's on either side of the mean). You might
>  also need adequate representation of points near the center of your
>  hyper-rectangles. At least that's my armchair notion, since I expect the
>  densities to tail off rapidly in the corners. You can shoehorn multivariate
>  integration around the `integrate` function, but it's messy and inefficient.
>  There are other packages that would be better choices: there's an entire
>  section on numerical differentiation and integration in the CRAN Task View
>  "Numerical Mathematics".
> 
>  --
>  David.
> 
> 
> >
> > Thank you so much!
> >
> > On Thu, Feb 11, 2016 at 2:06 PM, David Winsemius <dwinsem...@comcast.net> wrote:
> >
> >> On Feb 11, 2016, at 9:20 AM, C W  wrote:
> >>
> >> I want to do numerical integration w.r.t. mu: P(mu) × N(mu, 0.1)
> >>
> >> Because the variance is small, it results in density like:
> >> 7.978846e+94
> >>
> >> Is there any good suggestion for this?
> >
> > So what's the difficulty? It's rather like the Dirac function.
> >
> >> integrate( function(x) dnorm(x, sd=0.1), -.0001,0.0001)
> > 1 with absolute error < 7.4e-05
> >
> >
> > --
> > David.
> >
> >>
> >> Thanks so much!
> >>
> >>
> >> On Thu, Feb 11, 2016 at 9:14 AM, C W  wrote:
> >>
> >>> Wow, thank you, that was very clear.  Let me give it some more runs
>  and
> >>> investigate this.
> >>>
> >>> On Thu, Feb 11, 2016 at 12:31 AM, William Dunlap <
> >> wdun...@tibco.com>
> >>> wrote:
> >>>
>  Most of the mass of that distribution is within 3e-100 of 2.
>  You have to be pretty lucky to have a point in the sequence
>  land there.  (You will get at most one point there because
>  the difference between 2 and its nearest neighbors is on
>  the order of 1e-16.)
> 
>  seq(-2,4,len=101), as used by default in curve, does include 2
>  but seq(-3,4,len=101) and seq(-2,4,len=100) do not, so
>  curve(..., -3, 4, 101) and curve(..., -2, 4, 100) will not show
>  the bump.
>  The same principle holds for numerical integration.
> 
> 
>  Bill Dunlap
>  TIBCO Software
>  wdunlap tibco.com
> 

Re: [R] Matrix summary

2016-02-12 Thread Jim Lemon
Hi Ashta,
Surely you are aware of the "apply" family of functions that return the
numbers you want:

ashmat<-matrix(c(117,12,13,21,21,32,11,1,65,43,23,7,58,61,78,95 ),
 nrow=4,byrow=TRUE)
apply(ashmat,2,mean)
[1] 65.25 37.00 31.25 31.00
apply(ashmat,1,which.max)
[1] 1 2 1 4
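
For the second part of the question, which() with arr.ind=TRUE gives
(row, column) coordinates directly -- a small sketch on the same toy
matrix:

which(ashmat == max(ashmat), arr.ind = TRUE)  # coordinates of overall max
which(ashmat == min(ashmat), arr.ind = TRUE)  # coordinates of overall min
cbind(row = seq_len(nrow(ashmat)),
      col = apply(ashmat, 1, which.max))      # column of each row's max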

Jim


On Sat, Feb 13, 2016 at 3:04 PM, Ashta  wrote:

> Hi all,
>
> I have a square matrix (1000 by 1000).
> 1. I want to calculate the mean, min and max values for each column and row.
>
> 2. I want to pick the coordinates of the matrix entry that has the max
> and min value for each row and column.
> Here is an example with a 4 by 4 square matrix:
>
>
>                                     Mean    Min   Max
>          117     12     13     21  40.75     12   117
>           21     32     11      1  16.25      1    32
>           65     43     23      7  34.50      7    65
>           58     61     78     95  73.00     58    95
> Mean   65.25  37.00  31.25  31.00
> Min       21     12     11      1
> Max      117     61     78     95
>
>
> Thank you
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Matrix summary

2016-02-12 Thread Bert Gunter
Yes, but colMeans, rowMeans, pmax, pmin, etc. are *much* faster.
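
A minimal sketch of that idea (one way, not the only one, to press
pmax/pmin into service), reusing Jim's 4 x 4 ashmat; for a 1000 x 1000
matrix these vectorised forms avoid calling a function once per
row/column the way apply() does:

rowMeans(ashmat)                         # row means
colMeans(ashmat)                         # column means
do.call(pmax, as.data.frame(ashmat))     # row maxima, pmax over columns
do.call(pmin, as.data.frame(ashmat))     # row minima
do.call(pmax, as.data.frame(t(ashmat)))  # column maxima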

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Fri, Feb 12, 2016 at 9:15 PM, Jim Lemon  wrote:
> Hi Ashta,
> Surely you are aware of the "apply" family of functions that return the
> numbers you want:
>
> ashmat<-matrix(c(117,12,13,21,21,32,11,1,65,43,23,7,58,61,78,95 ),
>  nrow=4,byrow=TRUE)
> apply(ashmat,2,mean)
> [1] 65.25 37.00 31.25 31.00
> apply(ashmat,1,which.max)
> [1] 1 2 1 4
>
> Jim
>
>
> On Sat, Feb 13, 2016 at 3:04 PM, Ashta  wrote:
>
>> Hi all,
>>
>> I have a square matrix (1000 by 1000).
>> 1. I want to calculate the mean, min and max values for each column and row.
>>
>> 2. I want to pick the coordinates of the matrix entry that has the max
>> and min value for each row and column.
>> Here is an example with a 4 by 4 square matrix:
>>
>>
>>                                     Mean    Min   Max
>>          117     12     13     21  40.75     12   117
>>           21     32     11      1  16.25      1    32
>>           65     43     23      7  34.50      7    65
>>           58     61     78     95  73.00     58    95
>> Mean   65.25  37.00  31.25  31.00
>> Min       21     12     11      1
>> Max      117     61     78     95
>>
>>
>> Thank you
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] why is 9 after 10?

2016-02-12 Thread peter dalgaard
It can also happen if you use colClasses, since that applies as.factor to the 
input column without first converting it to numeric. To wit:

> read.table(text="
+ 9
+ 10", colClasses="factor")$V1
[1] 9  10
Levels: 10 9
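
The usual escape hatch, for what it's worth: index the numeric levels by
the factor instead of calling as.numeric() on the factor itself, which
would return the internal codes:

f <- read.table(text="
9
10", colClasses="factor")$V1
as.numeric(levels(f))[f]   # 9 10 -- numeric again, in data order
as.numeric(f)              # 2 1  -- the internal codes, not the data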

-pd

> On 12 Feb 2016, at 22:43 , Jim Lemon  wrote:
> 
> It depends upon what goes into the "data reshaping pipeline". If there is a
> single non-numeric value in the data read in, it will alpha sort it upon
> conversion to a factor:
> 
> x<-factor(c(sample(6:37,1000,TRUE)," "))
> z<-factor(x)
> levels(z)
> [1] " "  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22"
> "23"
> [16] "24" "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37"
> "6"
> [31] "7"  "8"  "9"
> 
> Jim
> 
> 
> On Sat, Feb 13, 2016 at 2:41 AM, Fox, John  wrote:
> 
>> Dear Federico,
>> 
>>> -Original Message-
>>> From: Federico Calboli [mailto:federico.calb...@helsinki.fi]
>>> Sent: February 12, 2016 10:27 AM
>>> To: Fox, John 
>>> Cc: R Help 
>>> Subject: Re: [R] why is 9 after 10?
>>> 
>>> Dear John,
>>> 
>>> that is fortunately not the case; I just managed to figure out that the
>>> problem was that in the data reshaping pipeline the numeric column was
>>> transformed into a factor.
>> 
>> But that shouldn't have this effect, I think:
>> 
>>> z <- as.factor(x)
>>> table(z)
>> z
>> 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
>> 31 32 33 34 35 36 37
>> 29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23
>> 32 35 39 31 40 35 29
>> 
>>> levels(z)
>> [1] "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
>> "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31"
>> [27] "32" "33" "34" "35" "36" "37"
>> 
>> Best,
>> John
>> 
>>> 
>>> Many thanks for your time.
>>> 
>>> BW
>>> 
>>> F
>>> 
>>> 
>>> 
 On 12 Feb 2016, at 17:22, Fox, John  wrote:
 
 Dear Federico,
 
 Might my.data[, 2] contain character data, which therefore would be
>>> sorted in this manner? For example:
 
> x <- sample(6:37, 1000, replace=TRUE)
> table(x)
 x
 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
 30 31 32 33 34 35 36 37
 29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31
 34 23 32 35 39 31 40 35 29
> y <- as.character(x)
> table(y)
 y
 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
 33 34 35 36 37  6  7  8  9
 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23 32 35
 39 31 40 35 29 29 30 35 29
 
 I hope this helps,
 John
 
 -
 John Fox, Professor
 McMaster University
 Hamilton, Ontario
 Canada L8S 4M4
 Web: socserv.mcmaster.ca/jfox
 
 
 
 
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> Federico Calboli
> Sent: February 12, 2016 10:13 AM
> To: R Help 
> Subject: [R] why is 9 after 10?
> 
> Hi All,
> 
> I have some data, one of the columns is a bunch of numbers from 6 to
>> 41.
> 
> table(my.data[,2])
> 
> returns
> 
>   10   11   12   13   14   15   16   17   18   19   20   21   22   23
> 1761 1782 1897 1749 1907 1797 1734 1810 1913 1988 1914 1822 1951 1973
>   24   25   26   27   28   29   30   31   32   33   34   35   36   37
> 1951 1947 2067 1967 1812 2119 1999 2086 2133 2081 2165 2365 2330 2340
>   38   39   40   41    6    7    8    9
> 2681 2905 3399 3941 1648 1690 1727 1668
> 
> whereas the reasonable expectation is that the numbers from 6 to 9
> would come before 10 to 41.
> 
> How do I sort this incredibly silly behaviour so that my table
> follows a reasonable expectation that 9 comes before 10 (and so on and
>>> so forth)?
> 
> BW
> 
> F
> 
> --
> Federico Calboli
> Ecological Genetics Research Unit
> Department of Biosciences
> PO Box 65 (Biocenter 3, Viikinkaari 1)
> FIN-00014 University of Helsinki
> Finland
> 
> federico.calb...@helsinki.fi
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html and provide commented, minimal, self-contained,
> reproducible code.
>>> 
>>> --
>>> Federico Calboli
>>> Ecological Genetics Research Unit
>>> Department of Biosciences
>>> PO Box 65 (Biocenter 3, Viikinkaari 1)
>>> FIN-00014 University of Helsinki
>>> Finland
>>> 
>>> federico.calb...@helsinki.fi
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] Why two curves and numerical integration look so different?

2016-02-12 Thread peter dalgaard

> On 12 Feb 2016, at 17:44 , C W  wrote:
> 
> On a side note, is it ok to do?
> 
> > which(max(p_x)) 
> and use that instead of numerical integration to get E[X]?

Now THAT makes absolutely no sense. max() is a number, so which(max()) usually 
returns 1.

If you mean whether the mode is equal to the mean: Only if the distribution is 
symmetric and unimodal.
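
If the goal is E[X], the direct route is two plain integrals over the
region where the density lives -- a sketch, with a sharp normal standing
in for the real (unnormalised) density on [0.5, 1.5]:

f  <- function(x) 3 * dnorm(x, 1, 0.05)   # unnormalised stand-in density
Z  <- integrate(f, 0.5, 1.5)$value        # normalising constant
EX <- integrate(function(x) x * f(x), 0.5, 1.5)$value / Z
EX                                        # ~1, the mean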

-pd

> 
> I will try both and report back!  Thank you expeRts
> 
> On Fri, Feb 12, 2016 at 11:29 AM, C W  wrote:
> Hi Peter,
> 
> Great, let me try that and get back to you on my findings in a few hours!  :)
> 
> On Fri, Feb 12, 2016 at 11:09 AM, peter dalgaard  wrote:
> I don't see where FAQ 7.31 comes in either (for once!...)
> 
> However, either the density is unnormalized and the integral is not 1, or the 
> integral is 1 and it is normalized. The one in the picture clearly does not 
> integrate to one. You can fit a rectangle of size 0.1 by 1e191 under the 
> curve so the integral should be > 1e190 .
> 
> As depicted, I don't see why a plain integral from .5 to 1.5 shouldn't work.
> 
> -pd
> 
> On 12 Feb 2016, at 16:57 , C W  wrote:
> 
> > Hi Bert,
> >
> > Yay fantasyland!
> >
> > In all seriousness, you are referring to this?
> > https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
> >
> > In particular, you mean this: .Machine$double.eps ^ 0.5?
> >
> > Thanks!
> >
> > On Fri, Feb 12, 2016 at 10:53 AM, Bert Gunter 
> > wrote:
> >
> >> You are working in fantasyland. Your density is nonsense.
> >>
> >> Please see FAQ 7.31 for links to computer precision of numeric
> >> calculations.
> >>
> >>
> >> Cheers,
> >> Bert
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming along
> >> and sticking things into it."
> >> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>
> >>
> >> On Fri, Feb 12, 2016 at 7:36 AM, C W  wrote:
> >>> Hi David,
> >>>
> >>> This is the Gaussian looking distribution I am trying to integrate.
> >>>
> >> https://drive.google.com/file/d/0B2xN0-A6iTB4NThIZ2tYdGxHc00/view?usp=sharing
> >>>
> >>> Notice the unnormalized density goes up as high as 2.5*10^191.
> >>>
> >>> I tried to create small intervals like
>  seq(0.5, 1.3, by = 10^(-8))
> >>>
> >>> but that doesn't seem to be good enough, as we know, it should integrate
> >> to
> >>> 1.
> >>>
> >>> On Thu, Feb 11, 2016 at 3:32 PM, David Winsemius <dwinsem...@comcast.net> wrote:
> >>>
> 
> > On Feb 11, 2016, at 11:30 AM, C W  wrote:
> >
> > Hi David,
> >
> > My real function is actually a multivariate normal, the simple toy 1-d
>  normal won't work.
> >
> > But, you gave me an idea about restricting the bounds, and focus
>  integrating on that.  I will get back to you if I need any further
>  assistance.
> 
>  You'll probably need to restrict your bounds even more severely than I
>  did in the 1-d case (using 10 SD's on either side of the mean). You might
>  also need adequate representation of points near the center of your
>  hyper-rectangles. At least that's my armchair notion, since I expect the
>  densities to tail off rapidly in the corners. You can shoehorn multivariate
>  integration around the `integrate` function, but it's messy and inefficient.
>  There are other packages that would be better choices: there's an entire
>  section on numerical differentiation and integration in the CRAN Task View
>  "Numerical Mathematics".
> 
>  --
>  David.
> 
> 
> >
> > Thank you so much!
> >
> > On Thu, Feb 11, 2016 at 2:06 PM, David Winsemius <dwinsem...@comcast.net> wrote:
> >
> >> On Feb 11, 2016, at 9:20 AM, C W  wrote:
> >>
> >> I want to do numerical integration w.r.t. mu: P(mu) × N(mu, 0.1)
> >>
> >> Because the variance is small, it results in density like:
> >> 7.978846e+94
> >>
> >> Is there any good suggestion for this?
> >
> > So what's the difficulty? It's rather like the Dirac function.
> >
> >> integrate( function(x) dnorm(x, sd=0.1), -.0001,0.0001)
> > 1 with absolute error < 7.4e-05
> >
> >
> > --
> > David.
> >
> >>
> >> Thanks so much!
> >>
> >>
> >> On Thu, Feb 11, 2016 at 9:14 AM, C W  wrote:
> >>
> >>> Wow, thank you, that was very clear.  Let me give it some more runs
>  and
> >>> investigate this.
> >>>
> >>> On Thu, Feb 11, 2016 at 12:31 AM, William Dunlap <
> >> wdun...@tibco.com>
> >>> wrote:
> >>>
>  Most of the mass of that distribution is within 3e-100 of 2.
>  You have to be pretty lucky to have a point in the sequence
>  land there.  (You will get at most one point there because
> 

Re: [R] why is 9 after 10?

2016-02-12 Thread Jim Lemon
It depends upon what goes into the "data reshaping pipeline". If there is a
single non-numeric value in the data read in, it will alpha sort it upon
conversion to a factor:

x<-factor(c(sample(6:37,1000,TRUE)," "))
z<-factor(x)
levels(z)
 [1] " "  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "20" "21" "22"
"23"
[16] "24" "25" "26" "27" "28" "29" "30" "31" "32" "33" "34" "35" "36" "37"
"6"
[31] "7"  "8"  "9"
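
A quick way to find the culprit(s), sketched on the same toy data:
coerce to character, try to parse as numeric, and see what fails:

v <- as.character(x)
bad <- is.na(suppressWarnings(as.numeric(v)))
unique(v[bad])   # here just " ", the stray blank that forced the alpha sort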

Jim


On Sat, Feb 13, 2016 at 2:41 AM, Fox, John  wrote:

> Dear Federico,
>
> > -Original Message-
> > From: Federico Calboli [mailto:federico.calb...@helsinki.fi]
> > Sent: February 12, 2016 10:27 AM
> > To: Fox, John 
> > Cc: R Help 
> > Subject: Re: [R] why is 9 after 10?
> >
> > Dear John,
> >
> > that is fortunately not the case; I just managed to figure out that the
> > problem was that in the data reshaping pipeline the numeric column was
> > transformed into a factor.
>
> But that shouldn't have this effect, I think:
>
> > z <- as.factor(x)
> > table(z)
> z
>  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30
> 31 32 33 34 35 36 37
> 29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23
> 32 35 39 31 40 35 29
>
> > levels(z)
>  [1] "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19"
> "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30" "31"
> [27] "32" "33" "34" "35" "36" "37"
>
> Best,
>  John
>
> >
> > Many thanks for your time.
> >
> > BW
> >
> > F
> >
> >
> >
> > > On 12 Feb 2016, at 17:22, Fox, John  wrote:
> > >
> > > Dear Federico,
> > >
> > > Might my.data[, 2] contain character data, which therefore would be
> > sorted in this manner? For example:
> > >
> > >> x <- sample(6:37, 1000, replace=TRUE)
> > >> table(x)
> > > x
> > > 6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29
> > > 30 31 32 33 34 35 36 37
> > > 29 30 35 29 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31
> > > 34 23 32 35 39 31 40 35 29
> > >> y <- as.character(x)
> > >> table(y)
> > > y
> > > 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
> > > 33 34 35 36 37  6  7  8  9
> > > 41 33 27 21 38 36 34 35 31 29 27 26 28 22 21 34 32 33 31 34 23 32 35
> > > 39 31 40 35 29 29 30 35 29
> > >
> > > I hope this helps,
> > > John
> > >
> > > -
> > > John Fox, Professor
> > > McMaster University
> > > Hamilton, Ontario
> > > Canada L8S 4M4
> > > Web: socserv.mcmaster.ca/jfox
> > >
> > >
> > >
> > >
> > >> -Original Message-
> > >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> > >> Federico Calboli
> > >> Sent: February 12, 2016 10:13 AM
> > >> To: R Help 
> > >> Subject: [R] why is 9 after 10?
> > >>
> > >> Hi All,
> > >>
> > >> I have some data, one of the columns is a bunch of numbers from 6 to
> 41.
> > >>
> > >> table(my.data[,2])
> > >>
> > >> returns
> > >>
> > >>   10   11   12   13   14   15   16   17   18   19   20   21   22   23
> > >> 1761 1782 1897 1749 1907 1797 1734 1810 1913 1988 1914 1822 1951 1973
> > >>   24   25   26   27   28   29   30   31   32   33   34   35   36   37
> > >> 1951 1947 2067 1967 1812 2119 1999 2086 2133 2081 2165 2365 2330 2340
> > >>   38   39   40   41    6    7    8    9
> > >> 2681 2905 3399 3941 1648 1690 1727 1668
> > >>
> > >> whereas the reasonable expectation is that the numbers from 6 to 9
> > >> would come before 10 to 41.
> > >>
> > >> How do I sort this incredibly silly behaviour so that my table
> > >> follows a reasonable expectation that 9 comes before 10 (and so on and
> > so forth)?
> > >>
> > >> BW
> > >>
> > >> F
> > >>
> > >> --
> > >> Federico Calboli
> > >> Ecological Genetics Research Unit
> > >> Department of Biosciences
> > >> PO Box 65 (Biocenter 3, Viikinkaari 1)
> > >> FIN-00014 University of Helsinki
> > >> Finland
> > >>
> > >> federico.calb...@helsinki.fi
> > >>
> > >> __
> > >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide http://www.R-project.org/posting-
> > >> guide.html and provide commented, minimal, self-contained,
> > >> reproducible code.
> >
> > --
> > Federico Calboli
> > Ecological Genetics Research Unit
> > Department of Biosciences
> > PO Box 65 (Biocenter 3, Viikinkaari 1)
> > FIN-00014 University of Helsinki
> > Finland
> >
> > federico.calb...@helsinki.fi
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] Separating point symbols and line types in a legend.

2016-02-12 Thread Greg Snow
One option is to call `legend` twice and do some manual positioning.
This worked for me:

plot(1:10)
legend('topleft', lty=1:3, bty="n", legend=c('','','') )
legend('topleft', pch=c(20,8,1), bty="n",
legend=c('clyde','irving','melvin'), inset=c(0.1,0))

You may need to fiddle with the amount of inset for your particular
plot and device combination.
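
If you'd rather not eyeball the inset, note that legend() invisibly
returns its geometry, so something along these lines (an untested
sketch) can derive the offset from the first legend's bounding box:

plot(1:10)
lg <- legend('topleft', lty=1:3, bty="n", legend=c('','','') )
legend(lg$rect$left + lg$rect$w, lg$rect$top, pch=c(20,8,1), bty="n",
       legend=c('clyde','irving','melvin'), xjust=0, yjust=1)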



On Thu, Feb 11, 2016 at 9:52 PM, Rolf Turner  wrote:
>
> I would like to have a legend given in the manner
>
> legend("topleft",pch=c(20,8,1),lty=1:3,bty="n",
>legend=c("clyde","irving","melvin"))
>
> but with the point symbol *NOT* being superimposed on the line segments that
> are plotted.
>
> I saw that I can specify "merge=FALSE" in the call to legend() but this
> gives results like unto
>
>* irving
>
> with the plot symbol being immediately juxtaposed to the plotted line
> segment.  I would like a space between them, like so:
>
> * irving
>
> (See the difference?)
>
> I can see no arguments to legend that allow me to effect this.  I can adjust
> positioning of the legend text, but not of the plotted point character or
> line segment.  Is there any way to effect the desired result?  Or is there a
> "simple" adjustment that one could make to the code for legend() that would
> allow me to accomplish what I want?
>
> Ta.
>
> cheers,
>
> Rolf Turner
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Gregory (Greg) L. Snow Ph.D.
538...@gmail.com

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Separating point symbols and line types in a legend.

2016-02-12 Thread Rolf Turner


Thanks Greg.  Worked perfectly!!!

cheers,

Rolf

On 13/02/16 09:22, Greg Snow wrote:

One option is to call `legend` twice and do some manual positioning.
This worked for me:

plot(1:10)
legend('topleft', lty=1:3, bty="n", legend=c('','','') )
legend('topleft', pch=c(20,8,1), bty="n",
legend=c('clyde','irving','melvin'), inset=c(0.1,0))

You may need to fiddle with the amount of inset for your particular
plot and device combination.



On Thu, Feb 11, 2016 at 9:52 PM, Rolf Turner  wrote:


I would like to have a legend given in the manner

legend("topleft",pch=c(20,8,1),lty=1:3,bty="n",
legend=c("clyde","irving","melvin"))

but with the point symbol *NOT* being superimposed on the line segments that
are plotted.

I saw that I can specify "merge=FALSE" in the call to legend() but this
gives results like unto

* irving

with the plot symbol being immediately juxtaposed to the plotted line
segment.  I would like a space between them, like so:

 * irving

(See the difference?)

I can see no arguments to legend that allow me to effect this.  I can adjust
positioning of the legend text, but not of the plotted point character or
line segment.  Is there any way to effect the desired result?  Or is there a
"simple" adjustment that one could make to the code for legend() that would
allow me to accomplish what I want?

Ta.


--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Matrix summary

2016-02-12 Thread Ashta
Hi all,

I have a square matrix (1000 by 1000).
1. I want to calculate the mean, min and max values for each column and row.

2. I want to pick the coordinates of the matrix entry that has the max
and min value for each row and column.
Here is an example with a 4 by 4 square matrix:


                                    Mean    Min   Max
         117     12     13     21  40.75     12   117
          21     32     11      1  16.25      1    32
          65     43     23      7  34.50      7    65
          58     61     78     95  73.00     58    95
Mean   65.25  37.00  31.25  31.00
Min       21     12     11      1
Max      117     61     78     95


Thank you

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.