Re: [R] WGCNA cluster

2015-11-18 Thread Peter Langfelder
Hi Giovanni,

Please follow Tutorial I, section 3 (particularly 3d, "Summary output
of network analysis results") at
http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/Tutorials/index.html
It shows how to write the module membership of each CpG to a file. If
you want to assign different colors, you will have to do it yourself:
first choose a set of colors (you will need at least 26), then write
some simple R code to swap the colors.

Best,

Peter
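
The recoloring and the membership listing described above can be sketched in base R. This is a minimal sketch, not the tutorial's code: `module_colors` and `probe_ids` are hypothetical stand-ins for the per-CpG color labels and CpG names from the actual analysis, and `rainbow()` is only one way to get as many distinct colors as there are modules.

```r
# Hypothetical stand-ins for the network output: one color label per CpG.
set.seed(1)
probe_ids     <- paste0("cg", sprintf("%05d", 1:200))
module_colors <- sample(c("turquoise", "blue", "brown", "grey"), 200, replace = TRUE)

# Build a lookup table: one new color for every existing module color.
old_levels  <- sort(unique(module_colors))
new_palette <- setNames(rainbow(length(old_levels)), old_levels)

# Recolor every CpG by looking up the new color of its module.
new_colors <- unname(new_palette[module_colors])

# List the CpGs that belong to each module.
members <- split(probe_ids, module_colors)

# Write one row per CpG with its module label and its new color.
out_file <- tempfile(fileext = ".csv")
write.csv(data.frame(probe = probe_ids, module = module_colors, color = new_colors),
          out_file, row.names = FALSE)
```

With 26 modules the same lookup works unchanged; only the palette needs 26 entries, for example a hand-picked character vector of color names in place of `rainbow()`.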

On Wed, Nov 18, 2015 at 3:31 AM, Calice Giovanni
 wrote:
> Hi all,
>
> I am using WGCNA for methylation-level network construction and module
> detection.
> My network has 26 modules, each with an assigned color and a numeric
> label (id).
>
> What is the best way to reassign a different color to each module?
>
> How can I find out which elements belong to each colored module in the cluster?
>
>
> Thanks in advance, Regards
>
> Giovanni
>
>
> Laboratory of Preclinical and Translational Research
> IRCCS - CROB Oncology Referral Center of Basilicata - Italy

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] WGCNA cluster

2015-11-18 Thread Giovanni Calice
Hi all,

I am using WGCNA for methylation-level network construction and module
detection.
My network has 26 modules, each with an assigned color and a numeric
label (id).

What is the best way to reassign a different color to each module?

How can I find out which elements belong to each colored module in the cluster?


Thanks in advance, Regards

Giovanni


Laboratory of Preclinical and Translational Research
IRCCS - CROB Oncology Referral Center of Basilicata - Italy



Re: [R] SWEAVE - a gentle introduction

2015-11-18 Thread Greg Snow
John,

One additional point that I have not seen brought up yet.  If your
main goal is to have all the output from an existing R script put into
a single output file then you should look at the `stitch` function in
the knitr package.  This will take an existing R script and convert it
to one of the formats that knitr can process, then processes it for
you without you needing to modify the script or learn any of the
markup languages (LaTeX, HTML, or other).  You do not have a lot of control
over how the output looks, but it is quick and easy.

For the long run I would suggest learning to use the full power of
knitr, but stitch (and the related spin function which gives a few
more options) is a quick way to process an existing script.
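
As a rough illustration of the two functions mentioned above, the sketch below wraps `knitr::spin()` so that it only converts script lines to R Markdown source without running them; the helper name `script_to_rmd` and the file name `analysis.R` are placeholders, not anything from the original script.

```r
# Hypothetical helper: convert R script lines to R Markdown source without running them.
# knitr::spin() treats lines starting with #' as prose and everything else as code.
script_to_rmd <- function(script_lines) {
  knitr::spin(text = script_lines, knit = FALSE, format = "Rmd")
}

# Usage on a real file (not run here; "analysis.R" is a placeholder):
# knitr::stitch("analysis.R")      # default LaTeX template; the PDF needs a LaTeX installation
# knitr::stitch_rmd("analysis.R")  # same idea with an R Markdown template
# knitr::spin("analysis.R")        # convert and knit, honouring #' comment lines
```

`stitch()` needs no changes to the script at all, while `spin()` lets you add prose through `#'` comments, which is the extra control referred to above.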

On Tue, Nov 17, 2015 at 8:21 AM, John Sorkin
 wrote:
> I am looking for a gentle introduction to SWEAVE, and would appreciate 
> recommendations.
> I have an R program that I want to run and have the output and plots in one 
> document. I believe this can be accomplished with SWEAVE. Unfortunately I 
> don't know HTML, but am willing to learn. . . as I said I need a gentle 
> introduction to SWEAVE.
> Thank you,
> John
>
>
>
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and 
> Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)



[R] Glmnet penalty.factor with multigaussian response

2015-11-18 Thread Maria Vila Casadesús
Hi all,
I'm trying to use glmnet with penalty factors in a multigaussian response
model.

With a gaussian response the penalty factors are supplied as a vector,
but I haven't figured out how to use penalty factors with a
multigaussian response, or even whether it is possible. I've tried passing a
matrix of values; it gives no error or warning, but it seems to
use only part of the data: the first column.

Do you know if it's possible to use penalty factors in this case? Or are
there any other alternatives?

So far, I've tried this:

cv<-cv.glmnet(x,y,alpha=.5,family = "mgaussian", penalty.factor=pen.matrix.tot)

Where:

> dim(x)
[1]  40 723
> dim(y)
[1] 40  7

Penalty matrix looks like:

> pen.matrix.tot[1:5,1:5]
            NAT2 CYP1A2 CYP2A6 CYP2A7 CYP2A13
hsa-let-7a     1      0      1      1       1
hsa-let-7a*    1      1      1      1       1
hsa-let-7b     0      0      1      1       1
hsa-let-7b*    1      1      1      1       1
hsa-let-7c     0      0      1      1       1
> dim(pen.matrix.tot)
[1] 723   7

The coefficients for lambda.min:

> fullcoefs[1:5,1:5]
                   NAT2      CYP1A2       CYP2A6     CYP2A7     CYP2A13
hsa-let-7a           NA          NA           NA         NA          NA
hsa-let-7a*          NA          NA           NA         NA          NA
hsa-let-7b   0.10222788  0.06621902  0.064668084  0.3887164  0.06369455
hsa-let-7b*          NA          NA           NA         NA          NA
hsa-let-7c  -0.06436899 -0.03362183 -0.007440406 -0.2581606 -0.01517728
> dim(fullcoefs)
[1] 723   7


More specifically, in this case I would expect a value for
"hsa-let-7a":"CYP1A2", as its penalty is 0.

Many thanks!

Maria
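
For reference, glmnet documents `penalty.factor` as one value per predictor, and with `family = "mgaussian"` the coefficients of each predictor are penalized as a group across all responses, so a vector of length `ncol(x)` is what the multiresponse fit expects. If per-response penalties are the goal, one possible workaround (offered here as an assumption, not as a feature of the package) is to fit a separate gaussian model for each response column, at the cost of losing the joint grouping. `fit_per_response` is a hypothetical helper written for this sketch.

```r
# Hypothetical helper: one cv.glmnet() fit per response column, each using the
# matching column of pen (an ncol(x) by ncol(y) matrix) as its penalty factors.
fit_per_response <- function(x, y, pen, alpha = 0.5, ...) {
  stopifnot(nrow(pen) == ncol(x), ncol(pen) == ncol(y))
  fits <- lapply(seq_len(ncol(y)), function(k) {
    glmnet::cv.glmnet(x, y[, k], alpha = alpha, family = "gaussian",
                      penalty.factor = pen[, k], ...)
  })
  names(fits) <- colnames(y)
  fits
}

# Usage with the objects from the post (not run here):
# fits <- fit_per_response(x, y, pen.matrix.tot)
# coef(fits[["CYP1A2"]], s = "lambda.min")
```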



Re: [R] SWEAVE - a gentle introduction

2015-11-18 Thread John Maindonald
What’s more, for pdf output one can use R Markdown and judiciously
sneak in html and/or LaTeX (consider however what the processing
steps might do to such markup).


John Maindonald email: 
john.maindon...@anu.edu.au


On 19/11/2015, at 00:00, 
r-help-requ...@r-project.org wrote:

From: Duncan Murdoch
Subject: Re: [R] SWEAVE - a gentle introduction
Date: 18 November 2015 08:09:34 NZDT
To: Marc Schwartz, John Sorkin
Cc: R-help


On 17/11/2015 10:42 AM, Marc Schwartz wrote:

On Nov 17, 2015, at 9:21 AM, John Sorkin wrote:

I am looking for a gentle introduction to SWEAVE, and would appreciate 
recommendations.
I have an R program that I want to run and have the output and plots in one 
document. I believe this can be accomplished with SWEAVE. Unfortunately I don't 
know HTML, but am willing to learn. . . as I said I need a gentle introduction 
to SWEAVE.
Thank you,
John



John,

A couple of initial comments.

First, you will likely get some recommendations to also consider using Knitr:

  http://yihui.name/knitr/

which I do not use myself (I use Sweave), but to be fair, is worth considering 
as an alternative.

He did, and I'd agree with them.  I've switched to knitr for all new projects 
and some old ones.  knitr should be thought of as Sweave version 2.

Duncan Murdoch



[R] Problem running mvrnorm

2015-11-18 Thread Gang Chen
 I’m running R 3.2.2 on a Linux server (Redhat 4.4.7-16), and having the
following problem.

It works fine with the following:

require('MASS')
var(mvrnorm(n = 1000, rep(0, 2), Sigma=matrix(c(10,3,3,2),2,2)))

However, when running the following in a loop with simulated data (Sigma):

# Sigma defined somewhere else
mvrnorm(n=1000, rep(0, 190), Sigma)

I get this opaque message:

 *** caught illegal operation ***
address 0x7fe78f8693d2, cause 'illegal operand'

Traceback:
 1: eigen(Sigma, symmetric = TRUE)
 2: mvrnorm(n = nr, rep(0, NN), Sigma)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

I tried the core dump (option 1), but it went nowhere (it hung
indefinitely). I also ran the same code on a Mac, and there was no problem
at all. What is causing the problem on the Linux server? If the
variance-covariance matrix ‘Sigma’ is needed, I can provide its definition
later.

Thanks,
Gang


Re: [R] [FORGED] Problem running mvrnorm

2015-11-18 Thread Rolf Turner


I cobbled together a 190 x 190 positive definite matrix Sigma and ran 
your example.  I got a result instantaneously, with no error message. 
(I'm running Linux; an ancient Fedora 17 system.)


So the problem is peculiar to your particular Sigma.

As the error message tells you, the problem comes from doing an 
eigendecomposition of Sigma.  So start your investigation by doing


E <- eigen(Sigma,symmetric=TRUE)

Presumably that will lead to the same error.  How to get around this 
error is beyond the scope of my capabilities.


You *might* get somewhere by using the singular value decomposition
(equivalent for a positive definite matrix) rather than the 
eigendecomposition.  I have the vague notion that the svd is more 
numerically robust than eigendecomposition.  However I might well have 
that wrong.
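
The checks suggested above can be sketched as follows. The matrix here is a hypothetical stand-in built to be positive definite, since the poster's Sigma is not available; the sampling step at the end is one way to draw from N(0, Sigma) through the SVD rather than through `mvrnorm()`, and is not claimed to avoid the crash on the affected server.

```r
# Hypothetical stand-in for the poster's Sigma: a 190 x 190 positive definite matrix.
set.seed(1)
A     <- matrix(rnorm(400 * 190), 400, 190)
Sigma <- crossprod(A) / 400

# Sanity checks worth running on the real Sigma before any decomposition.
stopifnot(isSymmetric(Sigma), all(is.finite(Sigma)))

# The call that appears in the traceback.
E <- eigen(Sigma, symmetric = TRUE)

# The SVD for comparison; for a symmetric positive definite matrix the
# singular values coincide with the eigenvalues.
S     <- svd(Sigma)
agree <- isTRUE(all.equal(S$d, E$values))

# Draw 1000 samples from the SVD factors:
# X = Z %*% diag(sqrt(d)) %*% t(U) has covariance U diag(d) t(U) = Sigma.
Z <- matrix(rnorm(1000 * 190), 1000, 190)
X <- Z %*% (sqrt(S$d) * t(S$u))
```

If `eigen()` alone reproduces the crash on the real Sigma while `svd()` does not, that narrows the problem to the eigenvalue routine on that machine rather than to `mvrnorm()` itself.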


Doing anything in 190 dimensions is bound to be fraught with numeric
peril.

cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276

On 19/11/15 08:28, Gang Chen wrote:

  I’m running R 3.2.2 on a Linux server (Redhat 4.4.7-16), and having the
following problem.

It works fine with the following:

require('MASS')
var(mvrnorm(n = 1000, rep(0, 2), Sigma=matrix(c(10,3,3,2),2,2)))

However, when running the following in a loop with simulated data (Sigma):

# Sigma defined somewhere else
mvrnorm(n=1000, rep(0, 190), Sigma)

I get this opaque message:

  *** caught illegal operation ***
address 0x7fe78f8693d2, cause 'illegal operand'

Traceback:
  1: eigen(Sigma, symmetric = TRUE)
  2: mvrnorm(n = nr, rep(0, NN), Sigma)

Possible actions:
1: abort (with core dump, if enabled)
2: normal R exit
3: exit R without saving workspace
4: exit R saving workspace

I tried to do core dump (option 1), but it didn’t go anywhere (hanging
there forever). I also ran the same code on a Mac, and there was no problem
at all. What is causing the problem on the Linux server? In case the
variance-covariance matrix ‘Sigma’ is needed, I can provide its definition
later.



Re: [R-es] Borrar cada fila 400

2015-11-18 Thread Javier Rubén Marcuzzi
Dear Jesús Para Fernández,

The example you present is fairly basic, but that does not make it 
unimportant; several package authors have tackled this problem, aiming to 
optimize it, make it easier, and so on.

What you are thinking is correct. If you want to optimize, you can read up on 
plyr, data.table, reshape (this one has a video explaining almost exactly what 
you describe), and some other options. Do not worry about finding the most 
efficient approach; use what you understand and can write, because it has 
happened to me many times that, chasing the most efficient solution, I failed 
to account for the time invested or for how easy my code would be to maintain. 
Naturally, in the long run one ends up handling several alternatives for the 
same task, but given what you are asking, my advice is: did you manage to 
solve it on your own? If so, that is the best alternative for you. In a few 
days you can look for another option, but do not mix everything together in 
your head. Let your brain rest on this problem. Some time ago I thought about 
writing up, in book form, several ways of doing the same thing in R, things 
that are not "the science or the great book of R", simply what I have learned 
over time, but the mental fatigue of that task is enormous. So I do not 
recommend hunting for an optimal alternative to this problem; if you find a 
solution, rest, and come back to R in a few days looking to optimize.

Among your ideas is splitting the data.frame; a quick way is, for example, 
Edad[Edad>1 & Edad<3] (those whose age is greater than 1 and less than 3, 
that is, two years old).

That is, if I have not misunderstood, which is also possible.

Javier Rubén Marcuzzi
Dairy Industry Technician
Veterinarian



From: Jesús Para Fernández
Sent: Wednesday, 18 November 2015 5:02
To: Carlos Ortega
CC: r-help-es@r-project.org
Subject: Re: [R-es] Borrar cada fila 400


Hi Carlos, 

What I want to do is compute the mean of different regions that follow a 
repeating pattern. I was a bit slow yesterday; let me see if I can explain 
myself better now. 

I have a data frame built by joining several csv files. It has n rows 
and x columns, to which I have added, thanks to your advice, a column 
"nueva" that indicates which csv each row belongs to. Altogether I have one 
data frame, which I will call "datos". 

The layout therefore looks like this:

x1   x2   x3   ...   xn   nueva
21   23   32   ...   12   1
27   21   39   ...   14   1
24   22   30   ...   11   1
..
21   24   32   ...   19   2
27   21   39   ...   14   2
..
27   22   30   ...   11   n

So I want to store in a matrix the means of each of the 
regions, which would be, for example:
region 1,1 (rows 1:20, columns 1:20) for each distinct value of nueva
region 2,1 (rows 21:40, columns 1:20) for each distinct value of nueva
region 1,2 (rows 1:20, columns 21:40)...

and so on for all of them. 

It just occurred to me that I can predefine the zones and create this:

zona1<-tapply(as.matrix(datos[1:20,1]),datos$nueva,mean,na.rm=T)
zona2<-tapply(as.matrix(datos[1:20,1]),datos$nueva,mean,na.rm=T)

and then combine everything into a single data frame
datosmedias<-data.frame(zona1,zona2,)

I am sure there is a more efficient way to do this. 

Thanks for everything,
Jesús


Date: Tue, 17 Nov 2015 20:33:49 +0100
Subject: Re: [R-es] Borrar cada fila 400
From: c...@qualityexcellence.es
To: j.para.fernan...@hotmail.com
CC: r-help-es@r-project.org

What do you mean by "the values I want"?
Do you want to compute the mean of certain regions of the matrix following 
some kind of pattern? Periodicity?
If not, it is enough, as I showed you in the example, to define some indices 
(your 1 and 20) and that's it...

Regards,
Carlos Ortega
www.qualityexcellence.es

On 17 November 2015 at 19:29, Jesús Para Fernández wrote:



OK, with as.matrix I get it. 

It is a matter of calling mean(as.matrix(datos[1:20,1:20])) and I get the result. 

Now the nice part would be doing it for whichever values I want, without using 
a for loop, but with apply or tapply...

> From: j.para.fernan...@hotmail.com
> To: c...@qualityexcellence.es
> Date: Tue, 17 Nov 2015 19:17:30 +0100
> CC: r-help-es@r-project.org
> Subject: Re: [R-es] Borrar cada fila 400
> 
> Thanks once again, Carlos, but that is not exactly what I want.
> 
> With colMeans you are computing by columns, but I want it to compute like this:
> 
> mean(datos[1:20,1:20]), but of course for the whole sequence.
> 
> 
> mean(datos[1:20,1:20]) gives me the error -> Error in datos[1:2, 1:2] : 
> object of type 'closure' is not subsettable
> 
> 
> Date: Tue, 17 Nov 2015 18:34:59 +0100
> Subject: Re: [R-es] Borrar cada fila 400
> From: c...@qualityexcellence.es
> To: j.para.fernan...@hotmail.com
> CC: c...@datanalytics.com; r-help-es@r-project.org
> 
> Hi,
> 
> This is one way.
> You specify with some indices the 

[R] Getting rid of unwanted csv files

2015-11-18 Thread WRAY NICHOLAS
Hi, I have a large folder with hundreds of csv files in it. The problem is
that some of them are junk, and I know which ones are junk because they were
created on certain days, that is, before I had honed the programme generating
them to ultimate perfection.

I'd like to get shot of the junk files, but I can't distinguish them by
name/label from the ones I want to keep. The only criterion is the date they
were made, but I cannot see a way of telling R to look at the date rather than
the name/label. Obviously I could plough through by hand, but there are loads,
and if anyone has any ideas I'd be grateful.

Thanks, Nick Wray

PS Just had a thought: going off R-piste now, but is it possible to get
rid of them within Windows (which I'm using) rather than R itself?


Re: [R-es] Borrar cada fila 400

2015-11-18 Thread Jesús Para Fernández
Thanks for the input, 

I will look into the libraries you mention. 

Regards,
Jesús

To: j.para.fernan...@hotmail.com; c...@qualityexcellence.es
CC: r-help-es@r-project.org
From: javier.ruben.marcu...@gmail.com
Subject: RE: [R-es] Borrar cada fila 400
Date: Wed, 18 Nov 2015 07:57:11 -0300


Re: [R] Get means of matrix

2015-11-18 Thread Jim Lemon
Hi Jesus,
While I do not have your data and cannot test this, the problem may be that
you are using two different names for the data frame. Is this more or less
what you want?

datos<-data.frame(x1=sample(20:40,40,TRUE),x2=sample(20:40,40,TRUE),
 x3=sample(20:40,40,TRUE),csvdata=rep(1:2,each=20))
sapply(datos[datos$csvdata==1,],mean,na.rm=TRUE)
sapply(datos[datos$csvdata==2,],mean,na.rm=TRUE)
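
To get one mean per rectangular region and per csv, as asked in the quoted message, the same idea can be extended by splitting on `csvdata` first. This is a minimal sketch on simulated data; the sizes (2 csv files of 40 rows and 40 columns) and the helper `block_mean` are assumptions made for illustration.

```r
# Simulated stand-in: 2 csv files of 40 rows x 40 columns stacked in one data frame.
set.seed(42)
n_csv <- 2; rows_per_csv <- 40; n_cols <- 40
datos <- as.data.frame(matrix(sample(20:40, n_csv * rows_per_csv * n_cols, TRUE),
                              ncol = n_cols))
datos$csvdata <- rep(seq_len(n_csv), each = rows_per_csv)

# Hypothetical helper: mean of a rectangular block of one csv's data.
block_mean <- function(d, rows, cols) mean(as.matrix(d[rows, cols]), na.rm = TRUE)

# Split the value columns by csv, then take the same block inside each piece.
per_csv    <- split(datos[seq_len(n_cols)], datos$csvdata)
region_1_1 <- sapply(per_csv, block_mean, rows = 1:20,  cols = 1:20)
region_2_1 <- sapply(per_csv, block_mean, rows = 21:40, cols = 1:20)

# One row per csv, one column per region.
medias <- cbind(region_1_1, region_2_1)
```

Row indices are relative to each csv after the split, which avoids the length mismatch that `tapply()` reports when the data and the grouping vector cover different numbers of rows.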

Jim


On Wed, Nov 18, 2015 at 9:19 PM, Jesús Para Fernández <
j.para.fernan...@hotmail.com> wrote:

> Hi everyone
>
> I have a data frame "data" which is the result of joining multiple csv files
> (400 rows and 600 cols in every csv). The "data" data frame has n rows and m
> columns (20 rows and 600 cols), and I have added a new column, "csvdata", in
> which I record the number of the csv to which those data belong.
>
> So, the data frame "data" looks like:
>
> x1   x2   x3   ...   xn   csvdata
> 21   23   32   ...   12   1
> 27   21   39   ...   14   1
> 24   22   30   ...   11   1
> ..
> 21   24   32   ...   19   2
> 27   21   39   ...   14   2
> ..
> 27   22   30   ...   11   n
>
>
>
> I want to store in a matrix the mean values of different subsets of data
> from every csv, for example:
>
> region 1,1 (rows 1:20, columns 1:20) for every "csvdata" value
> region 2,1 (rows 21:40, columns 1:20) for every "csvdata" value
>
>
> And so on for the whole data.frame.
>
> I have tried:
>
> area1<-tapply(as.matrix(data[1:20,1]),datos$csvdata,mean,na.rm=T)
> area2<-tapply(as.matrix(data[1:20,1]),datos$csvdata,mean,na.rm=T)
>
> But this error is the output I obtain:
>
> Error in tapply(data[1:30, ], datos$nueva, mean, na.rm = T) :
>   arguments must have same length
>
> I'm sure that it is not very complex to do, but I have no idea how
> to do it.
>
> Thanks for all.
>
>

Re: [R] Getting rid of unwanted csv files

2015-11-18 Thread John Laing
?file.info
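
A minimal sketch of how `file.info()` could be used here, assuming the modification time (`mtime`) is a good enough proxy for when each file was generated; on Windows the `ctime` column is documented as the creation time and may be closer to what is wanted. The helper name, folder path, and cut-off date are placeholders.

```r
# Hypothetical helper: return the csv files in dir_path last modified before `cutoff`.
old_csv_files <- function(dir_path, cutoff) {
  files <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
  info  <- file.info(files)
  files[info$mtime < as.POSIXct(cutoff)]
}

# Usage (not run here; path and date are placeholders):
# junk <- old_csv_files("C:/path/to/folder", "2015-11-10")
# print(basename(junk))   # inspect the list before deleting anything
# file.remove(junk)
```

Printing the candidates first is deliberate, since `file.remove()` cannot be undone.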

On Wed, Nov 18, 2015 at 6:12 AM, WRAY NICHOLAS 
wrote:

> Hi I have got a large folder with hundreds of csv files in it   The
> problem is
> that some of them are junk, and I know which ones are junk because they
> were
> created on certain days, that is before I had honed the programme
> generating
> them to ultimate perfection
>
> I'd like to get shot of the junk files, but I can't distinguish them by
> names/label from the ones I want to keep -- the only criterion is the date
> of
> making them, but I cannot see a way of telling R to look at the date
> rather than
> the name/label   Obviously I could plough by hand, but there are loads and
> if
> anyone has any ideas I'd be grateful
>
> Thanks, Nick Wray
>
> PS Just had a thought -- Going off R-piste now but of course is it poss to
> get
> rid of them within Windows (which I'm using) rather than R itself?


Re: [R-es] Borrar cada fila 400

2015-11-18 Thread Olivier Nuñez
Jesús,

an efficient way to do it is the following:

Put all your csv files in a directory that we will call "DIR".
Then move to that directory with setwd("path to DIR").
Once that is done, the following code should work for you:

ficheros=list.files(pattern="*.csv") 
temp=lapply(as.list(ficheros),read.csv) # read the csv files into a list of data.frames
index <- function(x) {temp[[x]]$region <- x; temp[[x]]} # tag each data.frame with its index (variable region)
temp2= lapply(temp,index) # indexed data.frames
datos=do.call(rbind,temp2) # concatenate the indexed data.frames
tapply(datos$x1,datos$region,mean) # mean of column "x1" within each data.frame

The code can be simpler if you use the "plyr" package.
Regards, Olivier


Re: [R-es] Borrar cada fila 400

2015-11-18 Thread Olivier Nuñez
Oops!
In the previous code, replace 

temp2= lapply(temp,index) # indexed data.frames

with 

temp2= lapply(seq_along(temp),index) # indexed data.frames

Regards, Olivier
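
Putting the correction together, the whole recipe can be sketched as a single function. This is an illustrative variant, not the code from the thread: `read_and_stack` is a hypothetical name, it takes the directory as an argument instead of relying on `setwd()`, uses `Map()` to pair each data.frame with its index, and uses the regular expression `"\\.csv$"` because `list.files()` expects a regex rather than a glob.

```r
# Hypothetical helper: read every csv in dir_path, tag each with its index, and stack them.
read_and_stack <- function(dir_path) {
  ficheros <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
  temp     <- lapply(ficheros, read.csv)
  # Pair each data.frame with its position in the list and store it in `region`.
  temp2    <- Map(function(d, i) { d$region <- i; d }, temp, seq_along(temp))
  do.call(rbind, temp2)
}

# Usage (not run here; "DIR" is a placeholder, and x1 must exist in every file):
# datos <- read_and_stack("DIR")
# tapply(datos$x1, datos$region, mean)
```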

- Original message -
From: "Olivier Nuñez" 
To: "Javier Rubén Marcuzzi" 
CC: r-help-es@r-project.org
Sent: Wednesday, 18 November 2015 13:15:52
Subject: Re: [R-es] Borrar cada fila 400

Jesús,

an efficient way to do it is the following:

Put all your csv files in a directory that we will call "DIR".
Then move to that directory with setwd("path to DIR").
Once that is done, the following code should work for you:

ficheros=list.files(pattern="*.csv") 
temp=lapply(as.list(ficheros),read.csv) # read the list of data frames
# function to index each data frame (variable region)
index <- function(x) {temp[[x]]$region <- x; temp[[x]]}
temp2= lapply(temp,index) # indexed data frames
datos=do.call(rbind,temp2) # concatenate the indexed data frames
tapply(datos$x1,datos$region,mean) # mean of column "x1" in each data frame

The code can be simpler if you use the "plyr" package.
Regards. Olivier
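
The recipe above can be tried end to end without touching real data. What follows is only a minimal sketch: the temporary directory, the three toy csv files and their contents are assumptions made up for illustration, and the index is built with seq_along(), as in the correction posted for this code.

```r
# Toy setup (assumption): three small csv files in a temporary directory
dir_csv <- tempfile("DIR")
dir.create(dir_csv)
for (i in 1:3) {
  write.csv(data.frame(x1 = i * (1:4), x2 = 10 * i),
            file.path(dir_csv, sprintf("file%d.csv", i)),
            row.names = FALSE)
}

# List the csv files; the pattern is a regular expression, hence "\\.csv$"
ficheros <- list.files(dir_csv, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list of data frames
temp <- lapply(ficheros, read.csv)

# Tag each data frame with the position of its file in the list
temp2 <- lapply(seq_along(temp), function(i) cbind(temp[[i]], region = i))

# Stack the tagged data frames and compute the mean of x1 per file
datos <- do.call(rbind, temp2)
medias <- tapply(datos$x1, datos$region, mean)
print(medias)
```

Using full.names = TRUE avoids the need for setwd(), which may be preferable in scripts that are run from different locations.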


- Original message -
From: "Javier Rubén Marcuzzi" 
To: "Jesús Para Fernández" , "Carlos Ortega" 

CC: r-help-es@r-project.org
Sent: Wednesday, 18 November 2015 11:57:11
Subject: Re: [R-es] Borrar cada fila 400

Dear Jesús Para Fernández,

The example you present is basic, but that does not make it unimportant;
several package authors have tackled this problem, looking to optimise it,
make it easier, and so on.

Your reasoning is correct. If you want to optimise, you can read about plyr,
data.table, reshape (this one has a video explaining almost exactly what you
describe), and some other options. Do not worry about finding the most
efficient approach; use what you understand and can write, because it has
happened to me many times that, in chasing the most efficient solution, I did
not account for the time invested or for how easy my code would be to
maintain. Naturally, in the long run one ends up handling several alternatives
for the same task, but given what you are asking, my advice is: did you manage
to solve it on your own? If so, that is the best alternative for you. In a few
days you can look for another option, but do not mix everything together in
your head. Let your brain rest from this problem. Some time ago I thought
about writing up, in book form, several ways of doing the same thing with R,
things that are not "the science or the great book of R", simply what I have
learned over time, but the mental fatigue of such a task is considerable. So I
do not recommend that you look for an optimal alternative to solve this
problem; if you find a solution, rest, and come back to R after a few days to
optimise.

One of the things you are considering is splitting the data.frame; a quick way
is, for example, Edad[Edad>1 & Edad<3] (those with an age greater than 1 and
less than 3, in other words two years old).

That is, if I have not misunderstood, which is also possible.

Javier Rubén Marcuzzi
Dairy Industries Technician
Veterinarian




Re: [R] Get means of matrix

2015-11-18 Thread PIKAL Petr
Hi

See inline

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jesús
> Para Fernández
> Sent: Wednesday, November 18, 2015 11:20 AM
> To: r-help@r-project.org
> Subject: [R] Get means of matrix
>
> Hi everyone
>
> I have a data frame "data" which is the result of joining multiple csv
> files (400 rows and 600 cols in every csv). The "data" data frame has n
> rows and m columns (20 rows and 600 cols), and I have added a new column,
> "csvdata", in which I specify the number of the csv to which those data
> belong.
>
> So, the data frame "data" looks like:
>
> x1   x2   x3   xn   csvdata
> 21   23   32   12   1
> 27   21   39   14   1
> 24   22   30   11   1
> ..
> 21   24   32   19   2
> 27   21   39   14   2
> ..
> 27   22   30   11   n
>
>
>
> I want to store in a matrix the mean values of different subsets of the
> data of every csv, for example:
>
> region 1,1 (rows 1:20, columns 1:20) for every "csvdata" value
> region 2,1 (rows 21:40, columns 1:20) for every "csvdata" value
>
> And so on for the whole data.frame.
>
> I have tried:
>
> area1<-tapply(as.matrix(data[1:20,1]),datos$csvdata,mean,na.rm=T)
> area2<-tapply(as.matrix(data[1:20,1]),datos$csvdata,mean,na.rm=T)
>
> But this error is the output I obtain:
>
> Error in tapply(data[1:30, ], datos$nueva, mean, na.rm = T) :
>   arguments must have same length

Most probably length(datos$csvdata) is not 20, which is the length you
selected with data[1:20,1].

If the csv file number is already in data$csvdata, you can do

area <- aggregate(data[,1:n], list(data$csvdata), mean, na.rm=TRUE)

If you want to aggregate every 20 rows, you can build an index

ind <- rep(1:20, each=20)
area <- aggregate(data[,1:n], list(ind), mean, na.rm=TRUE)
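
To make the two aggregate() calls above concrete, here is a small sketch on made-up data. The sizes (2 csv files of 40 rows and 3 columns), the seed and the names by_csv, by_block and block are assumptions for illustration only, not the poster's real data.

```r
set.seed(42)

# Toy stand-in for the stacked data: 2 csv files, 40 rows x 3 columns each
data <- data.frame(matrix(rnorm(80 * 3), nrow = 80))
data$csvdata <- rep(1:2, each = 40)

# Mean of every column for each csv file
by_csv <- aggregate(data[, 1:3], list(csv = data$csvdata),
                    mean, na.rm = TRUE)

# Index of 20-row blocks that restarts within each csv file: 1, 2, 1, 2
block <- rep(rep(1:2, each = 20), times = 2)

# Mean of every column for each (csv file, block of 20 rows) combination
by_block <- aggregate(data[, 1:3],
                      list(csv = data$csvdata, block = block),
                      mean, na.rm = TRUE)

print(by_csv)
print(by_block)
```

Both grouping vectors have one entry per row of the data frame, which is what tapply() and aggregate() require; the "arguments must have same length" error quoted above arises when they do not.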

Cheers
Petr

>
> I'm sure that it is not very complex to do, but I have no idea how to
> do it.
>
> Thanks for everything.




This e-mail and any documents attached to it may be confidential and are 
intended only for its intended recipients.
If you received this e-mail by mistake, please immediately inform its sender. 
Delete the contents of this e-mail with all attachments and its copies from 
your system.
If you are not the intended recipient of this e-mail, you are not authorized to 
use, disseminate, copy or disclose this e-mail in any manner.
The sender of this e-mail shall not be liable for any possible damage caused by 
modifications of the e-mail or by delay with transfer of the email.

In case that this e-mail forms part of business dealings:
- the sender reserves the right to end negotiations about entering into a 
contract in any time, for any reason, and without stating any reasoning.
- if the e-mail contains an offer, the recipient is entitled to immediately 
accept such offer; The sender of this e-mail (offer) excludes any acceptance of 
the offer on the part of the recipient containing any amendment or variation.
- the sender insists on that the respective contract is concluded only upon an 
express mutual agreement on all its aspects.
- the sender of this e-mail informs that he/she is not authorized to enter into 
any contracts on behalf of the company except for cases in which he/she is 
expressly authorized to do so in writing, and such authorization or power of 
attorney is submitted to the recipient or the person represented by the 
recipient, or the existence of such authorization is known to the recipient of 
the 

Re: [R] R hist density wrong?

2015-11-18 Thread Duncan Murdoch

On 18/11/2015 3:49 AM, Luca Cerone wrote:

> Dear all,
> this is probably a very naive question but I can't understand what
> hist() means by density.
>
> A very simple example:
>
> h <- hist(c(1,1,2,3), plot=F)
>
> h$counts
> [1] 2 1 0 1
>
> h$density
> [1] 1.0 0.5 0.0 0.5
>
> The counts are as I expect, but density is quite puzzling for me.
>
> I would have expected to obtain the probability of that bin (i.e. 0.5,
> 0.25, 0, 0.25),
> but I can't understand how those numbers come out.


The bins are 0.5 wide (see h$breaks).  Density has the usual meaning for 
continuous distributions:  probability per unit.  So a density of 1 per 
unit over a distance of 0.5 gives a probability of 0.5.


> Sometimes sum(h$density) is equal to 1 as I would expect, though.


sum(h$density) would rarely make sense to calculate, any more than the 
sum of the normal density function at 4 points would.  You want to 
integrate a density.  The formula for that is 
sum(h$density*diff(h$breaks)).
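
The relation between counts, bin probabilities and density can be checked directly on the example from the question. This is just a short sketch; the names width, prob and dens are introduced here for illustration.

```r
h <- hist(c(1, 1, 2, 3), plot = FALSE)

width <- diff(h$breaks)            # bin widths, 0.5 each for this example
prob  <- h$counts / sum(h$counts)  # probability of each bin
dens  <- prob / width              # probability per unit, i.e. the density

print(prob)                        # the per-bin probabilities the poster expected
print(all.equal(dens, h$density))  # density is probability divided by bin width
print(sum(h$density * width))      # integrating the density gives total probability
```

sum(h$density) equals 1 only when every bin happens to have width 1, which would explain why the sum sometimes looks right.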


Duncan Murdoch


> What am I misunderstanding here?
>
> Thanks a lot for the help!
>
> Cheers,
> Luca



[R] Get means of matrix

2015-11-18 Thread Jesús Para Fernández
Hi everyone

I have a data frame "data" which is the result of joining multiple csv files
(400 rows and 600 cols in every csv). The "data" data frame has n rows and m
columns (20 rows and 600 cols), and I have added a new column, "csvdata", in
which I specify the number of the csv to which those data belong.

So, the data frame "data" looks like:

x1   x2   x3   xn   csvdata
21   23   32   12   1
27   21   39   14   1
24   22   30   11   1
..
21   24   32   19   2
27   21   39   14   2
..
27   22   30   11   n



I want to store in a matrix the mean values of different subsets of the data
of every csv, for example:

region 1,1 (rows 1:20, columns 1:20) for every "csvdata" value
region 2,1 (rows 21:40, columns 1:20) for every "csvdata" value


And so on for the whole data.frame.

I have tried:

area1<-tapply(as.matrix(data[1:20,1]),datos$csvdata,mean,na.rm=T)
area2<-tapply(as.matrix(data[1:20,1]),datos$csvdata,mean,na.rm=T)

But this error is the output I obtain:
 
Error in tapply(data[1:30, ], datos$nueva, mean, na.rm = T) : 
  arguments must have same length

I'm sure that it is not very complex to do, but I have no idea how to do
it.

Thanks for everything.


  

Re: [R] Converting daily time series to monthly?

2015-11-18 Thread PIKAL Petr
Hi

There are various ways to do this. I would recommend looking at ?aggregate
together with cut (see ?cut.POSIXt).

Your dates should be converted to a POSIX type (see ?strptime) before you use
them with cut and aggregate.
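
As a rough sketch of that route, the snippet below aggregates two years of simulated daily rainfall. The data, the seed, the column names and the DJF/MAM/JJA/SON season definition are all assumptions for illustration; the seasonal totals here pool the same season across years.

```r
set.seed(1)

# Simulated daily series (assumption): 731 days starting 2000-01-01, in UTC
start <- strptime("2000-01-01", "%Y-%m-%d", tz = "UTC")
day   <- seq(as.POSIXct(start), by = "day", length.out = 731)
rain  <- data.frame(day = day, mm = rgamma(731, shape = 0.5, scale = 4))

# cut() on date-times builds one factor level per calendar month or year
rain$month <- cut(rain$day, "month")
rain$year  <- cut(rain$day, "year")

# Meteorological seasons from the month number (0 = January in POSIXlt)
season_of_month <- c("DJF", "DJF", "MAM", "MAM", "MAM", "JJA",
                     "JJA", "JJA", "SON", "SON", "SON", "DJF")
rain$season <- season_of_month[as.POSIXlt(rain$day)$mon + 1]

monthly  <- aggregate(mm ~ month,  data = rain, FUN = sum)
seasonal <- aggregate(mm ~ season, data = rain, FUN = sum)
annual   <- aggregate(mm ~ year,   data = rain, FUN = sum)

print(head(monthly))
print(seasonal)
print(annual)
```

Replacing FUN = sum with mean or max gives other monthly summaries from the same grouping.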

Cheers
Petr


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> Chattopadhyay, Somsubhra
> Sent: Monday, November 16, 2015 8:37 PM
> To: r-help@r-project.org
> Subject: [R] Converting daily time series to monthly?
>
> Hi all,
>
> I have daily time series of rainfall for sufficiently long period of
> time (70 years). I want to aggregate the data series into monthly,
> seasonal and annual basis. I know excel can handle this with the pivot
> table functionality. However, I have too many data points so, it's not
> a very smart way of dealing with this that way. I am wondering a simple
> R code or package may help me out here. I appreciate any feedback.
>
> Thanks
> Som
>
> --
> Somsubhra Chattopadhyay
> Graduate Research Assistant
> Biosystem and Agricultural Engineering Department University of
> Kentucky, Lexington, KY 40546
> Email: schatto...@uky.edu
> Cell: 9198026951
>



Re: [R-es] Borrar cada fila 400

2015-11-18 Thread Jesús Para Fernández
Hi Carlos,

What I want to do is compute the mean of different regions that follow a
repeating pattern. I was not very clear yesterday, so let me try to explain
myself better.

I have a data frame built by joining several csv files. It has n rows and x
columns, to which I have added, thanks to your advice, a column "nueva" that
indicates the csv each row belongs to. Altogether I have one data frame, which
I will call "datos".

The layout is therefore as follows:

x1   x2   x3   xn   nueva
21   23   32   12   1
27   21   39   14   1
24   22   30   11   1
..
21   24   32   19   2
27   21   39   14   2
..
27   22   30   11   n



So I want to store in a matrix the means of each of the regions, which would
be, for example:
region 1,1 (rows 1:20, columns 1:20) for each distinct value of nueva
region 2,1 (rows 21:40, columns 1:20) for each distinct value of nueva
region 1,2 (rows 1:20, columns 21:40)...

and so on for all of them.

It has just occurred to me that I could predefine the zones and create this:

zona1<-tapply(as.matrix(datos[1:20,1]),datos$nueva,mean,na.rm=T)
zona2<-tapply(as.matrix(datos[1:20,1]),datos$nueva,mean,na.rm=T)

and then combine everything into a single data frame
datosmedias<-data.frame(zona1,zona2,)

I am sure there is a more efficient way to do this.
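
One possible loop-free approach, sketched on toy data: label every cell with its csv, its row block and its column block, and let tapply() average each group. The sizes (2 csv files of 40 x 40 cells, regions of 20 x 20), the seed and the helper names fila, col_blk and grupos are assumptions for illustration.

```r
set.seed(1)
n_csv <- 2; n_row <- 40; n_col <- 40; size <- 20

# Toy stand-in for "datos": n_csv stacked csv files plus the "nueva" column
datos <- as.data.frame(matrix(rnorm(n_csv * n_row * n_col), ncol = n_col))
datos$nueva <- rep(seq_len(n_csv), each = n_row)

m <- as.matrix(datos[, seq_len(n_col)])

# Row block within each csv (restarts at 1 for every file) and column block
fila    <- ((seq_len(nrow(m)) - 1) %% n_row) %/% size + 1
col_blk <- (seq_len(n_col) - 1) %/% size + 1

# One label per cell of the matrix, in the same order as as.vector(m)
grupos <- list(nueva   = datos$nueva[as.vector(row(m))],
               fila    = fila[as.vector(row(m))],
               columna = col_blk[as.vector(col(m))])

# Array of means indexed as [csv, row block, column block]
medias <- tapply(as.vector(m), grupos, mean, na.rm = TRUE)
print(medias)
```

Here medias[2, 1, 2] is the mean of region 1,2 (rows 1:20, columns 21:40) of the second csv.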

Thanks for everything
Jesús


Date: Tue, 17 Nov 2015 20:33:49 +0100
Subject: Re: [R-es] Borrar cada fila 400
From: c...@qualityexcellence.es
To: j.para.fernan...@hotmail.com
CC: r-help-es@r-project.org

What do you mean by "values I want"?
Do you want to compute the mean of some regions of the matrix following some
kind of pattern? Some periodicity?
If not, it is enough to define some indices (your 1 and 20), as I showed in
the example, and that's it...

Regards,
Carlos Ortega
www.qualityexcellence.es

On 17 November 2015 at 19:29, Jesús Para Fernández 
 wrote:



OK, with as.matrix I get it.

It is a matter of calling mean(as.matrix(1:20,1:20)) and I obtain it.

Now the nice part is how to do it for the values I want, without using a for
loop, but with an apply, or a tapply...

> From: j.para.fernan...@hotmail.com
> To: c...@qualityexcellence.es
> Date: Tue, 17 Nov 2015 19:17:30 +0100
> CC: r-help-es@r-project.org
> Subject: Re: [R-es] Borrar cada fila 400
> 
> Thanks once again Carlos, but it is not exactly what I want
> 
> With colMeans you are computing by columns, but I want it to compute like
> this:
> 
> mean(datos[1:20,1:20]), but of course, for the whole sequence.
> 
> 
> mean(datos[1:20,1:20]) gives me the error-> Error in datos[1:2, 1:2] : 
> object of type 'closure' is not subsettable
> 
> 
> 
> 
> 
> 
> Date: Tue, 17 Nov 2015 18:34:59 +0100
> Subject: Re: [R-es] Borrar cada fila 400
> From: c...@qualityexcellence.es
> To: j.para.fernan...@hotmail.com
> CC: c...@datanalytics.com; r-help-es@r-project.org
> 
> Hello,
> 
> This is one way.
> You use indices to specify the piece you want, select it (df_df_reg), and
> compute row or column means on it. R has specific functions for this
> calculation.
> 
> #---
> n_row <- 400
> n_col <- 500
> df_mat <- matrix(rnorm(n_row * n_col), nrow=n_row, ncol=n_col)
> df_df <- as.data.frame(df_mat)
> 
> n_row_ini <- 1 
> n_row_end <- 20
> n_col_ini <- 1
> n_col_end <- 20
> 
> df_df_reg <- df_df[n_row_ini:n_row_end, n_col_ini:n_col_end ]
> colMeans ( df_df_reg, na.rm=TRUE )
> rowMeans ( df_df_reg, na.rm=TRUE )
> #---
> 
> 
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
> 
> 
> On 17 November 2015 at 18:20, Jesús Para Fernández 
>  wrote:
> 
> 
> 
> It really is a simple but very effective solution.
> 
> I will finish with this next question:
> 
> The matrix in each csv is 400x500, that is, 400 rows and 500 columns. If I
> want to compute the mean of different regions of the csv, for example the
> mean of the first 20 rows and first 20 columns, but in the data frame that
> has the 50,000 records, for the value 1, how can I do it??
> 
> I have tried tapply(datos,new,mean,na.rm=T), but apart from giving me an
> error it does not segment the data the way I want.
> 
> Thanks
> 
> Date: Tue, 17 Nov 2015 16:45:03 +0100
> Subject: Re: [R-es] Borrar cada fila 400
> From: c...@qualityexcellence.es
> To: j.para.fernan...@hotmail.com
> CC: c...@datanalytics.com; r-help-es@r-project.org
> 
> Hello,
> 
> This is one way:
> 
> > DF <- data.frame(a=rnorm(1000))
> > DF$new <- 1 +  floor((1:nrow(DF) - 1) / 400)
> > unique(DF$new)
> [1] 1 2 3
> 
> 
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
> 
> 
> On 17 November 2015 at 15:50, Jesús Para Fernández 
>  wrote:
> I understand the logic but I cannot see how to do it.
> 
> 
> 
> I do not know how to implement the 

[R] R hist density wrong?

2015-11-18 Thread Luca Cerone
Dear all,
this is probably a very naive question but I can't understand what
hist() means by density.

A very simple example:

h <- hist(c(1,1,2,3), plot=F)

h$counts
[1] 2 1 0 1

h$density
[1] 1.0 0.5 0.0 0.5

The counts are as I expect, but density is quite puzzling for me.

I would have expected to obtain the probability of that bin (i.e. 0.5,
0.25, 0, 0.25),
but I can't understand how those numbers come out.

Sometimes sum(h$density) is equal to 1 as I would expect, though.

What am I misunderstanding here?

Thanks a lot for the help!

Cheers,
Luca



Re: [R] Getting rid of unwanted csv files

2015-11-18 Thread Par Leijonhufvud
WRAY NICHOLAS  [2015.11.18] wrote:
> 
> I'd like to get shot of the junk files, but I can't distinguish them by
> names/label from the ones I want to keep -- the only criterion is the date of
> making them, but I cannot see a way of telling R to look at the date rather 
> than
> the name/label   Obviously I could plough by hand, but there are loads and if
> anyone has any ideas I'd be grateful

> PS Just had a thought -- Going off R-piste now but of course is it poss to get
> rid of them within Windows (which I'm using) rather than R itself?
>   [[alternative HTML version deleted]]

You can list them by date, and then select all the ones within a certain
date range.

With other operating systems it would be even simpler... 
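
Within R itself, the same selection by date can be sketched with file.info(). Everything below runs in a temporary directory with made-up files; the file names, the date 2015-11-01 and the use of the modification time (mtime) as the criterion are assumptions for illustration. On real data it would be prudent to inspect the junk vector before calling file.remove().

```r
# Toy setup (assumption): one file to keep and two junk files
dir_demo <- tempfile("csvdemo")
dir.create(dir_demo)
files <- file.path(dir_demo, c("keep.csv", "junk1.csv", "junk2.csv"))
for (f in files) writeLines("a,b", f)

# Pretend the junk files were written on a known earlier date
old <- as.POSIXct("2015-11-01 12:00:00", tz = "UTC")
for (f in files[2:3]) Sys.setFileTime(f, old)

# Look up the modification time of every csv file in the directory
paths <- list.files(dir_demo, pattern = "\\.csv$", full.names = TRUE)
info  <- file.info(paths)

# Select the files whose modification date matches the unwanted date
junk <- paths[as.Date(info$mtime, tz = "UTC") == as.Date("2015-11-01")]
print(basename(junk))

# Delete them; this cannot be undone
removed <- file.remove(junk)
print(list.files(dir_demo))
```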

Pär

-- 
Pär Leijonhufvud    p...@leijonhufvud.org



Re: [R] SWEAVE - a gentle introduction

2015-11-18 Thread Therneau, Terry M., Ph.D.

As a digest reader I am late to the discussion, but let me toss in 2 further 
notes.

1. Three advantages of knitr over Sweave
  a. The book "Dynamic documents with R and knitr".  It is well written; sitting down for 
an evening with the first half (70 pages) is a pretty good way to learn the package.  The 
second half covers features you may or may not use over time.  My only complaint about the 
book is that it needs a longer index; I have had several cases of "I know I read about 
xxx, but where was it?".  But this is ameliorated by


  b. Good online resources: manuals, tutorials, and question/answer pairs on 
various lists.

  c. Ongoing support. Sweave is static.

2. Latex vs markdown  (knitr supports both)
  One can choose "latex style" or "markdown style" for writing your documents.  I know 
latex very well (I wrote my book using it) but recommend markdown to all others in my 
department.  The latter is about 1/3 the learning curve.  Markdown produces very nice 
output, latex goes the extra mile to produce true book quality.  But one rarely needs that 
extra polish, and even more rarely needs it enough to justify the extra learning cost.  I 
still use the latex form myself as it is not at all difficult to use --- once you learn it.


Terry Therneau

On 11/18/2015 05:00 AM, r-help-requ...@r-project.org wrote:

I am looking for a gentle introduction to SWEAVE, and would appreciate 
recommendations.
I have an R program that I want to run and have the output and plots in one 
document. I believe this can be accomplished with SWEAVE. Unfortunately I don't 
know HTML, but am willing to learn. . . as I said I need a gentle introduction 
to SWEAVE.
Thank you,
John




[R] How to find the likelihood, MLE and plot it?

2015-11-18 Thread C W
Dear R list,

I am trying to find the MLE of the likelihood function.  I will plot the
log-likelihood to check my answer.

Here's my R code:

xvec <- c(2,5,3,7,-3,-2,0)

fn <- function(theta){
  sum(0.5 * (xvec - rep(theta, 7)) ^ 2 / 1 + 0.5 * log(1))
}

gn <- Vectorize(fn)

curve(gn, -5, 20)

optimize(gn, c(-5, 20))

$minimum
[1] 1.714286

$objective
[1] 39.71429


The MLE using optimize() is 1.71, but what curve() shows me at that point is
the absolute minimum.

I think 1.71 is the right answer, but why does the graph show it as the
minimum?  What is going on here?
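
A possible reading, offered as a sketch rather than a definitive answer: fn as written has the form of a negative log-likelihood for a normal model with unit variance (up to an additive constant), so its minimum is where the likelihood is largest. The names negloglik and loglik below are introduced for illustration, and the 0.5 * log(1) term is dropped because it is zero.

```r
xvec <- c(2, 5, 3, 7, -3, -2, 0)

# Negative log-likelihood (up to a constant) for N(theta, 1), as in fn above
negloglik <- function(theta) sum(0.5 * (xvec - theta)^2)

# The log-likelihood itself is the same curve flipped upside down
loglik <- function(theta) -negloglik(theta)

fit_min <- optimize(negloglik, c(-5, 20))               # minimise -logL
fit_max <- optimize(loglik, c(-5, 20), maximum = TRUE)  # maximise  logL

# Both agree with each other and with the sample mean
print(c(fit_min$minimum, fit_max$maximum, mean(xvec)))

# Plotting the log-likelihood shows a peak where the earlier plot had a dip
curve(Vectorize(loglik)(x), -5, 20, ylab = "log-likelihood")
```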

Thanks so much!



Re: [R] [FORGED] Problem running mvrnorm

2015-11-18 Thread Rolf Turner

On 19/11/15 10:54, Gang Chen wrote:

> Thanks a lot for the pointer, Rolf!
>
> You're correct that
>
>   E <- eigen(Sigma,symmetric=TRUE)
>
> does lead to the same error on the RedHat machine. However, the same
> computation on my Mac works fine:
>
> E <- eigen(Sigma,symmetric=TRUE)
>
> E$values




I used your constrSigma() function to create Sigma and did eigen() to 
it.  In a trice I got results which agree with those that you show.


So it's not *Linux* as such that is the problem.

(a) What happens if you do svd(Sigma)?  Or, in accordance with Peter 
Dalgaard's suggestion, chol(Sigma)?


(b) It would seem that you need to follow the lines of enquiry suggested 
by Peter.
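
For anyone wanting to run the same checks, here is a small sketch of the three decompositions mentioned in this thread. The matrix below is a toy compound-symmetry covariance chosen for illustration; it is an assumption and not the Sigma produced by constrSigma().

```r
# Toy covariance matrix (assumption): 4 x 4, variances 1, covariances 0.3
Sigma <- matrix(0.3, nrow = 4, ncol = 4)
diag(Sigma) <- 1

E <- eigen(Sigma, symmetric = TRUE)  # the call that failed on the RedHat machine
S <- svd(Sigma)                      # singular value decomposition
R <- chol(Sigma)                     # upper triangular factor, t(R) %*% R = Sigma

print(E$values)
print(S$d)
print(max(abs(crossprod(R) - Sigma)))
```

For a symmetric positive definite matrix the eigenvalues and singular values coincide, so a disagreement between the three results on one machine would point at the numerical libraries rather than at Sigma.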


cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R] is.na behavior

2015-11-18 Thread David Winsemius

> On Nov 18, 2015, at 5:54 PM, Richard M. Heiberger  wrote:
> 
> What is the rationale for the following warning in R-3.2.2?
> 
>> is.na(expression(abcd))
> [1] FALSE
> Warning message:
> In is.na(expression(abcd)) :
>  is.na() applied to non-(list or vector) of type ‘expression’

Well, the R interpreter does think that this is not a list:

> is.list(expression(abcd))
[1] FALSE


> methods(is.na)
 [1] is.na,abIndex-method   is.na,denseMatrix-method  
 [3] is.na,indMatrix-method is.na,nsparseMatrix-method
 [5] is.na,nsparseVector-method is.na,sparseMatrix-method 
 [7] is.na,sparseVector-method  is.na.coxph.penalty*  
 [9] is.na.data.frame   is.na.numeric_version 
[11] is.na.POSIXlt  is.na.raster* 
[13] is.na.ratetable*   is.na.Surv 


So the rationale is probably the same as the rationale for this warning:

> is.na(call("mean", 1:4))
[1] FALSE FALSE
Warning message:
In is.na(call("mean", 1:4)) :
  is.na() applied to non-(list or vector) of type ‘language'

>   [[alternative HTML version deleted]]


I’m somewhat puzzled at your use of HTML for an Rhelp posting. I thought you 
were a longtime R user?


David Winsemius
Alameda, CA, USA


Re: [R] is.na behavior

2015-11-18 Thread Richard M. Heiberger
David,

Your answer begs the question.
What is the problem with a non-(list or vector) of type language?
To my eye both expression(abcd) and call("mean") look like they have
non-missing values, hence I anticipated that they are not NA, and therefore
that is.na() would return FALSE without a warning.

On the html email, I turned that off years ago.  It looks like gmail (who
handles my university's email accounts) turned it back on.  I just turned it
off again. I too find it very annoying to have to revisit setting changes
that I didn't make.
Thank you for letting me know.

Rich


On Wed, Nov 18, 2015 at 9:36 PM, David Winsemius  wrote:
>
>> On Nov 18, 2015, at 5:54 PM, Richard M. Heiberger  wrote:
>>
>> What is the rationale for the following warning in R-3.2.2?
>>
>>> is.na(expression(abcd))
>> [1] FALSE
>> Warning message:
>> In is.na(expression(abcd)) :
>>  is.na() applied to non-(list or vector) of type ‘expression’
>
> Well, the R interpreter does think that this is not a list:
>
>> is.list(expression(abcd))
> [1] FALSE
>
>
>> methods(is.na)
>  [1] is.na,abIndex-method   is.na,denseMatrix-method
>  [3] is.na,indMatrix-method is.na,nsparseMatrix-method
>  [5] is.na,nsparseVector-method is.na,sparseMatrix-method
>  [7] is.na,sparseVector-method  is.na.coxph.penalty*
>  [9] is.na.data.frame   is.na.numeric_version
> [11] is.na.POSIXlt  is.na.raster*
> [13] is.na.ratetable*   is.na.Surv
>
>
> So the rationale is probably the same as the rationale for this warning:
>
>> is.na(call("mean", 1:4))
> [1] FALSE FALSE
> Warning message:
> In is.na(call("mean", 1:4)) :
>   is.na() applied to non-(list or vector) of type ‘language'
>
>>   [[alternative HTML version deleted]]
>
>
> I’m somewhat puzzled at your use of HTML for an Rhelp posting. I thought you 
> were a longtime R user?
>
>
> David Winsemius
> Alameda, CA, USA
>


[R] is.na behavior

2015-11-18 Thread Richard M. Heiberger
What is the rationale for the following warning in R-3.2.2?

> is.na(expression(abcd))
[1] FALSE
Warning message:
In is.na(expression(abcd)) :
  is.na() applied to non-(list or vector) of type 'expression'

[[alternative HTML version deleted]]



Re: [R] is.na behavior

2015-11-18 Thread William Dunlap
You can convert the expression to a list and use is.na on that:
   > e <- expression(1+NA, NA, 7, function(x)x+1)
   > is.na(as.list(e))
   [1] FALSE  TRUE FALSE FALSE
and you can do the same for a call object
   > is.na(as.list(quote(func(arg1, tag2=NA, tag3=log(NA)))))
                tag2  tag3
   FALSE FALSE  TRUE FALSE

However, what is your motivation for wanting to apply is.na to an expression?

Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Wed, Nov 18, 2015 at 5:54 PM, Richard M. Heiberger  wrote:
> What is the rationale for the following warning in R-3.2.2?
>
>> is.na(expression(abcd))
> [1] FALSE
> Warning message:
> In is.na(expression(abcd)) :
>   is.na() applied to non-(list or vector) of type 'expression'
>


Re: [R] [FORGED] Problem running mvrnorm

2015-11-18 Thread Gang Chen
Thanks a lot for the pointer, Rolf!

You're correct that

 E <- eigen(Sigma,symmetric=TRUE)

does lead to the same error on the RedHat machine. However, the same
computation on my Mac works fine:

E <- eigen(Sigma,symmetric=TRUE)

E$values

  [1] 4.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6 2.6
2.6
 [19] 2.6 2.6 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
 [37] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
 [55] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
 [73] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
 [91] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[109] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[127] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[145] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[163] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8
0.8
[181] 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8 0.8

As you can see, all the eigenvalues are truly positive. Is it possible that
some numerical library or package is required by eigen() but missing on the
Linux server? In case the real Sigma is useful, here is how the matrix
Sigma is defined:

constrSigma <- function(n, sig) {
   N <- n*(n-1)/2
   Sigma <- diag(N)
   bgn <- 1
   for(ii in (n-1):1) {
      end <- bgn+(ii-1)
      Sigma[bgn:end,bgn:end][lower.tri(Sigma[bgn:end,bgn:end])] <- rep(sig, ii*(ii-1)/2)
      if(ii<(n-1)) {
         xx <- cbind(rep(sig,ii), diag(ii)*sig)
         yy <- xx
         for(jj in 1:(n-1-ii)) {
            if(jj>1) {
               xx <- cbind(rep(0, ii), xx)
               yy <- cbind(xx, yy)
            }
         }
         Sigma[bgn:end,1:(bgn-1)] <- yy
      }
      bgn <- end+1
   }
   Sigma[upper.tri(Sigma)] <- t(Sigma)[upper.tri(t(Sigma))]
   return(Sigma)
}

Sigma <- constrSigma(20, 0.1)
mvrnorm(n=1000, rep(0, 190), Sigma)
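
A quick sanity check of a covariance matrix, along the lines discussed in this thread, might look like the sketch below. To keep it self-contained it uses a small compound-symmetry matrix S as a stand-in for the real Sigma (S and its size are assumptions for illustration only). It confirms symmetry, looks at the eigenvalues, and compares them with the singular values, which should agree for a symmetric positive definite matrix.

```r
# Stand-in for Sigma (assumption): 5 x 5, unit diagonal, 0.1 off-diagonal
S <- matrix(0.1, 5, 5)
diag(S) <- 1

isSymmetric(S)                                   # should be TRUE

# Eigenvalues only; this is the call that mvrnorm() makes internally
ev <- eigen(S, symmetric = TRUE, only.values = TRUE)$values
all(ev > 0)                                      # positive definite?

# Singular values without computing the singular vectors
sv <- svd(S, nu = 0, nv = 0)$d
all.equal(sort(ev), sort(sv))                    # should agree for an SPD matrix
```

If eigen() crashes on one machine but svd() does not, that would point at the LAPACK routine rather than at the matrix itself.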






On Wed, Nov 18, 2015 at 3:56 PM, Rolf Turner wrote:
>
>
> I cobbled together a 190 x 190 positive definite matrix Sigma and ran
> your example.  I got a result instantaneously, with no error message. (I'm
> running Linux; an ancient Fedora 17 system.)
>
> So the problem is peculiar to your particular Sigma.
>
> As the error message tells you, the problem comes from doing an
> eigendecomposition of Sigma.  So start your investigation by doing
>
> E <- eigen(Sigma,symmetric=TRUE)
>
> Presumably that will lead to the same error.  How to get around this
> error is beyond the scope of my capabilities.
>
> You *might* get somewhere by using the singular value decomposition
> (equivalent for a positive definite matrix) rather than the
> eigendecomposition.  I have the vague notion that the svd is more
> numerically robust than eigendecomposition.  However I might well have that
> wrong.
>
> Doing anything in 190 dimensions is bound to be fraught with numeric
> peril.
>
> cheers,
>
> Rolf Turner
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
>
> On 19/11/15 08:28, Gang Chen wrote:
>>
>>   I’m running R 3.2.2 on a Linux server (Redhat 4.4.7-16), and having the
>> following problem.
>>
>> It works fine with the following:
>>
>> require('MASS')
>> var(mvrnorm(n = 1000, rep(0, 2), Sigma=matrix(c(10,3,3,2),2,2)))
>>
>> However, when running the following in a loop with simulated data (Sigma):
>>
>> # Sigma defined somewhere else
>> mvrnorm(n=1000, rep(0, 190), Sigma)
>>
>> I get this opaque message:
>>
>>   *** caught illegal operation ***
>> address 0x7fe78f8693d2, cause 'illegal operand'
>>
>> Traceback:
>>   1: eigen(Sigma, symmetric = TRUE)
>>   2: mvrnorm(n = nr, rep(0, NN), Sigma)
>>
>> Possible actions:
>> 1: abort (with core dump, if enabled)
>> 2: normal R exit
>> 3: exit R without saving workspace
>> 4: exit R saving workspace
>>
>> I tried to do core dump (option 1), but it didn’t go anywhere (hanging
>> there forever). I also ran the same code on a Mac, and there was no problem
>> at all. What is causing the problem on the Linux server? In case the
>> variance-covariance matrix ‘Sigma’ is needed, I can provide its definition
>> later.
>
>
>


Re: [R] [FORGED] Problem running mvrnorm

2015-11-18 Thread peter dalgaard
I have a sense of a tiny bell ringing faintly from the distant past... There 
may be an issue with some versions of BLAS/LAPACK on some systems showing up in 
eigendecompositions. 

Checking... hmm, there have been several issues over time but I don't see 
anything that would be directly related to the present issue. There was a 
really stupid bug in R's reference BLAS, but that was found and fixed two years 
ago, and besides, it involved complex eigenvalues which I don't see creeping 
into mvrnorm. Also PR#15211 located an issue that could cause a hang (not 
segfault) which ended in a WONTFIX situation because it really needed to be 
fixed in LAPACK.

Anyways, some details about versions, hardware (you may be using a library 
optimized for the wrong CPU), would be needed. Even better if you can set up to 
run under Valgrind or a debugger to pinpoint the cause of the trouble. Also, 
check that you aren't simply running out of memory.
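
For anyone wanting to collect the version details asked for above, a minimal sketch is shown here. It only uses base R functions; the list name `info` is just an illustrative choice, and what counts as "enough detail" is a judgment call.

```r
# Gather basic version and platform details to include in a report
info <- list(
  r_version = R.version.string,        # e.g. "R version 3.2.2 ..."
  platform  = R.version$platform,      # CPU / OS triplet R was built for
  lapack    = La_version(),            # version of the LAPACK library in use
  os        = Sys.info()[["sysname"]]  # operating system name
)
str(info)

# Full session details, including attached packages
sessionInfo()
```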

Pragmatically speaking, couldn't you get away with using a Choleski 
decomposition of Sigma? 
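
A minimal sketch of that idea follows, assuming Sigma is symmetric and strictly positive definite (chol() stops with an error otherwise). The function name rmvnorm_chol is made up for this example and is not part of MASS; it avoids eigen() entirely.

```r
# Hypothetical helper: multivariate normal draws via a Cholesky factor
rmvnorm_chol <- function(n, mu, Sigma) {
  p <- length(mu)
  R <- chol(Sigma)                       # upper triangular, t(R) %*% R == Sigma
  Z <- matrix(rnorm(n * p), nrow = n)    # n x p matrix of iid N(0, 1) draws
  sweep(Z %*% R, 2, mu, "+")             # each row has covariance t(R) %*% R
}

# Same 2 x 2 covariance as the small example earlier in the thread
set.seed(1)
var(rmvnorm_chol(1000, rep(0, 2), matrix(c(10, 3, 3, 2), 2, 2)))
```

The sample covariance printed at the end should be close to the Sigma that was passed in.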

-pd 


> On 18 Nov 2015, at 21:56 , Rolf Turner  wrote:
> 
> 
> I cobbled together a 190 x 190 positive definite matrix Sigma and ran your 
> example.  I got a result instantaneously, with no error message. (I'm running 
> Linux; an ancient Fedora 17 system.)
> 
> So the problem is peculiar to your particular Sigma.
> 
> As the error message tells you, the problem comes from doing an 
> eigendecomposition of Sigma.  So start your investigation by doing
> 
>E <- eigen(Sigma,symmetric=TRUE)
> 
> Presumably that will lead to the same error.  How to get around this error is 
> beyond the scope of my capabilities.
> 
> You *might* get somewhere by using the singular value decomposition
> (equivalent for a positive definite matrix) rather than the 
> eigendecomposition.  I have the vague notion that the svd is more numerically 
> robust than eigendecomposition.  However I might well have that wrong.
> 
> Doing anything in 190 dimensions is bound to be fraught with numeric
> peril.
> 
> cheers,
> 
> Rolf Turner
> 
> -- 
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
> 
> On 19/11/15 08:28, Gang Chen wrote:
>>  I’m running R 3.2.2 on a Linux server (Redhat 4.4.7-16), and having the
>> following problem.
>> 
>> It works fine with the following:
>> 
>> require('MASS')
>> var(mvrnorm(n = 1000, rep(0, 2), Sigma=matrix(c(10,3,3,2),2,2)))
>> 
>> However, when running the following in a loop with simulated data (Sigma):
>> 
>> # Sigma defined somewhere else
>> mvrnorm(n=1000, rep(0, 190), Sigma)
>> 
>> I get this opaque message:
>> 
>>  *** caught illegal operation ***
>> address 0x7fe78f8693d2, cause 'illegal operand'
>> 
>> Traceback:
>>  1: eigen(Sigma, symmetric = TRUE)
>>  2: mvrnorm(n = nr, rep(0, NN), Sigma)
>> 
>> Possible actions:
>> 1: abort (with core dump, if enabled)
>> 2: normal R exit
>> 3: exit R without saving workspace
>> 4: exit R saving workspace
>> 
>> I tried to do core dump (option 1), but it didn’t go anywhere (hanging
>> there forever). I also ran the same code on a Mac, and there was no problem
>> at all. What is causing the problem on the Linux server? In case the
>> variance-covariance matrix ‘Sigma’ is needed, I can provide its definition
>> later.
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R-es] Help with the plotMF function in the FRBS package

2015-11-18 Thread Víctor Granda García
Hi Bernardo.

If you get that error, try making the RStudio plots panel bigger. I have
run into that error when drawing some large heatmaps, and it is simply
that the plot cannot be drawn at the size of the panel.

I hope this helps. Regards.
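
If resizing the panel is not convenient, another option is to draw into a file device with an explicit size. The sketch below is only an illustration: the file name is arbitrary, and plot(1:10) is a stand-in for the real plotMF() call, which is not reproduced here.

```r
# Draw into a PDF of known size instead of the RStudio plots panel
out <- file.path(tempdir(), "plot-example.pdf")
pdf(out, width = 10, height = 8)     # size in inches
plot(1:10, main = "stand-in plot")   # replace with the actual plotting call
dev.off()
out                                  # path of the file that was written
```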


Víctor Granda

On Thu, 19 Nov 2015 at 0:09, Bernardo Mendoza wrote:

> Hi all.
> I am trying the examples from the package in RStudio, and every time I try
> to use the plotMF function I get the error
>
> Error in plot.new() : figure margins too large
> Could this be a problem with the RStudio configuration?
>
> Thanks in advance
>
> Bernardo Mendoza
>
> [[alternative HTML version deleted]]
>
> ___
> R-help-es mailing list
> R-help-es@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-help-es
>



Re: [R] [FORGED] How to find the likelihood, MLE and plot it?

2015-11-18 Thread Rolf Turner

On 19/11/15 11:31, C W wrote:

> Dear R list,
>
> I am trying to find the MLE of the likelihood function.  I will plot the
> log-likelihood to check my answer.
>
> Here's my R code:
>
> xvec <- c(2,5,3,7,-3,-2,0)
> fn <- function(theta){
>   sum(0.5 * (xvec - rep(theta, 7)) ^ 2 / 1 + 0.5 * log(1))
> }
> gn <- Vectorize(fn)
> curve(gn, -5, 20)
> optimize(gn, c(-5, 20))
> $minimum
> [1] 1.714286
> $objective
> [1] 39.71429
>
> The MLE using optimize() is 1.71, but what curve() gives me is the absolute
> minimum.
>
> I think 1.71 is the right answer, but why does the graph show it as the
> minimum?  What is going on here?


Your graph shows that there is indeed a *minimum* at 1.71.  And 
optimise() is correctly finding that minimum.


If you want optimise() to find the maximum, set maximum=TRUE.  In which 
case it will return "20" (or something very close to 20).


Your function fn() appears not to be the log likelihood that you had in
mind.  Perhaps you want the negative of fn()???
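
To make the point concrete: fn() as written is the negative log likelihood of a N(theta, 1) model, up to an additive constant, so minimising it and maximising the log likelihood lead to the same theta. A small sketch, assuming that normal model with unit variance is what was intended (the names loglik, ll and fit are illustrative):

```r
xvec <- c(2, 5, 3, 7, -3, -2, 0)

# Log likelihood of N(theta, 1), constant included
loglik <- function(theta) sum(dnorm(xvec, mean = theta, sd = 1, log = TRUE))
ll <- Vectorize(loglik)

# Maximise instead of minimise
fit <- optimize(ll, c(-5, 20), maximum = TRUE)
fit$maximum    # should be very close to mean(xvec)
mean(xvec)

# The curve now has its peak at the MLE
curve(ll, -5, 20)
abline(v = fit$maximum, lty = 2)
```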


cheers,

Rolf Turner

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R] Get means of matrix

2015-11-18 Thread Dennis Murphy
Hi:

Here's another way to look at the problem. Instead of manually adding
a new column after k datasets have been read in, read your individual
data files into a list, as long as they all have the same variable
names and the same class (in this case, data.frame). Then create a
vector of names for the list components and use 'apply family' logic
to get the column means, returning the combined results to a data
frame or matrix. Here's a toy example to illustrate the point.
Firstly, three data frames are created and saved to external files:

# Create some artificial data and ship to external files

d1 <- data.frame(x1 = rpois(10, 20), x2 = rpois(10, 23), x3 = rpois(10, 25))
d2 <- data.frame(x1 = rpois(10, 20), x2 = rpois(10, 23), x3 = rpois(10, 25))
d3 <- data.frame(x1 = rpois(10, 20), x2 = rpois(10, 23), x3 = rpois(10, 25))

# row.names = FALSE so that reading the files back does not add an index column
write.csv(d1, file = "d1.csv", row.names = FALSE, quote = FALSE)
write.csv(d2, file = "d2.csv", row.names = FALSE, quote = FALSE)
write.csv(d3, file = "d3.csv", row.names = FALSE, quote = FALSE)

###
# Now, read them back in and store them in a list object

# Vector of file names to process
files <- paste0("d", 1:3, ".csv")

# Create the list of data frames and assign names to list components
L <- lapply(files, function(x) read.csv(x, header = TRUE))
names(L) <- paste0("d", 1:3)

# Compute column means from each list component and row bind them
# Method 1: base R
do.call(rbind, lapply(L, colMeans))


# Method 2: plyr package
library(plyr)
ldply(L, colMeans)
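
The same list-based approach extends to the region means asked about in the original post. The sketch below is only one possible way to do it: block_means is a made-up helper name, the 20 x 20 block size comes from the question, and the two random matrices stand in for the real CSV contents.

```r
# Hypothetical helper (not from the thread): mean of every block x block tile
block_means <- function(m, block = 20) {
  m  <- as.matrix(m)
  rg <- (seq_len(nrow(m)) - 1) %/% block + 1   # row-block index of each row
  cg <- (seq_len(ncol(m)) - 1) %/% block + 1   # column-block index of each column
  # row(m) and col(m) expand the block indices to one entry per cell
  tapply(m, list(rg[row(m)], cg[col(m)]), mean, na.rm = TRUE)
}

# Toy stand-ins for two CSV files, 40 x 40 each
set.seed(42)
L <- list(d1 = matrix(rpois(1600, 20), 40, 40),
          d2 = matrix(rpois(1600, 23), 40, 40))

# One matrix of region means per file
region_means <- lapply(L, block_means, block = 20)
region_means$d1   # entry [1, 1] is the mean of rows 1:20, columns 1:20
```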


Dennis

On Wed, Nov 18, 2015 at 2:19 AM, Jesús Para Fernández
 wrote:
> Hi everyone
>
> I have a data frame "data" which is the result of joining multiple CSV files (400 rows
> and 600 cols in every CSV). The "data" data frame has n rows and m columns (20
> rows and 600 cols), and I have added a new column, "csvdata", in which I
> specify the number of the CSV to which each row belongs.
>
> So, the dataframe "data" looks like:
>
> x1   x2   x3   xn   csvdata
> 21   23   32   12   1
> 27   21   39   14   1
> 24   22   30   11   1
> ..
> 21   24   32   19   2
> 27   21   39   14   2
> ..
> 27   22   30   11   n
>
>
>
> I want to store in a matrix the mean values of different subsets of the data from
> every CSV, for example:
>
> region 1,1 (rows 1:20, columns 1:20) for every "csvdata" value
> region 2,1 (rows 21:40, columns 1:20) for every "csvdata" value
>
>
> And so on for the whole data frame.
>
> I have tried:
>
> area1<-tapply(as.matrix(data[1:20,1]),datos$csvdata,mean,na.rm=T)
> area2<-tapply(as.matrix(data[1:20,1]),datos$csvdata,mean,na.rm=T)
>
> But this is the error I get:
>
> Error in tapply(data[1:30, ], datos$nueva, mean, na.rm = T) :
>   arguments must have same length
>
> I'm sure that it is not very complex to do, but I have no idea how to
> do it.
>
> Thanks for all.
>
>

[R-es] Help with the plotMF function in the FRBS package

2015-11-18 Thread Bernardo Mendoza
Hi all.
I am trying the examples from the package in RStudio, and every time I try
to use the plotMF function I get the error

Error in plot.new() : figure margins too large
Could this be a problem with the RStudio configuration?

Thanks in advance

Bernardo Mendoza



Re: [R] is.na behavior

2015-11-18 Thread Richard M. Heiberger
It is in context of determining if an input argument for a graph title
is missing or null or na.  In any of those cases the function defines
a main title.
If the incoming title is not one of those, then I use the incoming title.
When the incoming title is an expression I see the warning.

library(lattice)

simple <- function(x, y, main) {
  if (missing(main) || is.null(main) || is.na(main))
 main <-"abcd"
  xyplot(y ~ x, main=main)
}

simple(1, 2)
simple(1, 2, main=expression("defg"))

## In the real case the constructed title is not a simple character
## string, but the result of function call with several incoming
## arguments and several computed arguments.  It is of a complexity
## that making it the default in the calling sequence would
## unnecessarily complicate the calling sequence.
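
One way to avoid the warning in this situation, sketched here under the assumption that only a single atomic NA should count as "no title", is to guard the is.na() call so it is never applied to an expression. The names title_unset and simple_title are made up for this illustration, and the lattice call is left out to keep the example self-contained.

```r
# Hypothetical helper: TRUE for NULL or a single atomic NA, FALSE otherwise
title_unset <- function(main) {
  is.null(main) ||
    (is.atomic(main) && length(main) == 1L && is.na(main))
}

# Returns the title that would be passed on to the plotting function
simple_title <- function(main) {
  if (missing(main) || title_unset(main)) "abcd" else main
}

simple_title()                      # default title
simple_title(NA)                    # default title
simple_title(expression("defg"))    # the expression, with no warning
```

Because is.atomic() is FALSE for expressions and calls, is.na() is only reached for ordinary vectors.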

On Wed, Nov 18, 2015 at 10:04 PM, William Dunlap  wrote:
> You can convert the expression to a list and use is.na on that:
>    > e <- expression(1+NA, NA, 7, function(x)x+1)
>    > is.na(as.list(e))
>    [1] FALSE  TRUE FALSE FALSE
> and you can do the same for a call object
>    > is.na(as.list(quote(func(arg1, tag2=NA, tag3=log(NA)))))
>                 tag2  tag3
>    FALSE FALSE  TRUE FALSE
>
> However, what is your motivation for wanting to apply is.na to an expression?
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Wed, Nov 18, 2015 at 5:54 PM, Richard M. Heiberger  wrote:
>> What is the rationale for the following warning in R-3.2.2?
>>
>>> is.na(expression(abcd))
>> [1] FALSE
>> Warning message:
>> In is.na(expression(abcd)) :
>>   is.na() applied to non-(list or vector) of type 'expression'
>>


Re: [R] is.na behavior

2015-11-18 Thread Richard M. Heiberger
Maybe the way to rephrase my question is to ask why there is not
an is.na.expression method that does that task for me?

> is.na(as.list(expression("defg")))
[1] FALSE
> is.na(expression("defg"))
[1] FALSE
Warning message:
In is.na(expression("defg")) :
  is.na() applied to non-(list or vector) of type 'expression'
>

On Wed, Nov 18, 2015 at 10:22 PM, Richard M. Heiberger  wrote:
> It is in context of determining if an input argument for a graph title
> is missing or null or na.  In any of those cases the function defines
> a main title.
> If the incoming title is not one of those, then I use the incoming title.
> When the incoming title is an expression I see the warning.
>
> library(lattice)
>
> simple <- function(x, y, main) {
>   if (missing(main) || is.null(main) || is.na(main))
>  main <-"abcd"
>   xyplot(y ~ x, main=main)
> }
>
> simple(1, 2)
> simple(1, 2, main=expression("defg"))
>
> ## In the real case the constructed title is not a simple character
> ## string, but the result of function call with several incoming
> ## arguments and several computed arguments.  It is of a complexity
> ## that making it the default in the calling sequence would
> ## unnecessarily complicate the calling sequence.
>
> On Wed, Nov 18, 2015 at 10:04 PM, William Dunlap  wrote:
>> You can convert the expression to a list and use is.na on that:
>>    > e <- expression(1+NA, NA, 7, function(x)x+1)
>>    > is.na(as.list(e))
>>    [1] FALSE  TRUE FALSE FALSE
>> and you can do the same for a call object
>>    > is.na(as.list(quote(func(arg1, tag2=NA, tag3=log(NA)))))
>>                 tag2  tag3
>>    FALSE FALSE  TRUE FALSE
>>
>> However, what is your motivation for wanting to apply is.na to an expression?
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>>
>> On Wed, Nov 18, 2015 at 5:54 PM, Richard M. Heiberger  
>> wrote:
>>> What is the rationale for the following warning in R-3.2.2?
>>>
>>>> is.na(expression(abcd))
>>> [1] FALSE
>>> Warning message:
>>> In is.na(expression(abcd)) :
>>>   is.na() applied to non-(list or vector) of type 'expression'
>>>