[R] Simulation of strongly balanced panel data
Dear all,

I have a strongly balanced panel dataset of 46 entities x 11 years. The observed variables (ov) are not normally distributed, and I do not know their distribution. How should I simulate the ov? Can somebody please help?

--
Deva

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] A Simple Question
Is m[[1]] what you need?

On 14 Jul 2015, at 07:40, Alex Kim dumboisveryd...@gmail.com wrote:

> Hello,
>
> I am trying to create a matrix that looks like this, using the stri_locate_all function.
>
>     x <- "ABCDJAKSLABCDAKJSABCD"
>     m <- stri_locate_all_regex(x, 'ABCD')
>     m
>     [[1]]
>          start end
>     [1,]     1   4
>     [2,]    10  13
>     [3,]    18  21
>
> I tried converting m into a matrix, however it always seems to wrap around the wrong way:
>
>     output <- matrix(unlist(m), ncol = 2, byrow = TRUE)
>     output
>          [,1] [,2]
>     [1,]    1   10
>     [2,]   18    4
>     [3,]   13   21
>
> I want to output the start locations in the first column and the end locations in the second column into a matrix to look like this.
>
>          [,1] [,2]
>     [1,]    1    4
>     [2,]   10   13
>     [3,]   18   21
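Archive note: a minimal sketch of why the conversion in the quoted post "wraps around the wrong way". To stay runnable without stringi, it builds a stand-in for m by hand from the values in the post; the names wrong, right and direct are illustrative and not from the thread.

```r
# Stand-in for the result of stri_locate_all_regex(x, 'ABCD'):
# a list holding one 3 x 2 integer matrix (values copied from the post).
m <- list(matrix(c(1L, 10L, 18L, 4L, 13L, 21L), ncol = 2,
                 dimnames = list(NULL, c("start", "end"))))

# unlist() flattens the matrix column by column: 1 10 18 4 13 21
flat <- unlist(m)

# byrow = TRUE refills row by row, which scrambles starts and ends
wrong <- matrix(flat, ncol = 2, byrow = TRUE)

# The default column-wise fill restores the original layout
right <- matrix(flat, ncol = 2)

# Simplest of all: take the matrix out of the list, as suggested above
direct <- m[[1]]
```

Either right or direct has the starts in column 1 and the ends in column 2; direct also keeps the column names.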
Re: [R] remove 0 and NA values
On 7/13/2015 6:02 PM, Lida Zeighami wrote:
> Hi Dan,
>
> Thanks for the reply. Sorry, the format of the matrix was ruined! Yes, this matrix is 6x6, but my original matrix is much bigger than this. No, I don't think your code does that for me. I want to remove the columns whose values sum to zero.
>
> On Jul 13, 2015 5:31 PM, Daniel Nordlund djnordl...@frontier.com wrote:
>> On 7/13/2015 3:01 PM, Lida Zeighami wrote:
>>> Hi there,
>>>
>>> I have a matrix whose elements are 0, 1, 2 and NA. I want to remove the columns whose column sums are equal to 0 or NA, drop these columns from the original matrix, and create a new matrix with the nonzero and NA values. (I think I have to use na.rm=TRUE and remove the columns with colsum=0, because with na.rm=FALSE all of my column sums are NA.)
>>>
>>> This is my matrix format:
>>>
>>>     mat[1:5,1:5]
>>>            1:110590170 1:110888172 1:110906406 1:110993854 1:110996710 1:44756
>>>     A05363           0           0           0           0          NA       0
>>>     A05370           0           0           0           0           0      NA
>>>     A05380           1          NA           2           0          NA       0
>>>     A05397           0           0           0           1           0       2
>>>     A05400           2           0           0           0           0       0
>>>     A05426           0           0          NA           0           0       0
>>>
>>>     summat <- colSums(mat,na.rm = TRUE)
>>>     head(summat)
>>>                 [,1]
>>>     1:110590170    3
>>>     1:110888172    0
>>>     1:110906406    2
>>>     1:110993854    1
>>>     1:110996710    0
>>>     1:44756        2
>>>
>>> The 2nd and 5th columns have colsum=0, so I should remove them from mat and keep the rest of the columns in another matrix. My output should look like this:
>>>
>>>     metnonzero
>>>            1:110590170 1:110906406 1:110993854 1:44756
>>>     A05363           0           0           0       0
>>>     A05370           0           0           0      NA
>>>     A05380           1           2           0       0
>>>     A05397           0           0           1       2
>>>     A05400           2           0           0       0
>>>     A05426           0          NA           0       0
>>>
>>> Would you please let me know how I can do that?
>>>
>>> Many thanks,
>>> Lida
>>
>> First, your matrix appears to be 6x6. That being said, does this get you what you want?
>>
>>     mat[, -which(summat[,1] ]
>>
>> Dan
>>
>> --
>> Daniel Nordlund
>> Bothell, WA USA

Lida,

I seem to have cut-and-pasted something very badly, and for that I apologize. Here is a revised version:

    mat <- structure(c(0L, 0L, 1L, 0L, 2L, 0L, 0L, 0L, NA, 0L, 0L, 0L,
        0L, 0L, 2L, 0L, 0L, NA, 0L, 0L, 0L, 1L, 0L, 0L, NA, 0L, NA, 0L,
        0L, 0L, 0L, NA, 0L, 2L, 0L, 0L), .Dim = c(6L, 6L),
        .Dimnames = list(
            c("A05363", "A05370", "A05380", "A05397", "A05400", "A05426"),
            c("X1.110590170", "X1.110888172", "X1.110906406",
              "X1.110993854", "X1.110996710", "X1.44756")))

    summat <- colSums(mat,na.rm = TRUE)
    mat[,-which(summat==0)]
           X1.110590170 X1.110906406 X1.110993854 X1.44756
    A05363            0            0            0        0
    A05370            0            0            0       NA
    A05380            1            2            0        0
    A05397            0            0            1        2
    A05400            2            0            0        0
    A05426            0           NA            0        0

Hope this is more helpful,

Dan

--
Daniel Nordlund
Bothell, WA USA
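Archive note: the -which() idiom in the revised version works whenever at least one column sums to zero. A hedged variant using logical indexing is sketched below on a small made-up matrix (the values and the names keep, nonzero and all_kept are illustrative, not from the thread); it also behaves sensibly when no column needs dropping.

```r
# Small stand-in matrix with the same 0/1/2/NA coding as in the thread
mat <- matrix(c(0L, 1L, NA,    # column a: sum 1
                0L, 0L, NA,    # column b: sum 0
                2L, 0L, 0L),   # column c: sum 2
              nrow = 3,
              dimnames = list(c("r1", "r2", "r3"), c("a", "b", "c")))

summat <- colSums(mat, na.rm = TRUE)

# Logical index: TRUE for columns to keep; drop = FALSE keeps a matrix
# even if only one column survives
keep <- summat != 0
nonzero <- mat[, keep, drop = FALSE]

# Edge case of the -which() idiom: nonzero has no zero-sum columns, so
# which() returns integer(0) and negative indexing selects no columns at all
all_kept <- nonzero[, -which(colSums(nonzero, na.rm = TRUE) == 0), drop = FALSE]
ncol(all_kept)
```

With logical indexing the same call, nonzero[, colSums(nonzero, na.rm = TRUE) != 0, drop = FALSE], would simply return nonzero unchanged.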
[R] A simple question
Hello,

I am trying to create a matrix that looks like this, using the stri_locate_all function.

    x <- "ABCDJAKSLABCDAKJSABCD"
    m <- stri_locate_all_regex(x, 'ABCD')
    m
    [[1]]
         start end
    [1,]     1   4
    [2,]    10  13
    [3,]    18  21

I tried converting m into a matrix, however it always seems to wrap around the wrong way:

    output <- matrix(unlist(m), ncol = 2, byrow = TRUE)
    output
         [,1] [,2]
    [1,]    1   10
    [2,]   18    4
    [3,]   13   21

I want to output the start locations in the first column and the end locations in the second column into a matrix to look like this.

         [,1] [,2]
    [1,]    1    4
    [2,]   10   13
    [3,]   18   21

Thank you for your help,
Alex
Re: [R] : Ramanujan and the accuracy of floating point computations - using Rmpfr in R
P == ProfJCNash profjcn...@gmail.com on Sat, 4 Jul 2015 21:42:27 -0400 writes:

P>     n163 <- mpfr(163, 500)
P> is how I set up the number.

Yes, and you have needed to specify the desired precision.

As author and maintainer of Rmpfr, let me give my summary of this overly long thread (with many wrong statements before finally RK gave the obvious answer):

1) Rmpfr is about high (or low, for didactical reasons, see Rich Heiberger's talk at 'useR! 2015') precision __numerical__ computations. This is what the GNU MPFR library is about, and Rmpfr wants to be a smart and R-like {e.g., automatic coercions wherever they make sense} R interface to MPFR.

2) If you use Rmpfr, you as user should decide about the desired accuracy of your inputs. I think it would be a very bad idea to redefine 'pi' (in Rmpfr) to be Const("pi", 120). R-like also means that in a computation a * b the properties of a and b determine the result, and hence if a <- pi then we know that pi is a double precision number with 53-bit mantissa.

P> On 15-07-04 05:10 PM, Ravi Varadhan wrote:
P>> What about numeric constants, like `163'?

Well, if you _combine_ them with an mpfr() number, they are used in their precision. An integer like 163 is exact of course; but pi (another numeric constant) is 53-bit as mentioned above, *and* sqrt(163) is (R-like) a double precision number computed by R's internal double precision arithmetic.

My summary:

1. Rmpfr behaves entirely correctly and as documented (in all the examples given here).

2. The idea of substituting expressions containing pi with something like Const("pi", 120) is not a good one. (*)

*) One could think of adding a new class, say "symbolicNumber" or rather "numericExpression", which in arithmetic would try to behave correctly, namely find out what precision all the other components in the arithmetic have and make sure all parts of the 'numericExpression' are coerced to 'mpfr' with the correct precision, and only then start evaluating the arithmetic. But that *does* look like an implementation of Maple/Mathematica/... in R and that seems silly; rather R should interface to Free (as in speech) aka open source symbolic math packages such as Pari, Macsyma, ...

Martin Maechler
ETH Zurich
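Archive note: a minimal sketch of the point about input precision, assuming the thread's running example exp(pi * sqrt(163)). The base R part runs anywhere; the Rmpfr part runs only if the package is installed. The variable names (x, prec, pi500, hi, lo) are illustrative, and no output is shown because it was not re-run for this note.

```r
# pi and sqrt(163) are ordinary doubles with a 53-bit mantissa
.Machine$double.digits

# Double precision version of the thread's example
x <- exp(pi * sqrt(163))

# Above 2^53 every double is a whole number, so the fractional part of the
# true value cannot be represented at all in double precision
x > 2^53
x == floor(x)

if (requireNamespace("Rmpfr", quietly = TRUE)) {
  library(Rmpfr)
  prec  <- 500                   # bits; chosen by the user, as the post stresses
  n163  <- mpfr(163, prec)       # 163 is exact, stored with 500 bits
  pi500 <- Const("pi", prec)     # pi computed to 500 bits

  # Every operand carries 500 bits
  hi <- exp(pi500 * sqrt(n163))

  # The double 'pi' is promoted to 500 bits, but only its 53 bits are accurate
  lo <- exp(mpfr(pi, prec) * sqrt(n163))

  print(hi, digits = 40)
  print(lo, digits = 40)
}
```

Comparing the two printed values shows where the digits start to differ, which is the consequence of a * b taking its accuracy from the properties of a and b.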
Re: [R-es] Keeping the variable name across several functions
Hello:

Thanks for taking an interest in the problem! Let me outline what I want to do. Most of my work consists of analysing data from other people's studies. Until now I did this with SAS, where I have a macro to which I pass the file name and the variables; depending on whether each variable is qualitative or quantitative (I do not distinguish variable types any further), the macro produces:

- Descriptive analysis:
  - Frequency table for qualitative variables.
  - Quantitative variables: basic statistics, a listing of the 5 extreme cases (maximum and minimum) and a box plot.
- Univariate analysis with a qualitative dependent variable:
  - Qualitative independent variables:
    - Contingency table with chi-squared and Fisher tests
    - Odds ratios
  - Quantitative independent variables:
    - Basic statistics by value of the dependent variable.
    - Box plot for each category of the dependent variable.
    - Kruskal-Wallis test
    - Odds ratio
- Univariate analysis with a quantitative dependent variable:
  - Qualitative independent variables:
    - Statistics and box plot of the dependent variable for each category of the independent variable.
    - Kruskal-Wallis test.
  - Quantitative independent variables:
    - Simple linear regression with scatter plots.

It does a few more things automatically, but I don't want to bore you. I am sending two PDFs to your personal email with two examples: a descriptive analysis and a univariate analysis with a qualitative dependent variable.

In R, what I am doing is passing a function the name of the data table, the variables that should not be analysed, and the name of the dependent variable if a univariate analysis is wanted. The function DESUNI calls the functions DES (descriptive analysis) and UNI (univariate analysis). These loop over the variable names of the data table and, depending on the type of the independent variable (numeric or factor), run the corresponding function. I attach the functions DESUNI and DES at the end as an example.

This generates a basic descriptive analysis and a basic univariate analysis, which lets me get to know the variables in the file and start discussing with the researcher which analyses to run. The time saved is considerable.

I hope I have made myself clear, and apologies for such a long reply.

Thanks for the help, and regards.

=====
    DESUNI = function(XDADES, XDROP=NULL, XVD=NULL,
                      XSPV=NULL  # Whether this is an SPV analysis
                                 # May take the value TRUE
                      )
    {
      # Path to the functions
      XCAMIF = "~/sys/utils/apps/r/r_funcions/"
      options(digits = 3, OutDec=",", scipen=999)
      ## No dependent variable (VD): descriptive analysis
      if(is.null(XVD))   # No dependent variable: descriptive analysis
      {
        cat("\n*** Descriptiva (no existeix variable dependent)\n")
        source(paste(XCAMIF, "des.r", sep=""))
        DES(XDADES=XDADES, XDROP=XDROP, XCAMIF=XCAMIF)
      }
      ## Dependent variable exists: univariate analysis
      else               # Dependent variable exists: univariate analysis
      {
        source(paste(XCAMIF, "uni.r", sep=""))
        UNI(XDADES=XDADES, XDROP=XDROP, XVD=XVD, XSPV=XSPV, XCAMIF=XCAMIF)
      }
    }
=====
Function DES:

    DES = function(XDADES, XDROP=NULL, XCAMIF)
    {
      ifelse(is.null(XDROP),
             DADES_S <- XDADES,
             DADES_S <- XDADES[, setdiff(names(XDADES), XDROP) ])
      attach(DADES_S, warn.conflicts = F)
      XVARLLI=names(DADES_S)
      for (XVARNOM in names(DADES_S))
      {
        if(is.numeric(get(XVARNOM)))
        {
          source(paste(XCAMIF, "des_quanti.r", sep=""))
          DES_QUANTI (XVARNOM)
        }
        else if(is.factor(get(XVARNOM)))
        {
          source(paste(XCAMIF, "des_quali.r", sep=""))
          DES_QUALI (XVARNOM)
        }
        else
        {
          cat("La variable ", XVARNOM, " no és de cap dels tipus coneguts", "\n")
        }
      }
      detach(DADES_S)
    }
=====

On Mon, 13 Jul 2015 20:56:59 +0200 Carlos Ortega c...@qualityexcellence.es wrote:

> Hello,
>
> What kind of analysis do you want to do? Out of the box, R already comes with many functions for computing different descriptive statistics for all kinds of variables.
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> On 13 July 2015 at 15:33, Griera gri...@yandex.com wrote:
>> Hello:
>>
>> With this R business I get the impression that I take one step forward and two steps back! The thing is, I have a cascade of functions to run an automatic descriptive analysis depending on the variable type. And in the results, instead of the variable name, the argument name appears. This happens whether or not I use the get() function. A short reproducible example:
>>
>> =====
>>     # With the get() function
>>     A <- function (XVD, XVI, XDATOS)
>>     {
>>       attach(XDATOS)
>>       B(XVD, XVI)
>>       detach(XDATOS)
>>     }
>>     B <- function (XVD, XVI)
>>     {
>>       TBL
Re: [R] overlap between line segments
Hi Karla,

This might help. I haven't tested it exhaustively.

    transect_overlap<-function(x) {
     if(!is.matrix(x)) stop("x must be a 2x2 matrix")
     if(x[1,1] <= x[2,1]) {
      if(x[2,2] > x[1,2]) overlap<-x[1,2]-x[2,1]
      else overlap<-x[2,2]-x[2,1]
     }
     else {
      if(x[1,2] > x[2,2]) overlap<-x[2,2]-x[1,1]
      else overlap<-x[1,2]-x[1,1]
     }
     if(overlap < 0) overlap<-0
     return(overlap)
    }

Jim

On Tue, Jul 14, 2015 at 7:44 AM, Karla Shikev karlashi...@gmail.com wrote:

> Hi there,
>
> This is a newbie question, and I'm sure there are simple ways to do this, but I've spent my entire afternoon and I couldn't get it to work. Imagine that I got my samples distributed along a transect and my data refer to the first and last occurrences of each sample. For instance:
>
>     dat<-matrix(c(1,3,2.5,4), ncol=2, byrow=TRUE)
>     dat
>          [,1] [,2]
>     [1,]  1.0    3
>     [2,]  2.5    4
>
> The first line indicates that the first and last occurrences of this subject were 1 and 3, respectively, whereas the second subject was found between 2.5 and 4. I need a simple way to calculate the overlap of their extents (0.5 in this case). This way should provide 0 if there is no overlap, and it should also work in the case where one subject is found only within the extent of the second subject.
>
> Any help will be greatly appreciated.
>
> Karla
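Archive note: the four branches above can be collapsed into one expression, since the overlap of two intervals is the smaller end minus the larger start, floored at zero. The sketch below is an alternative, not Jim's code; the name interval_overlap is made up here, and it assumes each row holds start <= end.

```r
# Overlap of the intervals stored in the rows of a 2-column matrix:
# (earliest end) - (latest start), or 0 when that difference is negative.
interval_overlap <- function(x) {
  stopifnot(is.matrix(x), ncol(x) == 2)
  max(0, min(x[, 2]) - max(x[, 1]))
}

dat <- matrix(c(1, 3, 2.5, 4), ncol = 2, byrow = TRUE)
partial  <- interval_overlap(dat)                                          # partial overlap
nested   <- interval_overlap(matrix(c(1, 5, 2, 3), ncol = 2, byrow = TRUE))  # one inside the other
disjoint <- interval_overlap(matrix(c(1, 2, 3, 4), ncol = 2, byrow = TRUE))  # no overlap
```

For Karla's dat this gives 0.5; the nested case gives the length of the inner interval (1), and the disjoint case gives 0.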
Re: [R] Function returning multiple objects but printing only one
On 13/07/2015 5:06 PM, Daniel Caro wrote:
> Hello,
>
> Sorry if this has already been addressed before but I could not find any helpful references. I would like to create a function that outputs a single element of a list but stores all elements, similar to 'lm' and many other functions. There are several answers on how to return multiple objects with lists, for example:
>
> http://r.789695.n4.nabble.com/How-to-return-multiple-values-in-a-function-td858528.html
> http://stackoverflow.com/questions/8936099/returning-multiple-objects-in-an-r-function
>
> But the examples show how to print multiple outputs, such as
>
>     functionReturningTwoValues <- function() {return(list(first=1, second=2))}
>     functionReturningTwoValues()
>
> And I only want the function to print a single element from the list but still store the other elements such that they can be retrieved with functionReturningTwoValues$first, for example. My function produces bootstrap coefficients so clearly I don't want to print the bootstrap output but I do want users to be able to access it.

You need to give your object a class, and define a print method for that class. It's pretty simple:

    functionReturningTwoValues <- function() {
      return(structure(list(first=1, second=2), class="MyClass"))
    }

    print.MyClass <- function(x, ...) {
      print(x$first, ...)
    }

This is using S3 classes. There are other systems (S4, etc.) that let you do this, but none are simpler.

Duncan Murdoch
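Archive note: a minimal sketch of the S3 approach applied to the bootstrap use case Daniel describes. The function boot_fit, its arguments and the class name "boot_fit" are hypothetical names chosen for this illustration; the print method additionally returns its argument invisibly, which is the usual convention for print methods.

```r
# Constructor: compute an estimate plus bootstrap replicates, and tag the
# result with a class so that printing can be customised.
boot_fit <- function(x, n_boot = 200) {
  est  <- mean(x)
  boot <- replicate(n_boot, mean(sample(x, replace = TRUE)))
  structure(list(estimate = est, boot = boot), class = "boot_fit")
}

# Print method: show only the estimate; the replicates stay in the object
print.boot_fit <- function(x, ...) {
  print(x$estimate, ...)
  invisible(x)
}

set.seed(1)
fit <- boot_fit(c(2, 4, 6, 8))
fit                 # auto-printing dispatches to print.boot_fit
length(fit$boot)    # the bootstrap replicates remain accessible
```

Typing fit at the prompt shows only the estimate, while fit$boot still returns all replicates.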
Re: [R] A simple question
R-help readers,

For your information ...

The package stringi is required to run Alex's code.

Alex's message was cross posted to StackOverflow, and seems to have been answered there,
http://stackoverflow.com/questions/31398466/r-stri-locate-all-creating-a-start-and-end-matrix

Jean

On Tue, Jul 14, 2015 at 12:33 AM, Alex Kim via R-help r-help@r-project.org wrote:

> Hello,
>
> I am trying to create a matrix that looks like this, using the stri_locate_all function.
> [...]
Re: [R] Function returning multiple objects but printing only one
Daniel,

I'm not sure if this is what you're after, but you could include a print() call in your function. For example:

    myfun <- function(x) {
      m1 <- min(x)
      m2 <- mean(x)
      m3 <- max(x)
      out <- list(m1, m2, m3)
      print(out[[2]])
      return(out)
    }

    result <- myfun(1:10)

Jean

On Mon, Jul 13, 2015 at 4:06 PM, Daniel Caro dca...@gmail.com wrote:

> Hello,
>
> Sorry if this has already been addressed before but I could not find any helpful references. I would like to create a function that outputs a single element of a list but stores all elements, similar to 'lm' and many other functions.
> [...]
>
> Many thanks,
> Daniel
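Archive note: with the function above, calling myfun(1:10) without assigning the result prints the mean and then auto-prints the whole list. A hedged variant, under the hypothetical name myfun2, returns the list invisibly so that only the chosen element is shown in both cases.

```r
myfun2 <- function(x) {
  out <- list(min = min(x), mean = mean(x), max = max(x))
  print(out$mean)   # show only the mean
  invisible(out)    # return everything, but suppress auto-printing
}

myfun2(1:10)             # prints the mean only
result <- myfun2(1:10)   # prints the mean again; the full list is stored
result$max
```

The trade-off compared with a print method is that the printing happens on every call, including inside other functions or loops.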
[R] Problem accessing xlsx on CRAN mirror
I am very new to the use of R and am trying to install it in order to use a package called mh1823 for determining Probability of Detection stats for Non-Destructive Evaluation. I have hit an immediate problem installing the xlsx package and get the following messages:

    Warning: unable to access index for repository http://star-www.st-andrews.ac.uk/cran/src/contrib
    Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/src/contrib
    Error in install.packages(NULL, .libPaths()[1L], dependencies = NA, type = type) :
      no packages were specified
    In addition: Warning message:
    In open.connection(con, "r") : cannot open: HTTP status was '0 (nil)'

I do not seem to be able to get past this issue, though I am able to load the mh1823 POD package successfully from a local zip file.

Regards,
Tom

Tom Knox
NDE Subject Matter Expert
Upstream Engineering Centre
BP Sunbury-on-Thames
Mobile: +44 (0)7796 182926
Fax: +44 (0)1932 763439
E-mail: tom.k...@bp.com
Postal Address: Building H, BP Exploration, Chertsey Road, Sunbury-on-Thames, UK, TW16 7LN
Re: [R] overlap between line segments
Fantastic. Thanks!

On Tue, Jul 14, 2015 at 7:03 AM, Jim Lemon drjimle...@gmail.com wrote:

> Hi Karla,
>
> This might help. I haven't tested it exhaustively.
> [...]
Re: [R] Problem accessing xlsx on CRAN mirror
On 14/07/2015 6:09 AM, Knox, Tom wrote:
> I am very new to the use of R and am trying to install it in order to use a package called mh1823 for determining Probability of Detection stats for Non-Destructive Evaluation. I have hit an immediate problem installing the xlsx package and get the following messages:
>
>     Warning: unable to access index for repository http://star-www.st-andrews.ac.uk/cran/src/contrib
>     Warning: unable to access index for repository http://www.stats.ox.ac.uk/pub/RWin/src/contrib
>     Error in install.packages(NULL, .libPaths()[1L], dependencies = NA, type = type) :
>       no packages were specified
>     In addition: Warning message:
>     In open.connection(con, "r") : cannot open: HTTP status was '0 (nil)'
>
> I do not seem to be able to get past this issue, though I am able to load the mh1823 POD package successfully from a local zip file.

It looks as though R couldn't make any connection. Perhaps you are using a proxy? It looks as though you are on Windows; if so, you could try running

    setInternet2(TRUE)

before the install; that will use the proxy settings for Internet Explorer.

Duncan Murdoch

> Regards,
> Tom
> [...]
[R] Plot in Rcmdr
Hello,

I wondered if anyone could help me with a small issue in Rcmdr. I have used the 'Graphs' function in the drop-down menu to create a scatterplot for groups (gender). But when I do this the legend (telling me the symbols which represent male etc.) keeps obscuring the title of the plot. Does anyone know how to fix this problem within Rcmdr?

Please note I am not looking for help with creating the graph in another way (for example in R). I am specifically trying to figure out if this can be fixed in Rcmdr. If the answer is "No, this cannot currently be changed within Rcmdr", I would still like to hear from you.

Many thanks for any help.

Joanne Ingram
Research Associate (Medical Statistics)
Centre for Population Health Science
University of Edinburgh
[R] advice on making HTML tables
This might be a little off topic, but I am starting to produce some HTML reports that contain mostly tables and they look great in Chrome but really bad in IE, so I wanted to see if anyone knows of a better way or an easy fix. One option I have used is to convert to PDF, but sometimes it is nice to have the report in HTML format.

For my reproducible example, copy the text below into R Studio and hit the knit button. If you look at the HTML output in Chrome the columns are nicely spread out, and in IE the columns are jammed right next to each other with minimal/no spacing. Maybe there is a CSS fix?

    ---
    title: "Untitled"
    output: html_document
    ---

    ```{r}
    knitr::kable(cars)
    ```

Here is my session info:

    R version 3.2.1 Patched (2015-07-11 r68646)
    Platform: x86_64-w64-mingw32/x64 (64-bit)
    Running under: Windows 7 x64 (build 7601) Service Pack 1

    locale:
    [1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
    [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
    [5] LC_TIME=English_United States.1252

    attached base packages:
    [1] datasets  tools  utils  stats  graphics  grDevices  methods  base

    other attached packages:
     [1] xtable_1.7-4     sqldf_0.4-10     RSQLite_1.0.0    DBI_0.3.1           gsubfn_0.6-6
     [6] proto_0.3-10     Rcpp_0.11.6      Quandl_2.6.0     testthat_0.10.0     lubridate_1.3.3
    [11] sendmailR_1.2-1  rmarkdown_0.7    devtools_1.8.0   data.table_1.9.4    Rpad_1.3.0
    [16] formatR_1.2      dplyr_0.4.2.9002 plyr_1.8.3       reshape2_1.4.1      ggplot2_1.0.1
    [21] xts_0.9-7        zoo_1.7-12       XLConnect_0.2-11 XLConnectJars_0.2-9 timeDate_3012.100
    [26] R2HTML_2.3.1     RODBC_1.3-12     quadprog_1.5-5   prettyR_2.1-1       MASS_7.3-42
    [31] fortunes_1.5-2   corpcor_1.6.8    manipulate_1.0.1

    loaded via a namespace (and not attached):
     [1] rJava_0.9-6    lattice_0.20-31 tcltk_3.2.1  colorspace_1.2-6 htmltools_0.2.6
     [6] yaml_2.1.13    base64enc_0.1-2 chron_2.3-47 stringr_1.0.0    munsell_0.4.2
    [11] gtable_0.1.2   memoise_0.2.1   evaluate_0.7 knitr_1.10.5     parallel_3.2.1
    [16] curl_0.9.1     highr_0.5       scales_0.2.5 rversions_1.0.2  digest_0.6.8
    [21] stringi_0.5-5  grid_3.2.1      magrittr_1.5 crayon_1.3.1     xml2_0.1.1
    [26] assertthat_0.1 R6_2.1.0        git2r_0.10.1
Re: [R-es] Crear datos aleatorios con restriciones
Genial Carlos! Tu codigo produce lo que quiero! Estoy tratando de entender cada paso y hacer algunos cambios. Mi problema es con como usar `str_plit_fixed`. Con tu codigo tengo eso: separoPairs - as.data.frame(str_split_fixed(AllpairsTmp, , 6)) head(separoPairs) V1 V2 V3 V4 V5 V6 1 e1 g1 c1 e2 g1 c1 2 e1 g1 c1 e3 g1 c1 3 e1 g1 c1 e4 g1 c1 4 e1 g1 c1 e5 g1 c1 5 e1 g1 c1 e6 g1 c1 6 e1 g1 c1 e7 g1 c1 V1 y V4 son el nombre de las escuelas, V2 y V5 del grado y V3 y V6 de la division. Yo hice unos cambios para tener datos un poco mas complejos, pero como resultado inintencional no puedo producir `separoPairs` Esto es lo que mi codigo produce: head(separoPairs) V1 V2 V3V4 V5 V6 1 Aslamy School 3 grade A Maruyama School 3 grade A 2 Aslamy School 3 grade A Smith School 3 grade A 3 Aslamy School 3 grade A Linares School 3 grade A 4 Aslamy School 3 grade A Dieyleh School 3 grade A 5 Aslamy School 3 grade A Hernandez School 3 grade A 6 Aslamy School 3 grade A Padgett School 3 grade A Se puede arreglar? 
Este es mi codigo library(dplyr) library(randomNames) library(geosphere) set.seed(7142015) # Define Parameters n.Schools - 20 first.grade-3 last.grade-5 n.Grades -last.grade-first.grade+1 n.Classrooms - 4 n.Teachers - (n.Schools*n.Grades*n.Classrooms)/2 #Two classrooms per teacher # Define Random names function: gen.names - function(n, which.names = both, name.order = last.first){ names - unique(randomNames(n=n, which.names = which.names, name.order = name.order)) need - n - length(names) while(need0){ names - unique(c(randomNames(n=need, which.names = which.names, name.order = name.order), names)) need - n - length(names) } return(names) } # Generate n.Schools names gen.schools - function(n.schools) { School.ID - paste0(gen.names(n = n.schools, which.names = last), ' School') School.long - rnorm(n = n.schools, mean = 21.7672, sd = 0.025) School.lat - rnorm(n = n.schools, mean = 58.8471, sd = 0.025) School.RE - rnorm(n = n.schools, mean = 0, sd = 1) Schools - data.frame(School.ID, School.lat, School.long, School.RE) %% mutate(School.ID = as.character(School.ID)) %% rowwise() %% mutate (School.distance = distHaversine( p1 = c(School.long, School.lat), p2 = c(21.7672, 58.8471), r = 3961 )) return(Schools) } Schools - gen.schools(n.schools = n.Schools) # Generate Grades Grades - c(first.grade:last.grade) # Generate n.Classrooms Classrooms - LETTERS[1:n.Classrooms] # Group schools and grades SchGr - outer(Schools$School.ID, Grades, 'grade', FUN=paste) # Group SchGr and Classrooms SchGrClss - outer(SchGr, Classrooms, FUN=paste) # These are the combination of School-Grades-Classroom SchGrClssTmp - as.matrix(SchGrClss, ncol=1, nrow=length(SchGrClss) ) SchGrClssEnd - as.data.frame(SchGrClssTmp) # Assign n.Teachers (2 classroom in a given school-grade) Allpairs - as.data.frame(t(combn(SchGrClssTmp, 2))) AllpairsTmp - paste(Allpairs$V1, Allpairs$V2, sep= ) library(stringr) separoPairs - as.data.frame(str_split_fixed(AllpairsTmp, , 6)) head(separoPairs) Muchas gracias! 
I am learning a lot thanks to you!

Ignacio

On Tue, Jul 14, 2015 at 3:31 AM Carlos Ortega c...@qualityexcellence.es wrote:

> OK. Well, for that last part, to get a data.frame with all the information, already filtered and with the teachers' data, you can do this:
>
> #--
> # If t teachers have to be assigned to the validPairs
> t <- 10
> teachers <- data.frame(
>   Name = sample(paste("Prof_", 1:t, sep=""), t)
>   , Speciality = sample(paste("Spec_", 1:t, sep=""), t)
>   , Age = sample(25:60, t)
> )
> placesEnd <- validPairs[sample(1:nrow(validPairs), t), ]
> row.names(placesEnd) <- NULL
> placesEndRed <- placesEnd[, c(1,2,3,6)]
> names(placesEndRed) <- c("School", "Grade", "Class_1", "Class_2")
> endAssig <- cbind.data.frame(placesEndRed, teachers)
> endAssig
> #--
>
> Which produces this kind of result:
>
> endAssig
>    School Grade Class_1 Class_2    Name Speciality Age
> 1     e11    g2      c3     c18  Prof_2     Spec_5  39
> 2     e11    g2      c5     c16  Prof_8     Spec_1  49
> 3     e12    g1      c3     c17  Prof_1    Spec_10  36
> 4      e2    g2     c15     c17 Prof_10     Spec_9  29
> 5      e1    g3      c9     c15  Prof_3     Spec_6  55
> 6      e6    g3      c2     c18  Prof_6     Spec_8  42
> 7     e17    g2      c9     c14  Prof_4     Spec_3  27
> 8     e18    g3      c2     c12  Prof_7     Spec_2  53
> 9     e13    g1     c10     c20  Prof_9     Spec_4  58
> 10    e18    g2      c4     c19  Prof_5     Spec_7  59
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> On 14 July 2015, 1:00, Ignacio Martinez ignaci...@gmail.com wrote:
>
> > Sorry for not being clear enough :( Your code produces `validPairs`, which has 7 variables and 360 observations. Where
[R] open connection to system
Dear list,

Probably not the best subject line, but hopefully I can explain. I would like to use R to open a connection to a command-line chess engine (for example, there is an open source one at stockfishchess.org).

In the Terminal window (using MacOS), I can type two commands:

$ ./stockfish-6-64        <-- this is the first command
Stockfish 6 64 by Tord Romstad, Marco Costalba and Joona Kiiski
go movetime 3000          <-- this is the second command
(then lots of lines calculated by the engine, with a final answer after 3 seconds)

The first command opens a connection to the chess engine; the second one tells it to search for a move. The question is, can I do this via R? I tried the system() command, which works with the first command:

system("./stockfish-6-64", intern=TRUE)
[1] "Stockfish 6 64 by Tord Romstad, Marco Costalba and Joona Kiiski"

but it closes the connection and returns an error if I attempt the second command:

system("./stockfish-6-64\ngo movetime 3000", intern=TRUE)
Error in system("./stockfish-6-64\ngo movetime 3000", intern = TRUE) :
  error in running command
sh: line 1: go: command not found

Any hint would be really appreciated, thanks in advance,
Adrian

--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr.90
050663 Bucharest sector 5
Romania
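One possible direction, sketched here under assumptions rather than offered as a tested answer for Stockfish: the failing call puts both lines into the shell command, so the shell tries to run `go` as a program. The `input` argument of `system()` instead sends lines to the program's standard input. The example uses `cat` as a stand-in for the engine (an assumption, since the engine binary may not be installed) and assumes a Unix-like system.

```r
# Stand-in for an interactive command-line program: `cat` echoes its stdin.
# (Hypothetical substitute for ./stockfish-6-64.)
engine <- "cat"

# Each element of `input` is written as one line to the program's stdin.
commands <- c("uci", "go movetime 3000")
reply <- system(engine, input = commands, intern = TRUE)

print(reply)
```

Note that stdin is closed as soon as the lines have been sent, so a program that needs to keep running after the last command may exit early. A genuinely interactive session would need a persistent two-way pipe, which this sketch does not provide.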
Re: [R] advice on making HTML tables
Answering my own question, I was able to make the tables look better in IE using some simple CSS:

td { padding: 6px; }

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bos, Roger
Sent: Tuesday, July 14, 2015 11:36 AM
To: r-help@r-project.org
Subject: [R] advice on making HTML tables

This might be a little off topic, but I am starting to produce some HTML reports that contain mostly tables and they look great in Chrome but really bad in IE, so I wanted to see if anyone knows of a better way or an easy fix. One option I have used is to convert to PDF, but sometimes it is nice to have the report in HTML format.

For my reproducible example, copy the text below into R Studio and hit the knit button. If you look at the HTML output in Chrome the columns are nicely spread out, and in IE the columns are jammed right next to each other with minimal/no spacing. Maybe there is a CSS fix?

---
title: Untitled
output: html_document
---
```{r}
knitr::kable(cars)
```

Here is my session info:

R version 3.2.1 Patched (2015-07-11 r68646)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] datasets tools utils stats graphics grDevices methods base

other attached packages:
 [1] xtable_1.7-4     sqldf_0.4-10     RSQLite_1.0.0    DBI_0.3.1           gsubfn_0.6-6
 [6] proto_0.3-10     Rcpp_0.11.6      Quandl_2.6.0     testthat_0.10.0     lubridate_1.3.3
[11] sendmailR_1.2-1  rmarkdown_0.7    devtools_1.8.0   data.table_1.9.4    Rpad_1.3.0
[16] formatR_1.2      dplyr_0.4.2.9002 plyr_1.8.3       reshape2_1.4.1      ggplot2_1.0.1
[21] xts_0.9-7        zoo_1.7-12       XLConnect_0.2-11 XLConnectJars_0.2-9 timeDate_3012.100
[26] R2HTML_2.3.1     RODBC_1.3-12     quadprog_1.5-5   prettyR_2.1-1       MASS_7.3-42
[31] fortunes_1.5-2   corpcor_1.6.8    manipulate_1.0.1

loaded via a namespace (and not attached):
 [1] rJava_0.9-6    lattice_0.20-31 tcltk_3.2.1  colorspace_1.2-6 htmltools_0.2.6
 [6] yaml_2.1.13    base64enc_0.1-2 chron_2.3-47 stringr_1.0.0    munsell_0.4.2
[11] gtable_0.1.2   memoise_0.2.1   evaluate_0.7 knitr_1.10.5     parallel_3.2.1
[16] curl_0.9.1     highr_0.5       scales_0.2.5 rversions_1.0.2  digest_0.6.8
[21] stringi_0.5-5  grid_3.2.1      magrittr_1.5 crayon_1.3.1     xml2_0.1.1
[26] assertthat_0.1 R6_2.1.0        git2r_0.10.1
Re: [R] Plot in Rcmdr
It can be changed by slightly modifying the scatterplot() command in the R Script window and re-submitting it.

From the top menu select Data | Data in packages | Read data set from an attached package. Then type Pottery in the space next to "Enter name of data set" (notice that Pottery is capitalized). From the top menu select Graphs | Scatterplot and then select Al as the x-variable and Ca as the y-variable. Click on "Plot by groups..." and select Site (and unselect "Plot lines by group"). Click OK and OK again to produce the plot. The legend is outside the plot region and the top margin has been expanded to make room for it.

In the R Script window you will see the command:

scatterplot(Ca~Al | Site, reg.line=lm, smooth=TRUE, spread=TRUE,
  id.method='mahal', id.n = 2, boxplots='xy', span=0.5, by.groups=FALSE,
  data=Pottery)

Add a single argument to the end of the command so that it looks like this:

scatterplot(Ca~Al | Site, reg.line=lm, smooth=TRUE, spread=TRUE,
  id.method='mahal', id.n = 2, boxplots='xy', span=0.5, by.groups=FALSE,
  data=Pottery, legend.coords="topright")

Then select all three lines and click Submit. The new plot puts the legend in the upper right corner of the plot region.

R Commander uses the scatterplot() function from package car to create the plot. It has several options that are not included on the options dialog window in R Commander, but can be accessed simply by editing the command that R Commander creates. To see these options, type

?scatterplot

on an empty line in the R Script window, put the cursor on the line and click Submit. This will open your web browser with the manual page for scatterplot.

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of INGRAM Joanne
Sent: Tuesday, July 14, 2015 9:53 AM
To: r-help@R-project.org
Subject: [R] Plot in Rcmdr

Hello,

I wondered if anyone could help me with a small issue in Rcmdr. I have used the 'Graphs' function in the drop-down menu to create a scatterplot for groups (gender). But when I do this the legend (telling me the symbols which represent male etc.) keeps obscuring the title of the plot. Does anyone know how to fix this problem within Rcmdr?

Please note I am not looking for help with creating the graph in another way (for example in R). I am specifically trying to figure out if this can be fixed in Rcmdr. If the answer is "No, this cannot currently be changed within Rcmdr", I would still like to hear from you.

Many thanks for any help.

Joanne Ingram
Research Associate (Medical Statistics)
Centre for Population Health Science
University of Edinburgh

--
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
[R] Spline curve
I wish to fit spline curves to longitudinal data. Which package should I use, and how should the data be structured to facilitate the analysis?

- Deva
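As a starting point only, and not a recommendation from the thread: a minimal sketch using the `splines` package that ships with R, with simulated data in "long" format (one row per subject per time point). The variable names `id`, `time` and `y` are hypothetical, and the plain `lm()` fit ignores the correlation between repeated measurements on the same subject, which a mixed-effects model would normally address.

```r
library(splines)  # distributed with base R

set.seed(1)
# Long format: one row per subject per time point (simulated, hypothetical data).
long <- expand.grid(id = factor(1:5), time = 0:10)
long$y <- sin(long$time / 3) + rnorm(nrow(long), sd = 0.2)

# Natural cubic spline in time with 3 degrees of freedom, shared by all subjects.
fit <- lm(y ~ ns(time, df = 3), data = long)

# Evaluate the fitted curve on a grid of time points.
grid <- data.frame(time = seq(0, 10, by = 0.5))
grid$fitted <- predict(fit, newdata = grid)
head(grid)
```

The long layout is the useful part of the sketch: most modelling functions in R expect one measurement per row, with subject and time as ordinary columns.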
Re: [R-es] Creating random data with restrictions
This code solves my problem. I am using `str_split` with '-' as the separator. I also have to use `trimws`. I suppose the code could be cleaned up to make it more efficient, but I have not yet figured out how.

*Thank you very much, Carlos!*

library(dplyr)
library(randomNames)
library(geosphere)
set.seed(7142015)

# Define Parameters
n.Schools <- 20
first.grade <- 3
last.grade <- 5
n.Grades <- last.grade - first.grade + 1
n.Classrooms <- 4
n.Teachers <- (n.Schools * n.Grades * n.Classrooms) / 2  # Two classrooms per teacher

# Define Random names function:
gen.names <- function(n, which.names = "both", name.order = "last.first") {
  names <- unique(randomNames(n = n, which.names = which.names, name.order = name.order))
  need <- n - length(names)
  while (need > 0) {
    names <- unique(c(randomNames(n = need, which.names = which.names, name.order = name.order), names))
    need <- n - length(names)
  }
  return(names)
}

# Generate n.Schools names
gen.schools <- function(n.schools) {
  School.ID <- paste0(gen.names(n = n.schools, which.names = "last"), ' School')
  School.long <- rnorm(n = n.schools, mean = 21.7672, sd = 0.025)
  School.lat <- rnorm(n = n.schools, mean = 58.8471, sd = 0.025)
  School.RE <- rnorm(n = n.schools, mean = 0, sd = 1)
  Schools <- data.frame(School.ID, School.lat, School.long, School.RE) %>%
    mutate(School.ID = as.character(School.ID)) %>%
    rowwise() %>%
    mutate(School.distance = distHaversine(
      p1 = c(School.long, School.lat),
      p2 = c(21.7672, 58.8471),
      r = 3961
    ))
  return(Schools)
}
Schools <- gen.schools(n.schools = n.Schools)

# Generate Grades
Grades <- c(first.grade:last.grade)
# Generate n.Classrooms
Classrooms <- LETTERS[1:n.Classrooms]
# Group schools and grades
SchGr <- outer(paste0(Schools$School.ID, '-'), paste0(Grades, '-'), FUN=paste)
#head(SchGr)
# Group SchGr and Classrooms
SchGrClss <- outer(SchGr, paste0(Classrooms, '-'), FUN=paste)
#head(SchGrClss)
# These are the combination of School-Grades-Classroom
SchGrClssTmp <- as.matrix(SchGrClss, ncol=1, nrow=length(SchGrClss))
SchGrClssEnd <- as.data.frame(SchGrClssTmp)
# Assign n.Teachers (2 classroom in a given school-grade)
Allpairs <- as.data.frame(t(combn(SchGrClssTmp, 2)))
AllpairsTmp <- paste(Allpairs$V1, Allpairs$V2, sep=" ")

library(stringr)
separoPairs <- as.data.frame(str_split(string = AllpairsTmp, pattern = "-"))
separoPairs <- as.data.frame(t(separoPairs))
row.names(separoPairs) <- NULL
separoPairs <- separoPairs %>%
  select(-V7) %>%  # Drops empty column
  mutate(V1=as.character(V1), V4=as.character(V4), V2=as.numeric(V2), V5=as.numeric(V5)) %>%
  mutate(V4 = trimws(V4, which = "both"))
separoPairs[120,]$V4

# Only the rows with V1=V4 and V2=V5 are valid
validPairs <- separoPairs %>%
  filter(V1==V4 & V2==V5) %>%
  select(V1, V2, V3, V6)

# Generate n.Teachers
gen.teachers <- function(n.teachers) {
  Teacher.ID <- gen.names(n = n.teachers, name.order = "last.first")
  Teacher.exp <- runif(n = n.teachers, min = 1, max = 30)
  Teacher.Other <- sample(c(0,1), replace = T, prob = c(0.5, 0.5), size = n.teachers)
  Teacher.RE <- rnorm(n = n.teachers, mean = 0, sd = 1)
  Teachers <- data.frame(Teacher.ID, Teacher.exp, Teacher.Other, Teacher.RE)
  return(Teachers)
}
Teachers <- gen.teachers(n.teachers = n.Teachers) %>%
  mutate(Teacher.ID = as.character(Teacher.ID))

# Randomly assign n.Teachers teachers to the ValidPairs
TmpAssignments <- validPairs[sample(1:nrow(validPairs), n.Teachers), ]
Assignments <- cbind.data.frame(Teachers$Teacher.ID, TmpAssignments)
names(Assignments) <- c("Teacher.ID", "School.ID", "Grade", "Class_1", "Class_2")

# Tidy Data
library(tidyr)
TeacherClassroom <- Assignments %>%
  gather(x, Classroom, Class_1, Class_2) %>%
  select(-x) %>%
  mutate(Teacher.ID = as.character(Teacher.ID))

# Merge
DF_Classrooms <- TeacherClassroom %>%
  full_join(Teachers, by="Teacher.ID") %>%
  full_join(Schools, by="School.ID")
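The reason the separator matters can be shown on a small scale. The sketch below uses base R's `strsplit()` rather than stringr, and two made-up labels modelled on the thread's data. Splitting on spaces cuts the school names apart, whereas splitting on a delimiter that never occurs inside a field returns the six intended pieces, which then only need trimming.

```r
# Two hypothetical pairs in the "School- Grade- Class-" layout used in the thread.
pairs <- c("Aslamy School- 3- A- Aslamy School- 3- B-",
           "Aslamy School- 3- A- Smith School- 3- A-")

# Splitting on spaces breaks "Aslamy School" in two: 8 pieces instead of 6.
by_space <- strsplit(pairs, " ", fixed = TRUE)
lengths(by_space)

# Splitting on the dedicated delimiter gives exactly 6 fields per pair.
# (Base strsplit() drops the empty piece after the final "-".)
by_dash <- strsplit(pairs, "-", fixed = TRUE)
parts <- as.data.frame(t(sapply(by_dash, trimws)), stringsAsFactors = FALSE)

# Keep pairs from the same school and grade, as in the validPairs step.
valid <- parts[parts$V1 == parts$V4 & parts$V2 == parts$V5, ]
print(valid)
```

Only the first pair survives the filter, because its two classrooms belong to the same school and grade.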
[R-es] Handling missing values when using predict with CHAID
Hello,

I am trying to apply probabilities with predict to a test data set and I cannot manage to handle the missing values. I have no problem generating the probabilities; the command is:

# generate the probabilities on the test set
prob.P.T1 <- predict(chaid2.T1, pruebaT1, type="prob")

The data frame 'pruebaT1' has 33527 records. The matrix 'prob.P.T1' has 66724 elements, that is, 33362 records. This tells me that there are some cases for which it could not establish the probability. I expected that.

The problem arises when I try to attach the probabilities from the matrix to the data frame. The command:

pruebaT1$prob.T1 <- prob.P.T1[,2]

gives me the following error:

Error in `$<-.data.frame`(`*tmp*`, prob.T1, value = c(0.42910447761194, :
  replacement has 33362 rows, data has 33527

The command

prueba.T1 <- merge(pruebaT1, prob.P.T1[,2], by="row.names", all.x = TRUE)

gives no error, but I am not sure that joining by row number is correct. I inspected the matrix returned by predict and I see no gaps in the row numbering.

To give an example, say my data frame 'pruebaT1' contains the following:

RowN Var1 Var2
   1    a    1
   2    b    2
   3    a    3
   4    b    4
   5    a    1
   6    b
   7    a    3
   8    b    4
   9    a    1
  10    b    2

When I apply the predict function to these data I get only 9 rows, which is correct. But when I look at them with the command:

view(prob.T1)

the rows are numbered 1 to 9... That makes me wonder whether merge is the best command for these cases.

To be clear, what I want to happen is the following:

RowN Var1 Var2 prob.T1
   1    a    1    0.03
   2    b    2    0.05
   3    a    3    0.06
   4    b    4    0.03
   5    a    1    0.05
   6    b           NA
   7    a    3    0.03
   8    b    4    0.03
   9    a    1    0.05
  10    b    2    0.06

And I cannot find a way to do it. Any suggestion is welcome.

Thanks in advance,
Oscar

--
Oscar Benitez

_______________________________________________
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es
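One way to obtain the layout described above, sketched with the ten-row example from the message. It rests on an assumption that would need checking against the real model: that `predict()` dropped exactly the rows with a missing predictor and returned the rest in their original order. The vector `prob` is a hand-typed stand-in for the second column of the prediction matrix.

```r
# The example data from the message; row 6 has a missing Var2.
pruebaT1 <- data.frame(Var1 = rep(c("a", "b"), 5),
                       Var2 = c(1, 2, 3, 4, 1, NA, 3, 4, 1, 2))

# Stand-in for prob.P.T1[, 2]: nine probabilities for the nine complete rows.
prob <- c(0.03, 0.05, 0.06, 0.03, 0.05, 0.03, 0.03, 0.05, 0.06)

# Rows that have every predictor present.
ok <- complete.cases(pruebaT1[, c("Var1", "Var2")])

# Guard against a silent misalignment before assigning anything.
stopifnot(sum(ok) == length(prob))

pruebaT1$prob.T1 <- NA_real_      # start with NA everywhere
pruebaT1$prob.T1[ok] <- prob      # fill only the rows that were scored
print(pruebaT1)
```

Unlike `merge(..., by = "row.names")`, this does not depend on the row labels of the prediction matrix, which were renumbered 1 to 9 in the example and would therefore put the NA on row 10 instead of row 6.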
Re: [R] Altering Forest plot in Metafor package
Use the 'alim' argument (or the 'at' argument) to restrict the axis limits, so that CI bounds below/above are indicated with an arrow. And play around with the 'xlim' argument to make better use of the space in the plotting region. And the 'ilab' argument allows you to add columns with additional information to the plot.

Please read help(forest.rma) carefully and especially try out all of the examples. They illustrate the use of these arguments.

Best,
Wolfgang

From: R-help [r-help-boun...@r-project.org] On Behalf Of Fosulli [fosu...@tcd.ie]
Sent: Tuesday, July 14, 2015 6:42 PM
To: r-help@r-project.org
Subject: [R] Altering Forest plot in Metafor package

> Dear All,
>
> I'm having trouble tweaking a forest plot made using the R meta-analysis package metafor. My main problem is that I have two studies which have very large confidence intervals, and as such my forest plot is very wide and not neat. As I would like to add more descriptive columns into the plot too, I was wondering if there was a way to cut the confidence interval in the graph and add arrows to suggest that it continues on, while keeping the OR values correct so that the reader can view the CI clearly.
>
> http://r.789695.n4.nabble.com/file/n4709857/SNIP.png
>
> I hope I am clear in what I am asking, but here is an example of what I am hoping is possible in Metafor:
>
> http://r.789695.n4.nabble.com/file/n4709857/arrows.png
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Altering-Forest-plot-in-Metafor-package-tp4709857.html
> Sent from the R help mailing list archive at Nabble.com.
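The three arguments mentioned above might be combined along the following lines. This is an illustrative sketch, not the poster's analysis: it uses the `dat.bcg` example data distributed with metafor, and the particular limits and column position are arbitrary values chosen for that data set. The code is skipped if metafor is not installed.

```r
if (requireNamespace("metafor", quietly = TRUE)) {
  library(metafor)

  # Log risk ratios and sampling variances for the BCG vaccine trials.
  dat <- escalc(measure = "RR", ai = tpos, bi = tneg, ci = cpos, di = cneg,
                data = dat.bcg)
  res <- rma(yi, vi, data = dat)

  forest(res,
         atransf   = exp,          # label the axis on the ratio scale
         alim      = c(-1, 1),     # axis limits; CIs extending beyond get arrows
         xlim      = c(-8, 5),     # horizontal extent of the whole plotting region
         ilab      = dat$year,     # one extra column of study information
         ilab.xpos = -4)           # where that column is drawn
}
```

Narrowing `alim` is what truncates the wide intervals, while widening `xlim` is what frees space on the left for the extra column.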
Re: [R-es] Keeping the variable name across several functions: example results
Hello,

Thanks for the code. I have run it and seen the results. Apart from the tests part, as I told you, I think everything else can be made more automatic. I will try a test of what I am describing using the MASS data set.

As for the question about the names: if you pass the data.frame as-is, it should keep the names. If it does not, pass the names of the columns/variables to the functions as an additional argument...

Regards,
Carlos.
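Carlos's suggestion of passing the variable names as an additional argument can be sketched as follows. The function names `describe_var` and `describe_all` and the tiny data frame are hypothetical and are not part of Griera's code; the point is only that the name travels as a string and the column is looked up with `[[`, so neither `attach()` nor `get()` is needed.

```r
# Hypothetical helper: summarise one column, receiving its name as a string.
describe_var <- function(data, varname) {
  x <- data[[varname]]                 # look the column up by name
  data.frame(Variable = varname,       # the name is still available for output
             N_valid  = sum(!is.na(x)),
             Mean     = round(mean(x, na.rm = TRUE), 2))
}

# Apply the helper to every numeric column; each call receives the name.
describe_all <- function(data) {
  num <- names(data)[vapply(data, is.numeric, logical(1))]
  do.call(rbind, lapply(num, function(v) describe_var(data, v)))
}

res <- describe_all(data.frame(a = c(1, 2, NA),
                               b = c(10, 20, 30),
                               g = c("x", "y", "z")))
print(res)
```

Because the data frame itself is an argument, the inner function works the same way whichever table it is called on, and plot titles or table labels can be built from `varname`.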
[R-es] A question about formatting
Kind regards. Although I have searched, I have not managed to work out how to produce R output in the format of this page: http://www.um.es/ae/FEIR/10/#stat.desc-del-paquete-pastecs. I would be grateful if someone could point me in the right direction. Thank you for your kindness.

Heber
Re: [R-es] Keeping the variable name across several functions: example results
Hello Carlos:

I attach an application example: the functions (I have deleted the paths of the functions and the source() commands that load them) and an example that runs them for the options I have implemented, using the birthwt data table from the MASS package:

- Descriptive statistics for all the variables of a table.
- Univariate analysis of all the variables of a table crossed with a qualitative dependent variable.

= Start of functions

##--
## DESUNI
##--
DESUNI = function(XDADES, XDROP=NULL, XVD=NULL,
                  XSPV=NULL  # Whether this is an SPV analysis
                             # May take the value TRUE
                  )
{
  options(digits = 3, OutDec=",", scipen=999)
  ## No dependent variable: descriptive analysis
  if(is.null(XVD))  # No dependent variable: descriptive analysis
  {
    cat("\n*** Descriptiva (no existeix variable dependent)\n")
    DES(XDADES=XDADES, XDROP=XDROP, XCAMIF=XCAMIF)
  }
  ## Dependent variable present: univariate analysis
  else  # Dependent variable present: univariate analysis
  {
    UNI(XDADES=XDADES, XDROP=XDROP, XVD=XVD, XSPV=XSPV, XCAMIF=XCAMIF)
  }
}

##--
## DES: Descriptive statistics for all the variables
##--
DES = function(XDADES, XDROP=NULL, XCAMIF)
{
  ifelse(is.null(XDROP),
         DADES_S <- XDADES,
         DADES_S <- XDADES[, setdiff(names(XDADES), XDROP) ])
  # setdiff selects the variables of XDADES that are different from XDROP
  attach(DADES_S, warn.conflicts = F)
  XVARLLI=names(DADES_S)
  for (XVARNOM in names(DADES_S))
  {
    if(is.numeric(get(XVARNOM)))
    {
      DES_QUANTI (XVARNOM)
    }
    else if(is.factor(get(XVARNOM)))
    {
      DES_QUALI (XVARNOM)
    }
    else
    {
      cat("La variable ", XVARNOM, "no és de cap dels tipus coneguts", "\n")
    }
  } # End of the function
  detach(DADES_S)
}

##--
## DES_QUANTI: Descriptive statistics, factor variables
##--
DES_QUANTI <- function(X)
{
  OP <- par(no.readonly = TRUE); # save old parameters
  par(mfrow=c(1,3))
  hist(get(X), main=c("Histograma de", X), xlab=X); rug(get(X))
  boxplot(get(X), main=c("Diagrama de caixa de", X), ylab=X); rug(get(X), side=2)
  qqnorm(get(X), main=c("Diagrama Q-Q de", X)); qqline(get(X))
  cat("\n")
  par(OP)
  ESTA_1 <- data.frame(Variable = X,
                       N_total = length(get(X)),
                       N_valids = sum(!is.na(get(X))),
                       N_desconeguts = sum(is.na(get(X)))
                       )
  ESTA_2 <- data.frame(Variable = X,
                       N = sum(!is.na(get(X))),
                       Mitjana = if (mean(get(X) > 10)) {round(mean(get(X), na.rm = TRUE), 2)}
                                 else {round(mean(get(X), na.rm = TRUE), 3)},
                       Err_tipic = if (sd (get(X) > 10)) {round(sd (get(X), na.rm = TRUE), 2)}
                                   else {round(sd (get(X), na.rm = TRUE), 3)},
                       Min = min(get(X), na.rm = TRUE),
                       Perc_25 = quantile(get(X),.25),
                       Mediana = median(get(X), na.rm = TRUE),
                       Perc_75 = quantile(get(X),.75),
                       Max = max(get(X), na.rm = TRUE),
                       Interval = max(get(X), na.rm = TRUE) - min(get(X), na.rm = TRUE)
                       )
  cat("", "\n")
  cat("Valors valids i desconeguts", "\n")
  print(ESTA_1, row.names = FALSE)
  cat("", "\n")
  cat("Estadistics", "\n")
  print(ESTA_2, row.names = FALSE)
  cat("", "\n")
  return(summary(get(X)))
}

##--
## DES_QUALI: Descriptive statistics, factor variables
##--
DES_QUALI <- function(X)
{
  cat("Var factor: ", X, "\n")
  XOUT <- as.data.frame(table(get(X)))
  names(XOUT)[1] = X
  XOUT <- transform(XOUT, cumFreq = cumsum(Freq), Percentatge = prop.table(Freq))
  print(XOUT)
  print("-")
}

##--
## UNI: Univariate analysis
##--
UNI = function(XDADES, XDROP=NULL, XVD,
               XSPV=NULL,  # Whether this is an SPV analysis
               XCAMIF
               )
{
  ifelse(is.null(XDROP),
         DADES_S <- XDADES,
         DADES_S <- XDADES[, setdiff(names(XDADES), XDROP) ])
  attach(DADES_S, warn.conflicts = F)
  cat("\n Descriptiva de totes les variables seleccionades\n")
  print(summary(DADES_S))
  for (XVARNOMT in
Re: [R] sum some columns for each row
Well, it is pretty obvious that all of your columns have non-numeric data in them, but you are the only one who can tell which ones should have been numeric, and you are also the one who can peruse your data file in a text editor.

--
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
Sent from my phone. Please excuse my brevity.

On July 14, 2015 4:05:37 PM PDT, Dawn dawn1...@gmail.com wrote:

> I used two rows to test the data frame, as follows.
>
> dat <- read.table("TOV_43_Protein_Clusters_abundance1.tab", header=TRUE, sep = "\t")
> dat1 <- dat[1:2,]
> str(dat1)
> 'data.frame': 2 obs. of 44 variables:
>  $ X      : Factor w/ 1075762 levels "","POV_Cluster_101",..: 305266 625028
>  $ X109DCM: Factor w/ 46 levels "","1","10","109DCM",..: 1 1
>  $ X109SUR: Factor w/ 41 levels "","1","10","109SUR",..: 1 1
>  $ X18DCM : Factor w/ 31 levels "","1","10","11",..: 1 1
>  $ X18SUR : Factor w/ 25 levels "","1","10","11",..: 1 1
>  $ X22SUR : Factor w/ 50 levels "","1","10","11",..: 1 2
>  $ X23DCM : Factor w/ 46 levels "","1","10","11",..: 1 1
>  $ X25DCM : Factor w/ 42 levels "","1","10","11",..: 1 1
>  $ X25SUR : Factor w/ 47 levels "","1","10","11",..: 1 1
>  $ X30DCM : Factor w/ 34 levels "","1","10","11",..: 1 1
>  $ X31SUR : Factor w/ 43 levels "","1","10","11",..: 1 1
>  $ X32DCM : Factor w/ 15 levels "","1","10","11",..: 1 1
>  $ X32SUR : Factor w/ 58 levels "","1","10","11",..: 1 1
>  $ X34DCM : Factor w/ 53 levels "","1","10","11",..: 1 35
>  $ X34SUR : Factor w/ 47 levels "","1","10","11",..: 10 14
>  $ X36DCM : Factor w/ 48 levels "","1","10","11",..: 2 43
>  $ X36SUR : Factor w/ 45 levels "","1","10","11",..: 23 38
>  $ X38DCM : Factor w/ 40 levels "","1","10","11",..: 3 23
>  $ X38SUR : Factor w/ 44 levels "","1","10","11",..: 7 41
>  $ X39DCM : Factor w/ 38 levels "","1","10","11",..: 34 38
>  $ X39SUR : Factor w/ 40 levels "","1","10","11",..: 13 40
>  $ X41DCM : Factor w/ 47 levels "","1","10","11",..: 13 40
>  $ X41SUR : Factor w/ 40 levels "","1","10","11",..: 1 1
>  $ X42DCM : Factor w/ 48 levels "","1","10","11",..: 2 3
>  $ X42SUR : Factor w/ 41 levels "","1","10","11",..: 2 1
>  $ X46SUR : Factor w/ 31 levels "","1","10","11",..: 2 2
>  $ X52DCM : Factor w/ 49 levels "","1","10","11",..: 13 23
>  $ X64DCM : Factor w/ 35 levels "","1","10","11",..: 1 2
>  $ X64SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
>  $ X65DCM : Factor w/ 38 levels "","1","10","11",..: 1 1
>  $ X65SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
>  $ X66DCM : Factor w/ 27 levels "","1","10","11",..: 1 1
>  $ X66SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
>  $ X67SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
>  $ X68DCM : Factor w/ 33 levels "","1","10","11",..: 1 1
>  $ X68SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
>  $ X70MES : Factor w/ 23 levels "","1","10","11",..: 1 1
>  $ X70SUR : Factor w/ 37 levels "","1","10","11",..: 1 1
>  $ X72DCM : Factor w/ 40 levels "","1","10","11",..: 13 27
>  $ X72SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
>  $ X76DCM : Factor w/ 44 levels "","1","10","11",..: 1 1
>  $ X76SUR : Factor w/ 34 levels "","1","10","11",..: 1 1
>  $ X82DCM : Factor w/ 29 levels "","1","10","11",..: 1 1
>  $ X85DCM : Factor w/ 30 levels "","1","10","11",..: 1 1
>
> Thank you!!
> Dawn
>
> On Tue, Jul 14, 2015 at 3:48 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:
>
> > I suspect your data frame dat has non-numeric data in some of the columns that have ABC in their names. Any column of a data frame can be numeric or not, but the data frame as a unit cannot be numeric.
> >
> > If your data file has odd characters in some of the otherwise-numeric columns, the whole column will be read in as a factor or as character strings. Look at the output of str(dat) for columns that don't show 'num'. If you can find the column, and then one of the bad rows, you can use a text editor to fix them manually, or show us examples of the bad data and we can suggest ways to fix it in R.
> >
> > --
> > Jeff Newmiller
> > DCN: jdnew...@dcn.davis.ca.us
> > Research Engineer (Solar/Batteries/Software/Embedded Controllers)
> > Sent from my phone. Please excuse my brevity.
> >
> > On July 14, 2015 2:35:38 PM PDT, Dawn dawn1...@gmail.com wrote:
> >
> > > Hi,
> > >
> > > I used a small set of data (several columns and rows) and it works fine using the following command:
> > >
> > > abc <- rowSums(test[,grep("ABC",names(test),fixed=T)],na.rm=T)
> > >
> > > But when I used the real big data table,
> > >
> > > Error in rowSums(dat[, grep("ABC", names(dat), fixed = T)], na.rm = T) :
> > >   'x' must be numeric
> > >
> > > Then it didn't work either using as.numeric():
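The kind of repair Jeff describes might look like the sketch below. The data frame is a made-up miniature of the problem, with counts stored as factors and empty strings for missing cells, and is not the poster's file. The step that commonly trips people up is that `as.numeric()` applied directly to a factor returns the internal level codes, so the conversion goes through `as.character()` first.

```r
# Hypothetical miniature of the problem: counts that were read in as factors.
dat <- data.frame(id     = c("c1", "c2", "c3"),
                  X34DCM = factor(c("", "12", "3")),
                  X36DCM = factor(c("1", "", "40")),
                  X34SUR = factor(c("5", "7", "")))

dcm_cols <- grep("DCM", names(dat), fixed = TRUE)

# Convert via character: as.numeric() on a factor would give level codes.
# Empty strings become NA, which rowSums(na.rm = TRUE) then skips.
dat[dcm_cols] <- lapply(dat[dcm_cols],
                        function(f) as.numeric(as.character(f)))

dcm <- rowSums(dat[, dcm_cols], na.rm = TRUE)
print(dcm)
```

It is usually cleaner to avoid the problem at import time, for instance with the `na.strings`, `colClasses` or `stringsAsFactors` arguments of `read.table()`, once the stray values in the file have been identified.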
[R] Parsing large amounts of csv data with limited RAM
I'm relatively new to using R, and I am trying to find a decent solution for my current dilemma. Right now, I am trying to parse one-second data from 7 months of CSV files. This is over 10GB of data, and I've run into memory issues loading them all into a single dataset to be plotted. If possible, I'd really like to keep both the one-second resolution and all 100 or so columns intact to make things easier on myself.

The problem I have is that the machine that is running this script only has 8GB of RAM. I've had issues parsing files with lapply and some sort of csv reader. So far I've tried read.csv, readr.read_table, and data.table.fread, with only fread having any sort of memory management (fread seems to crash on me, however). The basic approach I am using is as follows:

# Get the data
files = list.files(pattern="*.csv")
set <- lapply(files, function(x) fread(x, header = T, sep = ','))  # replace fread with something that can parse csv data

# Handle the data (Do my plotting down here)
...

These processes work with smaller data sets, but in a worst-case scenario I would like to be able to parse through 1 year of data, which would be around 20GB.

Thank you for your time,
Robert Dupuis
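One general pattern for data that does not fit in memory is to read a file in fixed-size chunks and keep only a reduced result from each chunk. The sketch below does this with base R on a small temporary file; the file, the column names and the chunk size are all hypothetical. It is a sketch of the idea and not a drop-in replacement for the script above, since it trades the full one-second detail for a running summary.

```r
# Build a small CSV to stand in for one of the large files (hypothetical data).
path <- tempfile(fileext = ".csv")
write.csv(data.frame(t = 1:1000, value = rep(c(1, 3), 500)),
          path, row.names = FALSE)

chunk_size <- 250                      # rows held in memory at any one time
con <- file(path, open = "r")

# Read the header once, then strip the quotes write.csv puts around names.
header <- strsplit(readLines(con, n = 1), ",", fixed = TRUE)[[1]]
header <- gsub('"', "", header, fixed = TRUE)

total <- 0
n <- 0
repeat {
  lines <- readLines(con, n = chunk_size)
  if (length(lines) == 0) break        # end of file
  chunk <- read.csv(text = lines, header = FALSE, col.names = header)
  # Reduce the chunk to what is needed, then let it be garbage collected.
  total <- total + sum(chunk$value)
  n <- n + nrow(chunk)
}
close(con)

running_mean <- total / n
print(running_mean)
```

For plotting, the reduction step could instead keep every k-th row or a per-minute aggregate, so that the object passed to the plotting code stays far smaller than the raw files.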
Re: [R] Troubleshooting DCchoice for dichotomous choice CV data
On Jul 14, 2015, at 6:02 PM, Rogério Araújo wrote:

> I'm trying to run dichotomous choice CV data using the DCchoice package, but I didn't succeed. I keep receiving these messages for the 'sbchoice' and 'dbchoice' functions:
>
> Warning in install.packages : package ‘DCchoice’ is not available (for R version 3.2.1)

The webpage for this package on R-Forge says it fails to build. You really should be corresponding with its author.

> Error: could not find function "sbchoice"
>
> Could anyone give me a hint on how to solve this problem?
>
> -- Prof. Rogério César Pereira de Araújo, Ph.D.

David Winsemius
Alameda, CA, USA
Re: [R] sum some columns for each row
On Jul 14, 2015, at 4:49 PM, Dawn wrote:

> I attached the file

Well, you may have attached it, but you evidently did not read the posting guide about which file types are accepted by the mail server.

> including the first two rows and please help to make it the numeric data frame. Hopefully the following command works:
>
>     dcm <- rowSums(dat1[, grep("DCM", names(dat1), fixed = TRUE)], na.rm = TRUE)

How do you expect that to deliver anything meaningful if all of your columns are of class factor? That was the reason for this error in an earlier posting of yours:

> But when I used the real big data table,
> Error in rowSums(dat[, grep("ABC", names(dat), fixed = T)], na.rm = T) : 'x' must be numeric

You are not paying attention to the responses you have received so far. I think Bert Gunter's suggestion that you need to work through more introductory tutorials is on point.

-- David.

> Thank you very much!
> Dawn
>
> On Tue, Jul 14, 2015 at 4:36 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:
>> Well it is pretty obvious that all of your columns have non-numeric data in them, but you are the only one who can tell which ones should have been numeric, and you are also the one who can peruse your data file in a text editor.
Re: [R] sum some columns for each row
Hi,

I used a small set of data (several columns and rows) and it works fine using the following command:

    abc <- rowSums(test[, grep("ABC", names(test), fixed = TRUE)], na.rm = TRUE)

But when I used the real, big data table, I got:

    Error in rowSums(dat[, grep("ABC", names(dat), fixed = T)], na.rm = T) : 'x' must be numeric

Using as.numeric() didn't work either:

    as.numeric(dat)
    Error: (list) object cannot be coerced to type 'double'

Thanks!
Dawn

On Fri, Jul 10, 2015 at 4:35 PM, Dawn dawn1...@gmail.com wrote:
> Thank you all, and sorry for the messy data. It has worked!
> Best, Dawn
>
> On Fri, Jul 10, 2015 at 4:15 AM, Jim Lemon drjimle...@gmail.com wrote:
>> Hi Dawn,
>> Your data are a bit messed up, but try the following:
>>
>>     colSums(dat[, grep("ABC", names(dat), fixed = TRUE)], na.rm = TRUE)
>>     colSums(dat[, grep("XYZ", names(dat), fixed = TRUE)], na.rm = TRUE)
>>
>> I'm assuming that you want to discard the NA values.
>> Jim
>>
>> On Fri, Jul 10, 2015 at 6:52 AM, Rui Barradas ruipbarra...@sapo.pt wrote:
>>> Hello,
>>> Please use ?dput to give a data example; like this it's completely unreadable. If your data.frame is named 'dat', use
>>>
>>>     dput(head(dat, 30))  # paste the output of this in your mail
>>>
>>> And don't post in HTML; use plain text only, as the posting guide says.
>>> Rui Barradas
>>>
>>> On 09-07-2015 18:12, Dawn wrote:
>>>> Hi, I have a big data frame as follows:
>>>>
>>>> 109ABC 109XYZ 18ABC 18XYZ 22XYZ 23ABC 25ABC 25XYZ 30ABC 31XYZ 32ABC 32XYZ 34DCM 34XYZ 36ABC 36SUR 38DCM 38XYZ 39DCM 39SUR 41DCM 41SUR 42DCM 42SUR 46SUR 52DCM 64ABC 64XYZ 65ABC 65XYZ 66ABC 66XYZ 67XYZ 68ABC 68SUR 70MES 70SUR 72ABC 72XYZ 76ABC 76XYZ 82ABC 85ABC POV
>>>> Cluster_117 1 310145221112 2TT:61
>>>> Cluster_21420 653699610131 4TT:88
>>>> Cluster_3336417 1718131719221152185184 79 TT:227
>>>>
>>>> I want to get two columns: one that sums, for each row, all columns whose names include ABC, and another that sums, for each row, all columns whose names include XYZ. Is there some help? Thank you!
>>>> Dawn
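A note on the as.numeric(dat) error quoted above: a data frame is a list of columns, so it cannot be coerced in a single call, and conversion has to happen column by column. The following is only a sketch on a made-up data frame (toy and its column names are invented here, not taken from Dawn's file). It also shows a common pitfall: calling as.numeric() directly on a factor returns the internal level codes rather than the printed values.

```r
# Toy data (hypothetical, not the poster's file): numbers stored as factors.
toy <- data.frame(
  X1ABC = factor(c("10", "3", "7")),
  X2ABC = factor(c("5", "", "2")),
  X1XYZ = factor(c("1", "1", "4"))
)

# as.numeric() on a factor gives the level codes, not the values shown.
codes  <- as.numeric(toy$X1ABC)
# Going through character first recovers the actual numbers.
values <- as.numeric(as.character(toy$X1ABC))

# Convert every column; the empty string becomes NA.
toy_num <- as.data.frame(
  lapply(toy, function(col) as.numeric(as.character(col)))
)

# Row-wise sum over the columns whose names contain "ABC".
abc <- rowSums(
  toy_num[, grep("ABC", names(toy_num), fixed = TRUE), drop = FALSE],
  na.rm = TRUE
)
```

Whether the empty strings should count as missing, as they do here, is a decision only the owner of the data can make.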
[R] Troubleshooting DCchoice for dichotomous choice CV data
I'm trying to run dichotomous choice CV data using the DCchoice package, but I didn't succeed. I keep receiving these messages for the 'sbchoice' and 'dbchoice' functions:

    Warning in install.packages : package ‘DCchoice’ is not available (for R version 3.2.1)
    Error: could not find function "sbchoice"

Could anyone give me a hint on how to solve this problem?

--
Prof. Rogério César Pereira de Araújo, Ph.D.
Departamento de Economia Agrícola
Universidade Federal do Ceará
Campus do Pici, CP 6017
Fortaleza-CE, Brasil, CEP 60455-970
E-mail: r...@ufc.br
Tel.: +55 85 3366-9716
Re: [R] Parsing large amounts of csv data with limited RAM
Take a look at the sqldf package: it can load a CSV file into a database, from which you can then process the data in pieces.

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.
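A minimal sketch of the sqldf suggestion, assuming the sqldf package and its SQLite backend are installed. The example file, its columns (sensor, value), and the query are invented for illustration and are not from the thread. read.csv.sql() loads the file into a temporary database and returns only the result of the query to R, so the aggregation happens outside R's memory.

```r
# Hypothetical example file standing in for one of the large CSV files.
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(sensor = rep(c("a", "b"), each = 3),
             value  = c(1, 2, 3, 10, 20, 30)),
  csv_path, row.names = FALSE, quote = FALSE
)

if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  # Inside the SQL statement the CSV is referred to as the table "file".
  means <- read.csv.sql(
    csv_path,
    sql = "select sensor, avg(value) as mean_value
           from file group by sensor order by sensor"
  )
  print(means)
} else {
  message("sqldf is not installed; this sketch was skipped")
}
```

The same idea extends to pulling out one time window or a subset of columns at a time with a where clause, rather than aggregating.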
Re: [R] sum some columns for each row
I suspect your data frame dat has non-numeric data in some of the columns that have ABC in their names. Any column of a data frame can be numeric or not, but the data frame as a unit cannot be numeric. If your data file has odd characters in one of the otherwise-numeric columns, the whole column will be read in as a factor or as character strings. Look at the output of str(dat) for columns that don't show 'num'. If you can find the column, and then one of the bad rows, you can use a text editor to fix them manually, or show us examples of the bad data and we can suggest ways to fix it in R.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.
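The check Jeff describes can also be done programmatically instead of by reading the str() output by eye. This is a sketch on a made-up stand-in for dat (the columns and the stray value are invented here); it lists the columns that are not numeric and shows the distinct values of one of them so that the offending entry stands out.

```r
# Hypothetical stand-in for 'dat': one column contains a stray text value.
dat <- data.frame(
  id    = c("c1", "c2", "c3"),
  X1ABC = c("4", "TT:61", "2"),
  X2ABC = c(1, 5, 3),
  stringsAsFactors = TRUE
)

# Columns reported as Factor or chr, rather than num or int, need attention.
str(dat)

# Names of the columns that are not numeric.
non_numeric <- names(dat)[!sapply(dat, is.numeric)]
print(non_numeric)

# The distinct values of a suspect column usually reveal the bad entry.
print(levels(dat$X1ABC))
```

An identifier column such as id is expected to show up in this list; only the columns that were meant to hold numbers need fixing.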
Re: [R] sum some columns for each row
It seems that Dawn could really benefit from spending some time with an online R tutorial or two, as she appears not to have much of a clue about R's basic data structures.

Cheers,
Bert

Bert Gunter

"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom."
   -- Clifford Stoll

On Tue, Jul 14, 2015 at 4:36 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:
> Well it is pretty obvious that all of your columns have non-numeric data in them, but you are the only one who can tell which ones should have been numeric, and you are also the one who can peruse your data file in a text editor.
Re: [R] sum some columns for each row
I used two rows to test the data frame, as follows.

    dat <- read.table("TOV_43_Protein_Clusters_abundance1.tab", header = TRUE, sep = "\t")
    dat1 <- dat[1:2, ]
    str(dat1)

    'data.frame': 2 obs. of 44 variables:
     $ X      : Factor w/ 1075762 levels "","POV_Cluster_101",..: 305266 625028
     $ X109DCM: Factor w/ 46 levels "","1","10","109DCM",..: 1 1
     $ X109SUR: Factor w/ 41 levels "","1","10","109SUR",..: 1 1
     $ X18DCM : Factor w/ 31 levels "","1","10","11",..: 1 1
     $ X18SUR : Factor w/ 25 levels "","1","10","11",..: 1 1
     $ X22SUR : Factor w/ 50 levels "","1","10","11",..: 1 2
     $ X23DCM : Factor w/ 46 levels "","1","10","11",..: 1 1
     $ X25DCM : Factor w/ 42 levels "","1","10","11",..: 1 1
     $ X25SUR : Factor w/ 47 levels "","1","10","11",..: 1 1
     $ X30DCM : Factor w/ 34 levels "","1","10","11",..: 1 1
     $ X31SUR : Factor w/ 43 levels "","1","10","11",..: 1 1
     $ X32DCM : Factor w/ 15 levels "","1","10","11",..: 1 1
     $ X32SUR : Factor w/ 58 levels "","1","10","11",..: 1 1
     $ X34DCM : Factor w/ 53 levels "","1","10","11",..: 1 35
     $ X34SUR : Factor w/ 47 levels "","1","10","11",..: 10 14
     $ X36DCM : Factor w/ 48 levels "","1","10","11",..: 2 43
     $ X36SUR : Factor w/ 45 levels "","1","10","11",..: 23 38
     $ X38DCM : Factor w/ 40 levels "","1","10","11",..: 3 23
     $ X38SUR : Factor w/ 44 levels "","1","10","11",..: 7 41
     $ X39DCM : Factor w/ 38 levels "","1","10","11",..: 34 38
     $ X39SUR : Factor w/ 40 levels "","1","10","11",..: 13 40
     $ X41DCM : Factor w/ 47 levels "","1","10","11",..: 13 40
     $ X41SUR : Factor w/ 40 levels "","1","10","11",..: 1 1
     $ X42DCM : Factor w/ 48 levels "","1","10","11",..: 2 3
     $ X42SUR : Factor w/ 41 levels "","1","10","11",..: 2 1
     $ X46SUR : Factor w/ 31 levels "","1","10","11",..: 2 2
     $ X52DCM : Factor w/ 49 levels "","1","10","11",..: 13 23
     $ X64DCM : Factor w/ 35 levels "","1","10","11",..: 1 2
     $ X64SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
     $ X65DCM : Factor w/ 38 levels "","1","10","11",..: 1 1
     $ X65SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
     $ X66DCM : Factor w/ 27 levels "","1","10","11",..: 1 1
     $ X66SUR : Factor w/ 35 levels "","1","10","11",..: 1 1
     $ X67SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
     $ X68DCM : Factor w/ 33 levels "","1","10","11",..: 1 1
     $ X68SUR : Factor w/ 36 levels "","1","10","11",..: 1 1
     $ X70MES : Factor w/ 23 levels "","1","10","11",..: 1 1
     $ X70SUR : Factor w/ 37 levels "","1","10","11",..: 1 1
     $ X72DCM : Factor w/ 40 levels "","1","10","11",..: 13 27
     $ X72SUR : Factor w/ 38 levels "","1","10","11",..: 1 1
     $ X76DCM : Factor w/ 44 levels "","1","10","11",..: 1 1
     $ X76SUR : Factor w/ 34 levels "","1","10","11",..: 1 1
     $ X82DCM : Factor w/ 29 levels "","1","10","11",..: 1 1
     $ X85DCM : Factor w/ 30 levels "","1","10","11",..: 1 1

Thank you!!
Dawn
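One reading of the str() output in this message, offered as a guess rather than a diagnosis: the levels of X109DCM include both the empty string and "109DCM" itself, which would be consistent with blank fields and a header line repeated somewhere in the body of the file. The sketch below builds a tiny invented file with exactly those two problems, reads every column as character, and then locates the entries that fail numeric conversion. The file contents and the helper is_bad are hypothetical.

```r
# Hypothetical miniature of the file: a blank field and a repeated header line.
tab_path <- tempfile(fileext = ".tab")
writeLines(c("id\t109DCM\t109SUR",
             "c1\t3\t1",
             "c2\t\t4",
             "id\t109DCM\t109SUR",
             "c3\t10\t2"), tab_path)

# Read everything as character so nothing is silently turned into a factor.
raw <- read.table(tab_path, header = TRUE, sep = "\t",
                  colClasses = "character", stringsAsFactors = FALSE)

# TRUE for entries that are non-empty yet cannot be converted to a number.
is_bad <- function(x) {
  !is.na(x) & x != "" & is.na(suppressWarnings(as.numeric(x)))
}

bad <- is_bad(raw$X109DCM)
print(raw[bad, ])          # the offending rows, for inspection

# Drop the offending rows, then convert the measurement columns.
clean <- raw[!bad, ]
clean[-1] <- lapply(clean[-1], as.numeric)

dcm <- rowSums(
  clean[, grep("DCM", names(clean), fixed = TRUE), drop = FALSE],
  na.rm = TRUE
)
```

If the real file does turn out to contain repeated header lines, removing them at the source is probably cleaner than filtering them out after reading.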
[R-es] Uploading to CRAN a package that includes examples with parallelization
Hello,

I am improving a package that included several bootstrap procedures run in parallel (with foreach, on Windows). The problem is that R CMD check (I use RStudio) gives me an error, because CRAN does not allow parallelization on more than two cores ("If running a package uses multiple threads/cores it must never use more than two simultaneously: the check farm is a shared resource and will typically be running many checks simultaneously.").

I have found one suggestion for avoiding the problem: add an optional argument to my functions that turns off parallelization in the examples I upload to CRAN, that is, a logical argument such as 'parallel = TRUE' in the functions, which I would set to FALSE in the examples included when submitting the package to CRAN. However, this strikes me as a bit of a hack, and it almost duplicates the code unnecessarily.

Another option would be to use \donttest{} for the part of the examples that uses parallelization, but from previous experience I know that CRAN allows this option only very occasionally.

Any ideas for avoiding the problem? Thanks in advance.

Dr. Antonio José Sáez-Castillo
Department of Statistics and Operational Research
Escuela Politécnica Superior de Linares
UNIVERSIDAD DE JAÉN
C/ Alfonso X el Sabio, 28
23700 Linares (Jaén)
Phone: +34 953 64 85 78
e-mail: ajs...@ujaen.es
https://www.researchgate.net/profile/Antonio_Saez-Castillo
http://twitter.com/ajsaezUJA

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es
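One possible alternative to a user-facing 'parallel' argument, sketched here under assumptions: the package decides the number of workers itself and caps it at two whenever R CMD check signals the limit through the _R_CHECK_LIMIT_CORES_ environment variable. The helper name choose_workers is hypothetical, and nothing in this thread confirms that CRAN would accept a package written this way.

```r
# Hypothetical helper: decide how many workers a bootstrap routine may use.
choose_workers <- function(requested = parallel::detectCores()) {
  # detectCores() can return NA on some platforms; fall back to one worker.
  if (is.na(requested) || requested < 1L) requested <- 1L

  limit <- Sys.getenv("_R_CHECK_LIMIT_CORES_", unset = "")
  # A non-empty value other than "false"/"FALSE" means the check machinery
  # is limiting the number of cores, so stay at two or fewer.
  if (nzchar(limit) && !(limit %in% c("false", "FALSE"))) {
    requested <- min(requested, 2L)
  }
  as.integer(requested)
}

n_workers <- choose_workers()
print(n_workers)
```

The result can be handed to whatever code sets up the parallel backend for foreach, so the examples themselves need no special argument.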
Re: [R] sum some columns for each row
I attached the file, which includes the first two rows; please help me make it a numeric data frame. Hopefully the following command will then work:

    dcm <- rowSums(dat1[, grep("DCM", names(dat1), fixed = TRUE)], na.rm = TRUE)

Thank you very much!
Dawn

On Tue, Jul 14, 2015 at 4:36 PM, Jeff Newmiller jdnew...@dcn.davis.ca.us wrote:
> Well it is pretty obvious that all of your columns have non-numeric data in them, but you are the only one who can tell which ones should have been numeric, and you are also the one who can peruse your data file in a text editor.
Re: [R] Parsing large amounts of csv data with limited RAM
You seem to want to have your cake and eat it too. Not unexpected, but you may have your work cut out to learn about the price of having it all.

Plotting: it is pretty silly to stick with gigabytes of data in your plots. Some kind of aggregation seems required here, with the raw data being a stepping stone to that goal.

Loading: if you don't have RAM, buy more or use one of the disk-based solutions. There are proprietary solutions for a fee, and there are packages like ff. When I have dealt with large data sets I have used sqldf or RODBC (which I think works best for read-only access), so I cannot advise you on ff.

---
Jeff Newmiller
DCN: jdnew...@dcn.davis.ca.us
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On July 14, 2015 3:21:42 PM PDT, "Dupuis, Robert" dup...@beaconpower.com wrote:

I'm relatively new to using R, and I am trying to find a decent solution for my current dilemma. Right now, I am trying to parse one-second data from 7 months of CSV files. This is over 10GB of data, and I've run into some memory issues loading them all into a single dataset to be plotted. If possible, I'd really like to keep both the one-second resolution and all 100 or so columns intact to make things easier on myself. The problem I have is that the machine that is running this script only has 8GB of RAM. I've had issues parsing files with lapply and some sort of CSV reader. So far I've tried read.csv, readr.read_table, and data.table.fread, with only fread having any sort of memory management (fread seems to crash on me, however). The basic approach I am using is as follows:

    # Get the data
    files = list.files(pattern="*.csv")
    set <- lapply(files, function(x) fread(x, header = T, sep = ',')) # replace fread with something that can parse csv data
    # Handle the data (Do my plotting down here)
    ...

These processes work with smaller data sets, but I would like, in a worst-case scenario, to be able to parse through 1 year of data, which would be around 20GB.

Thank you for your time,
Robert Dupuis
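The aggregation Jeff recommends can be done one file at a time, so that only a single file's raw rows are in memory at any moment. What follows is only a minimal sketch in base R, not the poster's actual workflow: the file names, the `time` and `power` columns, and the one-minute bin width are all made up for illustration, and the two small files are generated on the spot so the example runs on its own.

```r
# Build two small example files so the sketch is self-contained.
# File names and columns (time, power) are hypothetical.
dir <- tempfile("csvdemo")
dir.create(dir)
t0 <- as.POSIXct("2015-07-14 00:00:00", tz = "UTC")
for (i in 1:2) {
  d <- data.frame(
    time  = format(t0 + (i - 1) * 120 + 0:119, "%Y-%m-%d %H:%M:%S"),
    power = rep(c(i, i + 2), each = 60)
  )
  write.csv(d, file.path(dir, sprintf("day%d.csv", i)), row.names = FALSE)
}

# Reduce one file to per-minute means. The raw one-second rows exist only
# inside this function and can be garbage-collected once it returns.
minute_means <- function(path) {
  raw <- read.csv(path, stringsAsFactors = FALSE)
  raw$minute <- substr(raw$time, 1, 16)   # keep "YYYY-MM-DD HH:MM"
  aggregate(power ~ minute, data = raw, FUN = mean)
}

# Process the files one by one and stack only the small summaries.
files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
agg <- do.call(rbind, lapply(files, minute_means))
print(agg)
```

The combined result has one row per minute rather than one per second, which is usually far more than a plot can resolve anyway. If one-second detail is needed for a particular window, that window's file can be read separately. read.csv could be swapped for a faster reader without changing the pattern.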
[R] Altering Forest plot in Metafor package
Dear All,

I'm having trouble tweaking a forest plot made using the R meta-analysis package metafor. My main problem is that I have two studies with very large confidence intervals, and as a result my forest plot is very wide and not neat. As I would also like to add more descriptive columns to the plot, I was wondering if there is a way to cut off the confidence intervals in the graph and add arrows to show that they continue, while keeping the printed OR values correct so that the reader can still read the CI clearly.

http://r.789695.n4.nabble.com/file/n4709857/SNIP.png

I hope I am clear in what I am asking, but here is an example of what I am hoping is possible in metafor:

http://r.789695.n4.nabble.com/file/n4709857/arrows.png

-- View this message in context: http://r.789695.n4.nabble.com/Altering-Forest-plot-in-Metafor-package-tp4709857.html
Re: [R] Plot in Rcmdr
Dear David and Joanne,

David, thank you for answering Joanne's question before I saw it. The help page for car::scatterplot() is also accessible via the Help button in the Rcmdr scatterplot dialog.

I'll think about whether to add a control for legend position to the scatterplot dialog. There are already some enhancements to the dialog in the forthcoming version 2.2-0 of the Rcmdr package, due late this summer, but I try not to make the dialogs too complicated.

Best,
John

---
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
http://socserv.socsci.mcmaster.ca/jfox/

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of David L Carlson
Sent: July-14-15 1:17 PM
To: INGRAM Joanne; r-help@R-project.org
Subject: Re: [R] Plot in Rcmdr

It can be changed by slightly modifying the scatterplot() command in the R Script window and re-submitting it.

From the top menu select Data | Data in packages | Read data set from an attached package. Then type Pottery in the space next to "Enter name of data set" (notice that Pottery is capitalized). From the top menu select Graphs | Scatterplot and then select Al as the x-variable and Ca as the y-variable. Click on "Plot by groups..." and select Site (and unselect "Plot lines by group"). Click OK and OK again to produce the plot. The legend is outside the plot region and the top margin has been expanded to make room for it.

In the R Script window you will see the command:

    scatterplot(Ca~Al | Site, reg.line=lm, smooth=TRUE, spread=TRUE,
      id.method='mahal', id.n = 2, boxplots='xy', span=0.5,
      by.groups=FALSE, data=Pottery)

Add a single argument to the end of the command so that it looks like this:

    scatterplot(Ca~Al | Site, reg.line=lm, smooth=TRUE, spread=TRUE,
      id.method='mahal', id.n = 2, boxplots='xy', span=0.5,
      by.groups=FALSE, data=Pottery, legend.coords="topright")

Then select all three lines and click Submit. The new plot puts the legend in the upper right corner of the plot region.

R Commander uses the scatterplot() function from package car to create the plot. It has several options that are not included on the options dialog window in R Commander, but they can be accessed simply by editing the command that R Commander creates. To see these options, type

    ?scatterplot

on an empty line in the R Script window, put the cursor on the line, and click Submit. This will open your web browser with the manual page for scatterplot.

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of INGRAM Joanne
Sent: Tuesday, July 14, 2015 9:53 AM
To: r-help@R-project.org
Subject: [R] Plot in Rcmdr

Hello,

I wondered if anyone could help me with a small issue in Rcmdr. I have used the 'Graphs' function in the drop-down menu to create a scatterplot for groups (gender). But when I do this the legend (telling me the symbols which represent male etc.) keeps obscuring the title of the plot. Does anyone know how to fix this problem within Rcmdr?

Please note I am not looking for help with creating the graph in another way (for example in R). I am specifically trying to figure out if this can be fixed in Rcmdr. If the answer is "No, this cannot currently be changed within Rcmdr", I would still like to hear from you.

Many thanks for any help.

Joanne Ingram
Research Associate (Medical Statistics)
Centre for Population Health Science
University of Edinburgh

-- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Re: [R] open connection to system
On Tue, Jul 14, 2015 at 11:45 AM, Adrian Dușa dusa.adr...@unibuc.ro wrote:

Dear list,

Probably not the best subject line, but hopefully I can explain. I would like to use R to open a connection to a (system) command-line-based chess engine (for example, there is an open source one at stockfishchess.org).

In the Terminal window (using MacOS), I can type two commands:

    $ ./stockfish-6-64        <-- this is the first command
    Stockfish 6 64 by Tord Romstad, Marco Costalba and Joona Kiiski
    go movetime 3000          <-- this is the second command
    (then lots of lines calculated by the engine, with a final answer after 3 seconds)

The first command opens a connection to the chess engine; the second one tells it to search for a move. The question is, can I do this via R? I tried the system() command, which works with the first command:

    system("./stockfish-6-64", intern=TRUE)
    [1] "Stockfish 6 64 by Tord Romstad, Marco Costalba and Joona Kiiski"

but it closes the connection and returns an error if I attempt the second command:

    system("./stockfish-6-64\ngo movetime 3000", intern=TRUE)
    Error in system("./stockfish-6-64\ngo movetime 3000", intern = TRUE) :
      error in running command
    sh: line 1: go: command not found

Any hint would be really appreciated, thanks in advance,
Adrian

--
Adrian Dusa
University of Bucharest

What system() does is run a command and wait for it to end. I take it you are running on a Mac. Do you want to send multiple commands to stockfish, or only one command? If the latter, you can do something like:

    commands=c("go movetime 3000");
    system("./stockfish-6-64",intern=TRUE,input=commands);

If you want to send a number of commands, and not interact with the stockfish command, you can:

    commands=c("first command","second command"); # and so on
    system("./stockfish-6-64",intern=TRUE,input=commands);

But if you want to interact with stockfish, that's much more difficult and I don't have any example available. I _think_ you'd need to look at using mcparallel() and the parallel package. Or maybe the socketConnection() function in some way.

--
Schrodinger's backup: The condition of any backup is unknown until a restore is attempted.
Yoda of Borg, we are. Futile, resistance is, yes. Assimilated, you will be.
He's about as useful as a wax frying pan.
10 to the 12th power microphones = 1 Megaphone

Maranatha!
John McKown
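The one-shot approach described in the reply, handing a program its commands through the input argument of system(), can be tried without a chess engine installed. This is only a minimal sketch, assuming a Unix-like system with cat on the PATH; cat is a stand-in for ./stockfish-6-64 and simply echoes whatever arrives on its standard input.

```r
# Lines to feed to the program's standard input, one command per element.
commands <- c("first command", "second command")

# 'cat' stands in for the engine here (an assumption for illustration).
# system() copies 'commands' to a temporary file, redirects that file to
# the program's standard input, waits for the program to exit, and, with
# intern = TRUE, returns what the program printed as a character vector.
reply <- system("cat", intern = TRUE, input = commands)
print(reply)
```

Because system() returns only after the program has exited, this pattern suits a fixed batch of commands and not a back-and-forth conversation. Whether a particular engine finishes its search before exiting at end of input depends on the engine, so the real call may need experimenting.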
Re: [R] Spline curve
Yikes! That almost requires a book. Why don't you start by doing some of your own homework instead? Search on "fit spline curves to longitudinal data in R" and then go through the tutorials/presentations that pop up. There will also be books that you will find, I'm sure, if you're serious about this.

Cheers,
Bert

Bert Gunter
"Data is not information. Information is not knowledge. And knowledge is certainly not wisdom." -- Clifford Stoll

On Sun, Jun 28, 2015 at 9:56 PM, DzR devazresea...@gmail.com wrote:

I wish to fit spline curves to longitudinal data. Which package should I use, and how should the data be structured to facilitate the analysis?

- Deva