Re: [R] R help - Web Scraping of Google News using R

2016-05-24 Thread boB Rudis
What you are doing wrong is both trying yourself and asking others to violate Google's Terms of Service and (amongst other things) get your IP banned along with anyone who aids you (or worse). Please don't. Just because something can be done does not mean it should be done. On Tue, May 24, 2016

Re: [R] Mixed model analysis

2016-05-24 Thread Bert Gunter
This has nothing to do with R, per se. This is a statistical issue. You need to work with a statistician, as your statistical background is inadequate (google "mixed effects models") if you really need this. Cheers, Bert On Tue, May 24, 2016 at 7:27 PM Neny Sitorus

[R] Mixed model analysis

2016-05-24 Thread Neny Sitorus
Hi, what exactly is mixed model analysis in R? Could someone give me a better description? Thank you, Neny

Re: [R] Sprintf to call data frame from environment

2016-05-24 Thread Jim Lemon
Hi Beatriz, I'll guess that you have a number of files with names like this: Samples_1.txt Samples_2.txt ... Each one can be read with a function like read.table and will return a data frame with default names (V1, V2, ...). You then want to extract the first element (column) of the data frame.

Re: [R] Sprintf to call data frame from environment

2016-05-24 Thread David Winsemius
> On May 24, 2016, at 2:01 PM, Beatriz wrote: > > > In my environment I have a data frame called Samples_1.txt. > From this data frame I need to get variable V1. My code doesn't work. Thanks! > > $V1 > > Note: I need to do it in this way because I have the

Re: [R] Sprintf to call data frame from environment

2016-05-24 Thread Nordlund, Dan (DSHS/RDA)
It is not clear (at least to me) what your actual task is. But, if Samples_1.txt is the actual name of a data frame that exists in memory (and not a filename), then you need to wrap the sprintf() in a get() function. get(sprintf("Samples_%s.txt", 1))$V1 I am no expert on "computing on the
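The get() approach above can be sketched as follows. This is only a minimal illustration: the two data frames and their values are hypothetical stand-ins for the poster's objects, and the loop bounds are an assumption.

```r
# Hypothetical data frames standing in for the poster's objects
Samples_1.txt <- data.frame(V1 = c(1.5, 2.5, 3.5))
Samples_2.txt <- data.frame(V1 = c(10, 20, 30))

# sprintf() only builds the object's name as a character string;
# get() looks up the object that carries that name.
first_columns <- list()
for (i in 1:2) {
  object_name <- sprintf("Samples_%s.txt", i)
  first_columns[[object_name]] <- get(object_name)$V1
}

# mget() fetches several objects at once and returns them as a named list,
# which is often easier to work with than repeated get() calls.
all_samples <- mget(sprintf("Samples_%s.txt", 1:2))
v1_means <- sapply(all_samples, function(d) mean(d$V1))
```

Keeping the data frames in a named list from the start (for example, the result of lapply() over the file names) usually avoids the need for get() altogether.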

[R] Sprintf to call data frame from environment

2016-05-24 Thread Beatriz
In my environment I have a data frame called Samples_1.txt. From this data frame I need to get variable V1. My code doesn't work. Thanks! sprintf("Samples_%s.txt", 1)$V1 Note: I need to do it this way because the code is inside a for loop.

[R] Sprintf to call data frame from environment

2016-05-24 Thread Beatriz
In my environment I have a data frame called Samples_1.txt. From this data frame I need to get variable V1. My code doesn't work. Note: I need to do it this way because the code is inside a for loop. sprintf("Samples_%s.txt", 1)$V1

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Matthew
Thank you very much, Dan. These work great. Two more great answers to my question. Matthew On 5/24/2016 4:15 PM, Nordlund, Dan (DSHS/RDA) wrote: You have several options. 1. You could use the aggregate function. If your data frame is called DF, you could do something like with(DF,

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Matthew
Thanks, Tom. I was making a mistake looking at your example and that's what my problem was. Cool answer, works great. Thank you very much. Matthew On 5/24/2016 4:23 PM, Tom Wright wrote: > Don't see that as being a big problem. If your data grows then dplyr > supports connections to external

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Tom Wright
Don't see that as being a big problem. If your data grows then dplyr supports connections to external databases. Alternately if you just want a mean, most databases can do that directly in SQL. On Tue, May 24, 2016 at 4:17 PM, Matthew wrote: > Thank you very

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Matthew
Thank you very much, Tom. This gets me thinking in the right direction. One thing I should have mentioned is that the data frame will have a little over 40,000 rows. On 5/24/2016 4:08 PM, Tom Wright wrote: > Using dplyr > > $ library(dplyr) > $

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Nordlund, Dan (DSHS/RDA)
You have several options. 1. You could use the aggregate function. If your data frame is called DF, you could do something like with(DF, aggregate(Length, list(Identifier), mean)) 2. You could use the dplyr package like this library(dplyr) summarize(group_by(DF, Identifier),
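A minimal base-R sketch of the aggregate() option follows. The column names Length and Identifier come from the message above; the values are borrowed from Tom Wright's example elsewhere in this thread and are not the original poster's data.

```r
# Small example data frame (values borrowed from another reply in the thread)
DF <- data.frame(
  Length = c(321, 350, 340, 180, 198),
  Identifier = c("A234", "A234", "A234", "B123", "B225")
)

# Formula interface: one row per Identifier, holding the mean Length
means <- aggregate(Length ~ Identifier, data = DF, FUN = mean)

# Identifiers that occur more than once
dup_ids <- names(which(table(DF$Identifier) > 1))
```

The formula interface keeps the original column names in the result, whereas the list(Identifier) form labels the columns Group.1 and x.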

Re: [R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Tom Wright
Using dplyr $ library(dplyr) $ x<-data.frame(Length=c(321,350,340,180,198), ID=c(rep('A234',3),'B123','B225') ) $ x %>% group_by(ID) %>% summarise(m=mean(Length)) On Tue, May 24, 2016 at 3:46 PM, Matthew wrote: > I have a data frame

[R] identify duplicate entries in data frame and calculate mean

2016-05-24 Thread Matthew
I have a data frame with 10 columns. In the last column is an alphanumeric identifier. For most rows, this alphanumeric identifier is unique to the file; however, some of these alphanumeric identifiers occur in duplicate, triplicate or more. When they do occur more than once they are in

Re: [R] Creating a data frame from scratch

2016-05-24 Thread Nordlund, Dan (DSHS/RDA)
I would probably write the function something like this: t_count_na <- function(dataset, variables = "all") { if (identical(variables, "all")) { variable_list <- names(dataset) } else { variable_list <- variables } apply(dataset[,variable_list], 1,

[R] Creating a data frame from scratch

2016-05-24 Thread G . Maubach
Hi All, I need to create a data frame from scratch and fill variables created on the fly with values. What I have so far: -- snip -- # Example dataset gene <- c("ENSG0208234","ENSG0199674","ENSG0221622","ENSG0207604",

[R] R help - Web Scraping of Google News using R

2016-05-24 Thread Kumar Gauraw
Hello Experts, I am trying to scrape data from Google News for a particular topic using the XML and Curl packages of R. I am able to extract the summary part of the news through *XPath*, but extracting the titles and links of the news in a similar way is not working. Please note this work is

[R] Downloading attachment from gmail

2016-05-24 Thread Christofer Bogaso
Hi folks, I am wondering whether it is possible to write R code that does the following: 1. Log in to a Gmail account (account name and password will be provided to R) 2. Search for all mails which have the word "ABCD" in the mail body 3. Download all the attachments (if available) which will

Re: [R] numeric inputs to sweep produce NaN...

2016-05-24 Thread David Winsemius
> On May 24, 2016, at 8:49 AM, Witold E Wolski wrote: > > I have two inputs to sweep which are numeric (with a few NA's) but the > output is NaN. Why? > > >> sum(!is.numeric(unlist(protquant))) > [1] 0 >> sum(!is.numeric(normalize)) > [1] 0 >> normprotquant <-

Re: [R] Factor Variable frequency

2016-05-24 Thread ruipbarradas
Hello, Maybe the following (untested). table(df$Protocol[df$Speed == "SLOW"]) Hope this helps, Rui Barradas Quoting ch.elahe via R-help: > Hi all, > I have the following df: > > $ Protocol : Factor w/ 48 levels "DP FS QTSE SAG",..: 2 3 > 43 42 31 36 37 30
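The table() call above returns the counts for every level; picking out the most frequent one could look like the sketch below. The data frame is a hypothetical miniature with made-up protocol names, not the poster's 48-level factor.

```r
# Hypothetical miniature of the poster's data
df <- data.frame(
  Protocol = factor(c("A", "B", "B", "C", "B", "A")),
  Speed = c("SLOW", "SLOW", "SLOW", "VerySLOW", "SLOW", "FAST"),
  stringsAsFactors = FALSE
)

# Count each Protocol level among the rows where Speed is "SLOW"
slow_counts <- table(df$Protocol[df$Speed == "SLOW"])

# which.max() gives the position of the largest count;
# if several levels tie, only the first one is returned.
most_frequent <- names(slow_counts)[which.max(slow_counts)]
```

Sorting the table with sort(slow_counts, decreasing = TRUE) is an alternative when the full ranking, including ties, is of interest.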

[R] file connection when using parallel

2016-05-24 Thread Arnaud Mosnier
Dear UserRs, I have a little problem creating a file connection when working in parallel (see the reproducible script below). I am sure this is something obvious. Can you enlighten me? Thanks, Arnaud # This part works # cat("This is a test file" , file={f <- tempfile()}) con

[R] Factor Variable frequency

2016-05-24 Thread ch.elahe via R-help
Hi all, I have the following df: $ Protocol : Factor w/ 48 levels "DP FS QTSE SAG",..: 2 3 43 42 31 36 37 30 28 5 ... $ Speed : chr "SLOW" "SLOW" "SLOW" "VerySLOW" ... How can I get the most frequent Protocol when Speed is "SLOW"? Thanks for any help! Elahe

[R] numeric inputs to sweep produce NaN...

2016-05-24 Thread Witold E Wolski
I have two inputs to sweep which are numeric (with a few NA's) but the output is NaN. Why? > sum(!is.numeric(unlist(protquant))) [1] 0 > sum(!is.numeric(normalize)) [1] 0 > normprotquant <- sweep(protquant, 2, normalize, "-" ) > sum(is.nan(unlist(normprotquant))) [1] 31 version R 3.3.0
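The snippet does not show what the data contained, so the following is only a sketch of one possible cause, under the assumption that both inputs hold infinite values (for example from taking log(0)). The matrix and vector below are made up for illustration. Note also that is.numeric() tests the type of the whole vector and returns a single TRUE or FALSE, so the checks above cannot detect NaN or Inf entries.

```r
# is.numeric() checks the storage type, not the individual values
type_check <- is.numeric(c(1, NA, NaN, Inf))

# Hypothetical inputs: -Inf appears in both the data and the normalizer
protquant <- matrix(c(1, -Inf, NA, 4), nrow = 2)
normalize <- c(-Inf, 2)

# Subtract normalize[j] from every entry of column j
normprotquant <- sweep(protquant, 2, normalize, "-")

# -Inf - (-Inf) is NaN, while NA - 2 stays NA
n_nan <- sum(is.nan(normprotquant))
n_infinite_inputs <- sum(is.infinite(protquant)) + sum(is.infinite(normalize))
```

Counting is.infinite() and is.nan() on each input before calling sweep() is a quick way to test whether this explanation applies to the real data.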

Re: [R] mgcv::gam(): NA parametric coefficient in a model with two categorical variables + model interpretation

2016-05-24 Thread Fotis Fotiadis
Dear Prof. Wood, Thank you, again, for your immediate response. Best, Fotis On Mon, May 23, 2016 at 4:32 PM, Simon Wood wrote: > Q1: It looks like the model is not fully identifiable given the data and > as a result igcCAT.ideo has been set to zero - there is no sensible

[R-es] RV: aov

2016-05-24 Thread Dr. José A. Betancourt Bethencourt
Dear all, how could the script or its source be modified so that the numbers are clearly visible in the output? In this example they are partially cut off. Data and script are attached. Regards, José ##

[R] R Course in Dublin (July 20th-22nd, 2016) Introductory -> Modern

2016-05-24 Thread Antony Unwin
An R course from introductory to modern will be given by Louis Aslett (Oxford University, author of the packages PhaseType and ReliabilityTheory) and Antony Unwin (author of the book “Graphical Data Analysis with R” CRC Press 2015 http://www.gradaanwr.net). The course will be offered again on

[R] Course: Introduction to Zero Inflated Models

2016-05-24 Thread Highland Statistics Ltd
There are places available on the following course: Course: Introduction to Zero Inflated Models (Bayesian and frequentist approaches) When: 13-17 June 2016 Where: Australian Institute of Marine Science, Perth, Australia Course website: http://highstat.com/statscourse.htm Course flyer: