Re: [R-es] help: awk and shells in R
Dear Javier Villacampa González,

It has been a long time since I last used awk or gawk, but I remember Cygwin, and personally I never had any trouble with awk under it. I don't know what state that technology is in today, but I did evaluate using R together with awk; there is an integration documented at http://www.inside-r.org/packages/cran/Kmisc/docs/awk

Javier Rubén Marcuzzi
Dairy Industry Technician
Veterinarian

From: Javier Villacampa González
Sent: Monday, 8 June 2015, 03:05 p.m.
To: Carlos Ortega
CC: R-help-es@r-project.org

> In the end it turned out to be easier than expected. You have to install Cygwin and call the commands like this:
>
> system('C:/cygwin/bin/wc -l var_risco_2012.csv')
>
> This works, in principle.
>
> On 8 June 2015 at 17:41, Carlos Ortega c...@qualityexcellence.es wrote:
>
>> Hello,
>>
>> Take a look at this:
>> http://stackoverflow.com/questions/18603984/using-system-with-windows
>>
>> Regards,
>> Carlos Ortega
>> www.qualityexcellence.es
>>
>> On 8 June 2015 at 17:14, Javier Villacampa González javier.villacampa.gonza...@gmail.com wrote:
>>
>>> Hello,
>>>
>>> I sometimes call Unix shell commands from R. Is there a way to use these shell commands, or the awk language, from Windows? The idea is always to do it from R; perhaps it is possible by invoking Cygwin from Windows, but it is not clear to me how.
>>>
>>> Best regards and thanks in advance,
>>> Javier
>>>
>>> #_
>>> # EXAMPLE: what would have to go in place of
>>> # ¿¿???
>>> # assuming I have Cygwin installed?
>>> #_
>>> # An example would be changing some MËG to MEG, since fread does not read the Ë correctly
>>> file.rename(from = "Data/data.csv", to = "Data/data_2.csv")
>>> switch(OS,
>>>   WIN = system( ¿¿??? ),
>>>   MAC = system(command = "awk '{gsub(\"M.?G\", \"MEG\"); print}' Data/data_2.csv > Data/data_2_2.csv")
>>> )
>>> file.rename(from = "Data/data.csv", to = "Data/data_2.csv")

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es
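If the goal is only the substitution from the example in this thread (MËG to MEG), one option that avoids the Cygwin question altogether is to do the replacement in R itself, which behaves the same on Windows and Mac. This is a minimal sketch, assuming the file fits in memory and is UTF-8 encoded; the helper name replace_in_file and the temporary files are hypothetical and not from the thread.

```r
# Hypothetical helper: apply a regular-expression substitution to every line
# of a text file and write the result to a new file.
replace_in_file <- function(infile, outfile, pattern, replacement) {
  lines <- readLines(infile, encoding = "UTF-8", warn = FALSE)
  writeLines(gsub(pattern, replacement, lines), outfile, useBytes = TRUE)
  invisible(outfile)
}

# Demonstration on temporary files (stand-ins for Data/data_2.csv etc.)
src <- tempfile(fileext = ".csv")
dst <- tempfile(fileext = ".csv")
writeLines(c("id,unit", "1,M\u00CBG", "2,MEG"), src, useBytes = TRUE)

# Same pattern as the awk call in the thread: M, at most one character, G
replace_in_file(src, dst, "M.?G", "MEG")
result <- readLines(dst, encoding = "UTF-8")
print(result)
```

For commands that have no convenient R equivalent, the full-path form reported in the thread, system('C:/cygwin/bin/wc -l var_risco_2012.csv'), is the approach the poster says worked.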
Re: [R] web scraping image
Thanks to Jim's prompting, I think I came up with a fairly painless way to parse the HTML without having to write any parsing code myself, using the function getHTMLExternalFiles in the XML package. A working version of the code follows:

## Code to process USGS peak flow data
require(dataRetrieval)
require(XML)

## Need to start with list of gauge ids to process
siteno <- c('12142000','12134500','12149000')
lstas <- length(siteno) # length of locator list

print(paste('Processing...', siteno[1], sep = ""))
datall <- readNWISpeak(siteno[1])

for (a in 2:lstas) {
  # Print station being processed
  print(paste('Processing...', siteno[a], sep = ""))
  dat <- readNWISpeak(siteno[a])
  datall <- rbind(datall, dat)
}

write.csv(datall, file = "usgs_peaks.csv")

# Retrieve ascii text files and graphics
for (a in 1:lstas) {
  print(paste('Processing...', siteno[a], sep = ""))
  graphic.url <- paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=', siteno[a], '&agency_cd=USGS&format=img', sep = "")
  usgs.img <- getHTMLExternalFiles(graphic.url)
  graphic.img <- paste('http://nwis.waterdata.usgs.gov', usgs.img, sep = "")
  peakfq.url <- paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=', siteno[a], '&agency_cd=USGS&format=hn2', sep = "")
  tab.url <- paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=', siteno[a], '&agency_cd=USGS&format=rdb', sep = "")
  graphic.fn <- paste('graphic_', siteno[a], '.gif', sep = "")
  peakfq.fn <- paste('peakfq_', siteno[a], '.txt', sep = "")
  tab.fn <- paste('tab_', siteno[a], '.txt', sep = "")
  download.file(graphic.img, graphic.fn, mode = 'wb')
  download.file(peakfq.url, peakfq.fn)
  download.file(tab.url, tab.fn)
}

--

Date: Fri, 5 Jun 2015 08:59:04 +1000
From: Jim Lemon drjimle...@gmail.com
To: Curtis DeGasperi curtis.degasp...@gmail.com
Cc: r-help mailing list r-help@r-project.org
Subject: Re: [R] web scraping image

> Hi Chris,
>
> I don't have the packages you are using, but tracing this indicates that the page source contains the relative path of the graphic, in this case:
>
> /nwisweb/data/img/USGS.12144500.19581112.20140309..0.peak.pres.gif
>
> and you already have the server URL:
>
> nwis.waterdata.usgs.gov
>
> Getting the path out of the page source isn't difficult: just split the text at double quotes and get the token following img src=. If I understand the arguments of download.file correctly, the path is the graphic.fn argument and the server URL is the graphic.url argument. I would paste them together and display the result to make sure that it matches the image you want. When I did this, the correct image appeared in my browser. I'm using Google Chrome, so I don't have to prepend the http://
>
> Jim
>
> On Fri, Jun 5, 2015 at 2:31 AM, Curtis DeGasperi curtis.degasp...@gmail.com wrote:
>
>> I'm working on a script that downloads data from the USGS NWIS server. dataRetrieval makes it easy to quickly get the data in a neat tabular format, but I was also interested in getting the tabular text files, which is also fairly easy for me using download.file.
>>
>> However, I'm not skilled enough to work out how to download the nice graphic files that can be produced dynamically from the USGS NWIS server (for example: http://nwis.waterdata.usgs.gov/nwis/peak?site_no=12144500&agency_cd=USGS&format=img )
>>
>> My question is: how do I get the image from this web page and save it to a local directory? scrapeR returns the information from the page and I suspect this is a possible solution path, but I don't know what the next step is.
>>
>> My code provided below works from a list I've created of USGS flow gauging stations.
>>
>> Curtis
>>
>> ## Code to process USGS daily flow data for high and low flow analysis
>> ## Need to start with list of gauge ids to process
>> ## Can't figure out how to automate download of images
>> require(dataRetrieval)
>> require(data.table)
>> require(scrapeR)
>>
>> df <- read.csv("usgs_stations.csv", header=TRUE)
>> lstas <- length(df$siteno) # length of locator list
>>
>> print(paste('Processing...', df$name[1], ' ', df$siteno[1], sep = ""))
>> datall <- readNWISpeak(df$siteno[1])
>>
>> for (a in 2:lstas) {
>>   # Print station being processed
>>   print(paste('Processing...', df$name[a], ' ', df$siteno[a], sep = ""))
>>   dat <- readNWISpeak(df$siteno[a])
>>   datall <- rbind(datall, dat)
>> }
>>
>> write.csv(datall, file = "usgs_peaks.csv")
>>
>> # Retrieve ascii text files and graphics
>> for (a in 1:lstas) {
>>   print(paste('Processing...', df$name[a], ' ', df$siteno[a], sep = ""))
>>   graphic.url <- paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=', df$siteno[a], '&agency_cd=USGS&format=img', sep = "")
>>   peakfq.url <- paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=', df$siteno[a], '&agency_cd=USGS&format=hn2', sep = "")
>>   tab.url <- paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=', df$siteno[a], '&agency_cd=USGS&format=rdb', sep = "")
>>   graphic.fn <-
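Jim's suggestion in this thread (take the token that follows img src= in the page source and paste it onto the server URL) can be sketched in base R without any parsing package. This is only an illustration: the page string below is made up around the relative path Jim quotes, and the regular expression assumes the src attribute is double-quoted.

```r
# Made-up page source for illustration; only the image path comes from the thread
page <- paste0('<html><body><img src="/nwisweb/data/img/',
               'USGS.12144500.19581112.20140309..0.peak.pres.gif" alt="peak">',
               '</body></html>')

# Capture the text between the double quotes that follow src= inside an img tag
m <- regmatches(page, regexec('<img[^>]*src="([^"]+)"', page))
img.path <- m[[1]][2]   # element 1 is the whole match, element 2 the captured path

# Prepend the server URL to get something download.file() can fetch
server <- "http://nwis.waterdata.usgs.gov"
graphic.img <- paste0(server, img.path)
print(graphic.img)

# Needs network access, so it is left commented out here:
# download.file(graphic.img, "graphic_12144500.gif", mode = "wb")
```

On a real page with several images, the same pattern with gregexpr() would return every match, and a proper HTML parser (as in the working code in this thread) is the more robust choice.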
Re: [R-es] help: awk and shells in R
In the end it turned out to be easier than expected. You have to install Cygwin and call the commands like this:

system('C:/cygwin/bin/wc -l var_risco_2012.csv')

This works, in principle.

On 8 June 2015 at 17:41, Carlos Ortega c...@qualityexcellence.es wrote:

> Hello,
>
> Take a look at this:
> http://stackoverflow.com/questions/18603984/using-system-with-windows
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> On 8 June 2015 at 17:14, Javier Villacampa González javier.villacampa.gonza...@gmail.com wrote:
>
>> Hello,
>>
>> I sometimes call Unix shell commands from R. Is there a way to use these shell commands, or the awk language, from Windows? The idea is always to do it from R; perhaps it is possible by invoking Cygwin from Windows, but it is not clear to me how.
>>
>> Best regards and thanks in advance,
>> Javier
>>
>> #_
>> # EXAMPLE: what would have to go in place of
>> # ¿¿???
>> # assuming I have Cygwin installed?
>> #_
>> # An example would be changing some MËG to MEG, since fread does not read the Ë correctly
>> file.rename(from = "Data/data.csv", to = "Data/data_2.csv")
>> switch(OS,
>>   WIN = system( ¿¿??? ),
>>   MAC = system(command = "awk '{gsub(\"M.?G\", \"MEG\"); print}' Data/data_2.csv > Data/data_2_2.csv")
>> )
>> file.rename(from = "Data/data.csv", to = "Data/data_2.csv")
[R] subsetting a dataframe
Dear all,

I would appreciate your suggestions on subsetting a data frame. Please consider this example data frame df:

dd <- c(1, 2, 3)
rows <- c("A1", "A2", "A3")
columns <- c("B1", "B2", "B3")
numbers <- c(400, 500, 600)
df <- data.frame(dd, rows, columns, numbers)

and a vector:

test_rows <- c("A1", "A3")

How could I subset df by the vector test_rows, so that only the rows of df whose df$rows value matches an element of test_rows (A1 and A3) are listed?

Thank you very much,

-- bogdan

[[alternative HTML version deleted]]

______
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
[R] load a very big .RData - error reading from connection
Hi,

How can I load a very big .RData file? It cannot be loaded (it is very big), and the following error message is displayed:

load(".RData")
Error: error reading from connection

Thanks,
Carol
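The thread does not establish why load() fails here. As a hedged starting point only, the sketch below loads into a separate environment and reports the error instead of stopping, and shows saving large objects one per file with saveRDS() so that each can be read back on its own. The helper name safe_load and the toy objects are hypothetical.

```r
# Hypothetical helper: load an .RData file into its own environment and
# return NULL (with a message) if load() signals an error.
safe_load <- function(path) {
  env <- new.env()
  ok <- tryCatch({
    load(path, envir = env)
    TRUE
  }, error = function(e) {
    message("load() failed: ", conditionMessage(e))
    FALSE
  })
  if (ok) env else NULL
}

# Demonstration with a small temporary file
big <- list(a = 1:10, b = letters)
rdata <- tempfile(fileext = ".RData")
save(big, file = rdata)
env <- safe_load(rdata)
loaded.names <- ls(env)

# A path that does not exist fails cleanly instead of aborting the script
missing.env <- suppressWarnings(safe_load(tempfile(fileext = ".RData")))

# Alternative for future runs: one object per file, readable independently
rds <- tempfile(fileext = ".rds")
saveRDS(big$a, rds)
a.again <- readRDS(rds)
```

Checking file.info(path)$size against the size expected at save time is a cheap way to spot a truncated file before trying to load it.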
[R] Looping Through List of .csv Files to Work with Subsets of the Data
Hello,

I want to subset specific rows of data from 80 .csv files and write those subsets into new .csv files. The data I want to subset start on a different row in each original .csv file. I've created variables that identify which rows each subset should start and end on, but I want to loop through this process and I am not sure what to do. I've attempted to write the loop below, although much of it is pseudocode. If anyone can provide me with some tips I'd appreciate it.

# This data file is used to create the variables where the subsetting starts and ends for each participant
mig.data <- read.csv("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data Syntax/mig.data.csv")

# These are the variable names for the start and end of each subset of relevant data (baseline, audio, and free)
participant.ids <- mig.processed.data$participant.id
participant.baseline.start <- mig.processed.data$baseline.row.start
participant.baseline.end <- mig.processed.data$baseline.row.end
participant.audio.start <- mig.processed.data$audio.meditation.row.start
participant.audio.end <- mig.processed.data$audio.meditation.row.end
participant.free.start <- mig.processed.data$free.meditation.row.start
participant.free.end <- mig.processed.data$free.meditation.row.end

# read into a list the individual files from which to subset the data
participant.files <- list.files("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data Syntax/MIG_RAW DATA TXT Files/Plain Text Files")

# loop through each participant
for (i in 1:length(participant.files)) {
  # get baseline rows
  results.baseline <- participant.files[participant.baseline.start[i]:participant.baseline.end[i],]
  # get audio rows
  results.audio <- participant.files[participant.audio.start[i]:participant.audio.end[i],]
  # get free rows
  results.free <- participant.files[participant.free.start[i]:participant.free.end[i],]
  # write out participant relevant data
  write.csv(results.baseline, file = "baseline[i].csv")
  write.csv(results.audio, file = "audio[i].csv")
  write.csv(results.free, file = "free[i].csv")
}

--
Chad M. Danyluck, MA
PhD Candidate, Psychology
University of Toronto

“There is nothing either good or bad but thinking makes it so.” - William Shakespeare
Re: [R] subsetting a dataframe
Use is.element(elements, set), or its equivalent, elements %in% set:

df <- data.frame(dd = c(1, 2, 3),
                 rows = c("A1", "A2", "A3"),
                 columns = c("B1", "B2", "B3"),
                 numbers = c(400, 500, 600))
test_rows <- c("A1", "A3")
df[ is.element(df$rows, test_rows), ]
#   dd rows columns numbers
# 1  1   A1      B1     400
# 3  3   A3      B3     600

Bill Dunlap
TIBCO Software
wdunlap tibco.com
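For comparison, here is a small self-contained sketch of the same subsetting written three ways. It also shows why == is not a substitute for %in% when the lookup vector is shorter than the column: == recycles the shorter vector and compares position by position. The object names by.in, by.subset and by.eq are illustrative only.

```r
df <- data.frame(dd = c(1, 2, 3),
                 rows = c("A1", "A2", "A3"),
                 columns = c("B1", "B2", "B3"),
                 numbers = c(400, 500, 600),
                 stringsAsFactors = FALSE)
test_rows <- c("A1", "A3")

# Membership test: TRUE for every row whose value appears anywhere in test_rows
by.in <- df[df$rows %in% test_rows, ]

# subset() evaluates the condition inside the data frame
by.subset <- subset(df, rows %in% test_rows)

# Pitfall: == compares element by element with recycling, so "A3" in row 3
# is compared with "A1" and is missed (R also warns about the lengths)
by.eq <- suppressWarnings(df[df$rows == test_rows, ])

print(by.in)
print(nrow(by.eq))
```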
Re: [R] Looping Through List of .csv Files to Work with Subsets of the Data
Thank you Don. I've incorporated your suggestions, which have helped me understand how loops work better than I did before. However, the loop gets stuck trying to read the current file:

mig.processed.data <- read.csv("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data Syntax/mig.log.data.addition.csv")

## ASSUMPTION: Starting with augmented processedbook and correct free.meditation.end
## Read in all data files and loop through to create new data files segmented by the rows identified before

# get required data
participant.ids <- mig.processed.data$participant.id
participant.baseline.start <- mig.processed.data$baseline.row.start
participant.baseline.end <- mig.processed.data$baseline.row.end
participant.audio.start <- mig.processed.data$audio.meditation.row.start
participant.audio.end <- mig.processed.data$audio.meditation.row.end
participant.free.start <- mig.processed.data$free.meditation.row.start
participant.free.end <- mig.processed.data$free.meditation.row.end

participant.files <- list.files("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data Syntax/MIG_RAW DATA TXT Files/Plain Text Files")

for (i in 1:length(participant.files)) {
  id <- participant.files[i]
  ## if id is numeric, e.g., 1, 2, 3 ... 80 then I would do this
  ## to ensure that the files sort properly when viewed by the operating system
  idc <- formatC(id, width=3, flag='0')
  # current file
  crnt.file[i] <- read.csv( participant.files[i] )
  ## base
  tmp.base <- crnt.file[participant.baseline.start:participant.baseline.end, ]
  write.csv(tmp.base, file=paste0('baseline',idc,'.csv'))
  ## audio
  tmp.audio <- crnt.file[participant.audio.start:participant.audio.end, ]
  write.csv(tmp.audio, file=paste0('audio',idc,'.csv'))
  ## free
  tmp.free <- crnt.file[participant.free.start:participant.free.end, ]
  write.csv(tmp.free, file=paste0('free',idc,'.csv'))
}

The error message reads:

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : cannot open file '103.csv': No such file or directory

So it seems to be calling the first file in the list but getting stuck. Any suggestions?

Best,
Chad
Re: [R] Looping Through List of .csv Files to Work with Subsets of the Data
> participant.files <- list.files("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data Syntax/MIG_RAW DATA TXT Files/Plain Text Files")

Try adding the argument full.names=TRUE to that call to list.files().

Bill Dunlap
TIBCO Software
wdunlap tibco.com
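The effect of full.names=TRUE can be seen on a throwaway folder. In the sketch below the folder and its single file 103.csv are created just for the demonstration (the file name echoes the one in the error message); nothing here touches the poster's real directories.

```r
# Temporary stand-in for the "Plain Text Files" folder
dir.path <- tempfile("plain_text_files")
dir.create(dir.path)
write.csv(data.frame(x = 1:3), file.path(dir.path, "103.csv"), row.names = FALSE)

# Default: bare file names, only valid relative to dir.path
short.names <- list.files(dir.path)

# full.names = TRUE: the folder is prepended, so the path works from anywhere
full.paths <- list.files(dir.path, full.names = TRUE)

# The bare name is looked up in the current working directory and not found there
# (assuming the working directory is not dir.path and holds no 103.csv of its own)
found.short <- file.exists(short.names)

# The full path can be read directly
dat <- read.csv(full.paths[1])
print(short.names)
print(nrow(dat))
```

An equivalent fix is to keep the short names and build the path at read time with file.path(dir.path, short.names[i]), which also leaves the bare name available for labelling output files.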
Re: [R] Looping Through List of .csv Files to Work with Subsets of the Data
So you have 80 files, one for each participant?

It appears that from each of the 80 files you want to extract three subsets of rows:
  one set for baseline
  one set for audio
  one set for free

What I think I would do, if the above is correct, is create one master file. This file will have eight columns (I'll show an example column name, followed by a description):

  id    participant id
  fn    file name for that participant
  srb   start row for baseline
  erb   end row for baseline
  sra   start row for audio
  era   end row for audio
  srf   start row for free
  erf   end row for free

This may be fairly close to what you already have, but I'm not sure.

I would then load the master file into R:

mstf <- read.csv( {the master file} )

Then loop through its rows. Since each row has all the information necessary to read the participant's individual file and identify which rows to subset, a loop like this should work:

for (irow in seq_len(nrow(mstf))) {
  id <- mstf$id[irow]
  ## if id is numeric, e.g., 1, 2, 3 ... 80 then I would do this
  ## to ensure that the files sort properly when viewed by the operating system
  idc <- formatC(id, width=2, flag='0')
  crnt.file <- read.csv( mstf$fn[irow] )
  ## base
  tmp.base <- crnt.file[ mstf$srb[irow]:mstf$erb[irow] , ]
  write.csv(tmp.base, file=paste0('baseline',idc,'.csv'))
  ## audio
  tmp.audio <- crnt.file[ mstf$sra[irow]:mstf$era[irow] , ]
  write.csv(tmp.audio, file=paste0('audio',idc,'.csv'))
  ## free
  tmp.free <- crnt.file[ mstf$srf[irow]:mstf$erf[irow] , ]
  write.csv(tmp.free, file=paste0('free',idc,'.csv'))
}

Obviously, I can't test this. And there may be (likely are!) some typos in it.

Note that it's not necessary to create variables that identify which row the subset should start and end on; these are just looked up from the master file when needed. Similarly, the three respective subsets are stored in temporary data frames, because they are not (I presume) needed when the whole thing is done. (If they were needed, then a different strategy would be more appropriate.)

There are different ways to index the loop. I just picked one.

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062
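The master-file approach described in this thread can be tried end to end on made-up data. The sketch below uses the column names proposed in the thread (id, fn, srb, erb, sra, era, srf, erf), but the two participant files, their contents and the row ranges are invented for illustration, and everything is written to a temporary folder.

```r
work.dir <- tempfile("mig_demo")
dir.create(work.dir)

# Two invented participant files of 10 rows each
ids <- c(3, 12)
for (id in ids) {
  write.csv(data.frame(row = 1:10, value = id * (1:10)),
            file.path(work.dir, paste0(id, ".csv")), row.names = FALSE)
}

# Master table: one row per participant, full path plus start/end rows
mstf <- data.frame(id = ids,
                   fn = file.path(work.dir, paste0(ids, ".csv")),
                   srb = c(1, 2), erb = c(3, 4),
                   sra = c(4, 5), era = c(6, 7),
                   srf = c(7, 8), erf = c(10, 10),
                   stringsAsFactors = FALSE)

# Which master columns hold the start and end row of each segment
segments <- list(baseline = c("srb", "erb"),
                 audio    = c("sra", "era"),
                 free     = c("srf", "erf"))

for (irow in seq_len(nrow(mstf))) {
  idc <- formatC(mstf$id[irow], width = 2, flag = "0")   # zero-padded id for file names
  crnt.file <- read.csv(mstf$fn[irow])
  for (seg in names(segments)) {
    rows <- mstf[irow, segments[[seg]][1]]:mstf[irow, segments[[seg]][2]]
    write.csv(crnt.file[rows, ],
              file.path(work.dir, paste0(seg, idc, ".csv")),
              row.names = FALSE)
  }
}

out.files <- sort(list.files(work.dir, pattern = "^(baseline|audio|free)"))
free03 <- read.csv(file.path(work.dir, "free03.csv"))
print(out.files)
```

Keeping the segment definitions in a small list avoids repeating the subset-and-write code three times; the explicit three-block version in the thread does the same thing.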
[R] Summarizing data based on Date
Hi All,

I have a data set with 11000 rows and 19 columns. There are 2 columns on which I need to summarize the data: Date and Weight. A snapshot:

Date         CHG_WT
13/03/2015   0
31/03/2015   0
15/03/2015   0
17/03/2015   770
17/03/2015   3,730
11/3/2015    70
11/3/2015    10
19/03/2015   500

I need to summarize this data as a day-wise trend of weight. I have tried splitting and truncating the date and looked at multiple options on the web (the zoo package, ISO week, etc.), but I am not sure how to arrive at this analysis. Could you experts please suggest how to achieve this?

Thanks,
Shivi
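A day-wise total can be computed in base R once the dates are parsed and the thousands separators are removed. The sketch below is built on the snapshot in this message and rests on two assumptions that the post does not confirm: that dates are day/month/year, and that the dates and weights in the snapshot pair up by position.

```r
dat <- data.frame(
  Date = c("13/03/2015", "31/03/2015", "15/03/2015", "17/03/2015",
           "17/03/2015", "11/3/2015", "11/3/2015", "19/03/2015"),
  CHG_WT = c("0", "0", "0", "770", "3,730", "70", "10", "500"),
  stringsAsFactors = FALSE)

# Parse day/month/year text into Date objects (assumed format)
dat$Date <- as.Date(dat$Date, format = "%d/%m/%Y")

# "3,730" is text; drop the comma before converting to a number
dat$CHG_WT <- as.numeric(gsub(",", "", dat$CHG_WT))

# One row per day with the total weight for that day, sorted by date
daily <- aggregate(CHG_WT ~ Date, data = dat, FUN = sum)
print(daily)
```

Swapping FUN = sum for mean or length gives the daily average or the number of records per day; plot(daily$Date, daily$CHG_WT, type = "l") is one way to look at the trend.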
[R] Mean error message missing
Dear list,

I found an odd behavior of the mean function; it is allowed to do something that you probably shouldn't: if you calculate mean() of a sequence of numbers (without wrapping them in a vector), mean() just computes the mean of the first element. Is there a reason why there is no warning, like in sd for example?

Example code:
mean(1,2,3,4)
sd(1,2,3,4)

Best regards
Christian
[R] Text Mining - Remove punctuation not removing quotes and dashes
Hi,

I have been doing some text mining. I created the document-term matrix using the following steps:

corpus1 <- VCorpus(VectorSource(resume1$Dat1))
corpus1 <- tm_map(corpus1, content_transformer(tolower))
dtm <- DocumentTermMatrix(corpus1, control = list(removePunctuation = TRUE, removeNumbers = TRUE, removeSparseTerms = TRUE, stopwords = TRUE))

After running all of this I am still getting words like -quotation, fun, model , etc. What can I do about it? I do not need these dashes and extra quotation marks.

--
Anindya Sankar Dey
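One possible explanation, which the post does not confirm, is that the leftover characters are typographic (curly) quotes and en or em dashes that fall outside the ASCII punctuation set. Under that assumption, a cleaning function in base R can strip them explicitly before the matrix is built. The function name clean_text and the sample strings are hypothetical.

```r
# Hypothetical cleaner: replace typographic quotes/dashes and ASCII punctuation
# with spaces, collapse repeated whitespace, and lower-case the result.
clean_text <- function(x) {
  x <- gsub("[\u2018\u2019\u201C\u201D\u2013\u2014]", " ", x)  # curly quotes, en/em dashes
  x <- gsub("[[:punct:]]", " ", x)                             # ASCII punctuation
  x <- gsub("\\s+", " ", x)                                    # collapse whitespace
  trimws(tolower(x))
}

# Invented examples containing the problem characters
txt <- c("\u201Cquotation\u201D marks",
         "fun\u2014and games",
         "model \u2013 based",
         "it's well-known")
cleaned <- clean_text(txt)
print(cleaned)
```

Since the post already uses tm_map() with content_transformer(), the same function could presumably be applied to the corpus as tm_map(corpus1, content_transformer(clean_text)) before calling DocumentTermMatrix(); that step is untested here.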
Re: [R] Mean error message missing
On Mon, 8 Jun 2015, Christian Brandstätter wrote:

> Dear list,
>
> I found an odd behavior of the mean function; it is allowed to do something that you probably shouldn't: If you calculate mean() of a sequence of numbers (without declaring them as vector), mean() then just computes mean() of the first element. Is there a reason why there is no warning, like in sd for example?

mean() - unlike sd() - is a generic function that has a '...' argument that is passed on to its methods. The default method, which is called in your example, also has a '...' argument (because the generic has it) but doesn't use it.

> Example code:
> mean(1,2,3,4)
> sd(1,2,3,4)
>
> Best regards
> Christian
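The behaviour discussed in this thread is easy to reproduce. In the sketch below the comments describe how the extra arguments are matched by the default method, whose signature is mean.default(x, trim = 0, na.rm = FALSE, ...): the second and third values land in trim and na.rm, and only the fourth reaches '...'. The variable names are illustrative.

```r
# Only x = 1 is averaged: 2 is matched to trim, 3 to na.rm, and 4 is
# absorbed by '...' without any complaint
m1 <- mean(1, 2, 3, 4)

# Combining the numbers into one vector gives the intended result
m2 <- mean(c(1, 2, 3, 4))

# sd() has no '...' argument, so the extra values are rejected as unused arguments
sd.err <- tryCatch(sd(1, 2, 3, 4), error = function(e) e)

print(m1)
print(m2)
print(conditionMessage(sd.err))
```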
Re: [R] problems editing R console
Thanks to all of you ;)

I think I got it. I was saving the workspace and loading it, but after that I wasn't calling my data ;) Really naive.

Thanks very much.

Best, RO

Kind regards,
Rosa Oliveira
--
Rosa Celeste dos Santos Oliveira, E-mail: rosit...@gmail.com Tlm: +351 939355143 Linkedin: https://pt.linkedin.com/in/rosacsoliveira
"Many admire, few know" Hippocrates

On 08 Jun 2015, at 01:30, Mark Sharp msh...@txbiomed.org wrote:

Rosa, See the save() and load() functions for background. However, I suspect you will want to do something as described in the article at this link: http://www.fromthebottomoftheheap.net/2012/04/01/saving-and-loading-r-objects/

Mark
R. Mark Sharp, Ph.D. Director of Primate Records Database Southwest National Primate Research Center Texas Biomedical Research Institute P.O. Box 760549 San Antonio, TX 78245-0549 Telephone: (210)258-9476 e-mail: msh...@txbiomed.org

On Jun 7, 2015, at 5:58 PM, Rosa Oliveira rosit...@gmail.com wrote:

Dear Mark, I'll try to explain better. Imagine I write:

library(foreign)
library(nlme)
set.seed(1000)
n.sample <- 1 # sample size
M <- 5
DP_x <- 2
x <- rnorm(n.sample, M, DP_x)
p <- pnorm(-3 + x)
y <- rbinom(n.sample, 1, p)
dp_erro <- 0.01
erro <- rnorm(n.sample, 0, dp_erro)
x.erro <- x + erro

but inside a function, with 2000 simulations. I save my "output" and I get x.erro in a .txt file (a text edit file). I do another setting with DP_x = 3 and save, and so on. Then I realize I have done my simulation the wrong way and I have to apply a correction, for example x.erro = 1.4 x + erro. In principle I could reuse my first x and erro values in each setting, but as they are in a .txt file I can't use them any more. Is there a way to save the results in a format that lets me use the values, so that I can just apply my corrections and don't have to run the 2000 simulations for each setting again?

My problem is that the function I use takes 3 days to run, and that is for just 500 simulations :(

Best, RO

On 07 Jun 2015, at 23:03, Mark Sharp msh...@txbiomed.org wrote:

I cannot understand your request as stated. Can you provide a small example?

Mark

On Jun 7, 2015, at 2:49 PM, Rosa Oliveira rosit...@gmail.com wrote:

Dear all, I'm doing simulations in R, and as my code is being changed and improved I sometimes need to work on finished simulations, i.e., after my simulation is over I need to set up another setting. The problem is that I need to get back to the previous result. When I save the result it is saved as txt, so I can't edit that result any more. Imagine I run a setting and save the mean; then, in another setting, the mean has problems, so I have to ask for the median. As I have to have the same statistics for all settings, at present I have to run my first setting again.

My advisor told me that I could save in another way so that I can "edit" my first result. Is it possible? I tried to save my workspace, but afterwards I don't know what to do with it. Can you please help me? I know it is a naive question, but I have to go through this every 3 days (the time each simulation takes), and my work is being delayed :(

Best, RO
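A minimal sketch of the idea discussed in this thread, assuming the raw simulated draws (not only their summaries) are kept in a list and written with saveRDS(). The object names, the sample size and the temporary file are hypothetical; the correction 1.4 * x + erro is the one mentioned in the post.

```r
# Hypothetical stand-in for one simulation setting: keep the raw draws,
# not only the summary statistics.
set.seed(1000)
n_sample <- 100
x    <- rnorm(n_sample, mean = 5, sd = 2)
erro <- rnorm(n_sample, mean = 0, sd = 0.01)
sim  <- list(x = x, erro = erro, settings = list(M = 5, DP_x = 2))

# Save the whole object in R's binary format (a temporary file here).
path <- tempfile(fileext = ".rds")
saveRDS(sim, path)

# Later session: read it back and apply the correction without re-simulating.
old <- readRDS(path)
x_erro_corrected <- 1.4 * old$x + old$erro
print(summary(x_erro_corrected))
```

Unlike a .txt dump of printed output, the .rds file restores the values as R objects, so new statistics (a median instead of a mean, say) can be computed afterwards.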
Re: [R] source code for dbeta
Hi,

Thanks a lot. I downloaded the tar.gz file and I found the C code.

I would really appreciate it if you could field another question: I have to use SQL, and I have to perform various statistical calculations, like integrate, dbeta etc. SQL does not have these functions, plus they are very difficult to code. Would it be possible to use the C code, compile it and deploy it in SQL? Is that feasible, or even permitted?

Thanks once again, I'm very grateful.

On Mon, Jun 8, 2015 at 2:06 AM, Duncan Murdoch murdoch.dun...@gmail.com wrote:

On 07/06/2015 6:11 PM, Mark Sharp wrote:

Varun, If you type dbeta at the command line you get the R source, which in this case tells you that the code is calling a compiled source. This is indicated by the line <bytecode: 0x7fc3bb1b84e0>

No, that says that the R code (what is shown) is compiled. What indicates that this is C code is the use of .Call. The C_dbeta and C_dnbeta objects are NativeSymbolInfo objects that hold the pointers to the C entry points. Since it is in a base package (stats), the source is in the R sources, somewhere in https://svn.r-project.org/R/trunk/src/library/stats/src. You can search through those files for the dbeta or dnbeta functions. The C_ prefix is conventionally used in the R sources to indicate that it is C code; generally you replace it with do_ in the actual C code.

This particular function is actually not really in the package source; it's in the main part of the R sources, in file https://svn.r-project.org/R/trunk/src/nmath/dbeta.c (though it takes a few steps to get there, starting in the stats package function do_dbeta).

Duncan Murdoch

See the following.

dbeta
function (x, shape1, shape2, ncp = 0, log = FALSE)
{
    if (missing(ncp)) .Call(C_dbeta, x, shape1, shape2, log)
    else .Call(C_dnbeta, x, shape1, shape2, ncp, log)
}
<bytecode: 0x7fc3bb1b84e0>
<environment: namespace:stats>

Compiled code in a package

If you want to view compiled code in a package, you will need to download/unpack the package source. The installed binaries are not sufficient. A package's source code is available from the same CRAN (or CRAN compatible) repository that the package was originally installed from. The download.packages() function can get the package source for you.

Extracted from http://stackoverflow.com/questions/19226816/how-can-i-view-the-source-code-for-a-function

Mark
R. Mark Sharp, Ph.D. msh...@txbiomed.org

On Jun 7, 2015, at 4:31 AM, Varun Sinha sinha.varun...@gmail.com wrote:

Hi, I am trying to find the source code for the dbeta function. I tried edit(dbeta) and this is what I got:

edit(dbeta)
function (x, shape1, shape2, ncp = 0, log = FALSE)
{
    if (missing(ncp)) .Call(C_dbeta, x, shape1, shape2, log)
    else .Call(C_dnbeta, x, shape1, shape2, ncp, log)
}
<environment: namespace:stats>

It looks like it is calling C_dbeta, but I'm not sure. If it does, how do I find its source code?

Thank you!
Varun
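Duncan's point, that the use of .Call (and not the bytecode line) is what marks the hand-off to C, can be checked programmatically. The following is only a small illustration; the variable names are made up.

```r
# Deparse the R-level definition of dbeta into lines of source text.
src <- deparse(stats::dbeta)

# A literal search for ".Call(" shows that the R wrapper hands off to C.
uses_dot_call <- any(grepl(".Call(", src, fixed = TRUE))

print(src)
print(uses_dot_call)
```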
Re: [R] Summarizing data based on Date
Hi,

Is your Date really a Date, or is it character? What is the result of str(Date)?

If you want to get summaries for dates you can use ?aggregate. However, in this case I strongly recommend showing us your data with dput(yourdata) and explaining, using that example, what summary you want.

I may be completely wrong, but maybe

aggregate(CHG_WT, list(format(Date, "%d")), sum)

can get you the required values.

Cheers
Petr

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Shivi82
Sent: Monday, June 08, 2015 10:08 AM
To: r-help@r-project.org
Subject: [R] Summarizing data based on Date

Hi All, I have a data set with 11000 rows and 19 columns. I have 2 columns on which I need to summarize the data: Date and Weight. A snapshot is:

Date        CHG_WT
13/03/2015  0
31/03/2015  0
15/03/2015  0
17/03/2015  770
17/03/2015  3,730
11/3/2015   70
11/3/2015   10
19/03/2015  500

Now I need to summarize this data as a day-wise trend of weight. I have tried splitting and truncating the date and saw multiple options on the web (the zoo package, ISO week, etc.), but I am not sure how to get to this analysis. Could you experts please suggest how to achieve this?

Thanks, Shivi
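A sketch of the day-wise summary asked for in this thread, using the snapshot from the post. It assumes both columns arrive as text (dates as day/month/year, weights with a comma as thousands separator); the data frame name `test` and the result name `daily` are illustrative.

```r
# The snapshot from the post, typed in as character columns (an assumption
# about how the real file is read).
test <- data.frame(
  Date   = c("13/03/2015", "31/03/2015", "15/03/2015", "17/03/2015",
             "17/03/2015", "11/3/2015", "11/3/2015", "19/03/2015"),
  CHG_WT = c("0", "0", "0", "770", "3,730", "70", "10", "500"),
  stringsAsFactors = FALSE
)

# Parse day/month/year dates and strip the thousands separator.
test$Date   <- as.Date(test$Date, format = "%d/%m/%Y")
test$CHG_WT <- as.numeric(gsub(",", "", test$CHG_WT))

# Total weight per calendar day.
daily <- aggregate(CHG_WT ~ Date, data = test, FUN = sum)
print(daily)
```

Grouping on the parsed Date rather than on format(Date, "%d") keeps days from different months apart, which matters once the data span more than one month.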
[R] R to HTML problem
Dear Jim Lemon, Thank you very much Jim for your help. Regards, Pijush
Re: [R] mismatch between match and unique causing ecdf (well, approxfun) to fail
Ahem, adding to this: I incorrectly *assumed*, without testing, that rounding would help; it doesn't:

ecdf(round(test2, 0)) # a rounding that is way too rough for my application...
# Error in xy.coords(x, y) : 'x' and 'y' lengths differ

Digging deeper: the initially mentioned call to unique() is not very helpful, as test2 is a data frame, so I get what I deserve: an unchanged data frame with 1 row. Still, the issue remains and can even be simplified further:

ecdf(data.frame(a = 3, b = 4))
Empirical CDF
Call: ecdf(data.frame(a = 3, b = 4))
 x[1:2] = 3, 4

works OK, but

ecdf(data.frame(a = 3, b = 3))
Error in xy.coords(x, y) : 'x' and 'y' lengths differ

doesn't (same for a = b = 1 or 2, so likely the same for any a = b). Instead,

ecdf(c(a = 3, b = 3))
Empirical CDF
Call: ecdf(c(a = 3, b = 3))
 x[1:1] = 3

does the trick. From ?ecdf, I gather that x should be a numeric vector, so apparently the problem is my misuse of the function by applying it to a row of a data frame (i.e. a data frame with one row). In all my other (dozens of) cases that worked OK, though, just not in this particular one. A simple unlist() helps:

ecdf(unlist(data.frame(a = 3, b = 3)))
Empirical CDF
Call: ecdf(unlist(data.frame(a = 3, b = 3)))
 x[1:1] = 3

Yet I'm even more confused than before: in my other data there were also duplicated values in the vector (1-row data frame), and it never caused any issue. For this particular example, it does. I must be missing something fundamental...

Michael
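The difference between the data frame and the vector case can be made visible with a short sketch. It only uses the two-column example from the message; the variable names are illustrative, and the failing ecdf(data.frame(...)) call is deliberately left out because its exact behaviour may depend on the R version.

```r
# One-row data frame with two equal values, as in the post.
df <- data.frame(a = 3, b = 3)

# On a data frame, unique() removes duplicated *rows*, so both columns stay.
u_df <- unique(df)

# Flattened to a numeric vector, the duplicate value is removed as expected.
v   <- unlist(df)
u_v <- unique(v)

# ecdf() on the vector: one distinct knot, jumping from 0 to 1 at x = 3.
Fn <- ecdf(v)
print(dim(u_df))
print(u_v)
print(knots(Fn))
print(Fn(c(2.9, 3)))
```

So unique() keeps two "values" (columns) for the data frame while match() maps both to the first, which is the length mismatch reported by xy.coords; on the unlisted vector both functions agree.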
Re: [R] Summarizing data based on Date
Hi Petr,

Thanks for the explanation below. I tried the code you supplied; however, it seems that my date is a factor, hence it is not working. The error I got from the code was:

Error: unexpected symbol in: "final <- aggregate(test$CHG_WT, list(format(test$CR_DT, "%d"), sum) final"

str(test$CR_DT) gives: Factor with 31 levels
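Two separate things appear to be going on in this message: the call as typed has an unbalanced parenthesis (list( is never closed, which would explain "unexpected symbol"), and the date column is a factor. A hedged sketch of both fixes follows, with a small made-up data frame standing in for `test`.

```r
# Hypothetical factor column standing in for test$CR_DT.
test <- data.frame(
  CR_DT  = factor(c("13/03/2015", "17/03/2015", "17/03/2015", "11/3/2015")),
  CHG_WT = c(0, 770, 3730, 70)
)

# Convert the factor labels to Date; going through as.character() makes it
# explicit that the labels, not the internal codes, are parsed.
test$CR_DT <- as.Date(as.character(test$CR_DT), format = "%d/%m/%Y")

# Balanced parentheses: list(...) closes before the summary function is given.
final <- aggregate(test$CHG_WT, list(day = format(test$CR_DT, "%d")), sum)
print(final)
```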
[R] mismatch between match and unique causing ecdf (well, approxfun) to fail
All,

I encountered the following issue with ecdf, which was originally on a vector of length 10,000, but I have been able to reduce it to a minimal reproducible example (just to avoid questions about why I'd want to do this for a vector of length 2...):

test2 = structure(list(X817 = 3.39824670255344, X4789 = 3.39824670255344), .Names = c("X817", "X4789"), row.names = 74L, class = "data.frame")
ecdf(test2)
# Error in xy.coords(x, y) : 'x' and 'y' lengths differ

In an attempt to track this down, it turns out that

unique(test2)
#        X817    X4789
# 74 3.398247 3.398247

while

match(test2, unique(test2))
# [1] 1 1

matches both values to the first one. This causes a hiccup in the call to ecdf, as this uses (an equivalent to) a call to approxfun with x = test2 and y = cumsum(tabulate(match(test2, unique(test2)))), the latter now containing one entry less than the former, so xy.coords fails.

I understand that the issue should be somehow related to FAQ 7.31, but I would have hoped that unique and match would use the same precision, so that both or neither would consider the two values identical, rather than match doing so while unique doesn't.

Last but not least, it doesn't really cause an issue on my end (other than breaking my code and hence dropping out of a loop in the first place...); rounding will help without noteworthy changes to the outcome, so no need to propose a workaround :-) I'd rather like to raise the issue and learn whether there is a purpose for this behavior, and/or whether there is a generic fix for it, or whether I am completely missing something.

Version info (under Windows 7): R version 3.2.0 (2015-04-16) -- Full of Ingredients
Platform: x86_64-w64-mingw32/x64 (64-bit)

Cheers, Michael
Re: [R-es] usar Selenium para web scraping
Hello,

I don't know whether this Stack Overflow answer can help you: http://stackoverflow.com/questions/26938118/check-for-dialog-box-in-rselenium

Regards,
Carlos Ortega www.qualityexcellence.es

On 8 June 2015, 9:09, José Luis Cañadas Reche canadasre...@gmail.com wrote:

Thanks, Javier and Carlos. The thing is that relenium gives me an error when starting Firefox and closes it. On the package's GitHub page, https://github.com/LluisRamon/relenium, they say it is being discontinued because of the appearance of another package, RSelenium. And this is where I get lost: I have not worked out how to access the values of a combo box using RSelenium.

Regards.

On 05/06/15 at 15:40, javier.ruben.marcu...@gmail.com wrote:

Dear José Luis Cañadas,

Personally, the work by Gregorio that Carlos cites helped me a lot. The only thing is that RSelenium behaves somewhat strangely. My problem is on two fronts. The first is about examples that no longer work (something changed). The important one concerns my own work: after hours of web scraping it gives an error for some reason. It has to do with looping over all the options of a combo box (about 200); halfway through, it reports an error related to finding the HTML id it has to traverse, even though it has already traversed it several times. I was not able to solve this error. When there are no HTML forms to fill in, rvest is usually faster.

Javier Rubén Marcuzzi
Dairy Industry Technician, Veterinarian

From: Carlos Ortega c...@qualityexcellence.es
Sent: Friday, 5 June 2015 08:49 a.m.
To: jose luis cañadas canadasre...@gmail.com
CC: R-help-es@r-project.org

Hello José Luis,

In addition to what he put on his blog, Gregorio gave a very clear presentation on how to use RSelenium at the Madrid R user group. The video of his talk is this one: https://vimeo.com/96023824 In case you find the key there.

Regards,
Carlos Ortega www.qualityexcellence.es

On 5 June 2015, 13:28, José Luis Cañadas Reche canadasre...@gmail.com wrote:

Hello. I have to download several tables from the INE and I need to interact with the browser. I saw the fantastic post written by Gregorio Serrano (may the earth rest lightly on him) at http://www.grserrano.net/wp/2014/01/relenium-el-siguiente-nivel-de-web-scraping-con-r/ and I am trying to reproduce it to learn how relenium works. But relenium gives me an error after

if (!require(relenium)) install.packages("relenium")
precios <- "http://www.ine.es/jaxi/tabla.do?path=/t38/bme2/t07/a081/l0/file=1300010.pxtype=pcaxisL=0"
firefox <- firefoxClass$new()
Error in exceptionTable[, 1] : subíndice fuera de los límites (subscript out of bounds)

So I started tinkering with RSelenium. I manage to select the combo box element, but I don't know how to get the values it shows or how to select them. Any ideas?

library(RSelenium)
checkForServer()
startServer()
remDr <- remoteDriver(remoteServerAddr = "localhost", port = , browserName = "firefox")
remDr$open()
remDr$navigate(precios)
# find by id
webElem1 <- remDr$findElement(using = 'id', value = 'cri1')
Re: [R] Mean error message missing
Thank you for the explanation. But if you take for instance plot.default(), being another generic function, it would not work like that: plot(1,2,3,4) is rejected; only plot(1,2) is accepted.

From the R help page (Usage):

## Default S3 method:
mean(x, trim = 0, na.rm = FALSE, ...)

What is puzzling is that na.rm (and trim, as the help indicates) apparently accepts numeric values.

mean(c(1,NA,10),10,TRUE)
mean(c(1,NA,10),10,FALSE)

This should give at least a warning in my opinion.

mean(c(1,NA,10),10,200)

On 08/06/2015 09:27, Achim Zeileis wrote:

On Mon, 8 Jun 2015, Christian Brandstätter wrote:

Dear list, I found an odd behavior of the mean function; it is allowed to do something that you probably shouldn't: If you calculate mean() of a sequence of numbers (without declaring them as vector), mean() then just computes mean() of the first element. Is there a reason why there is no warning, like in sd for example?

mean(), unlike sd(), is a generic function that has a '...' argument that is passed on to its methods. The default method which is called in your example also has a '...' argument (because the generic has it) but doesn't use it.

Example code:

mean(1,2,3,4)
sd(1,2,3,4)

Best regards
Christian
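What the positional calls above actually do can be traced with named arguments. This sketch uses a slightly longer vector than the one in the post so that the median and the mean differ; it relies on the documented behaviour that a trim of 0.5 or more makes mean.default() return the median.

```r
x <- c(1, NA, 10, 100)

# Positional call in the style of the post: 10 is matched to 'trim',
# TRUE to 'na.rm'. With trim >= 0.5 the median of the non-missing values
# is returned.
by_position <- mean(x, 10, TRUE)

# The same call with the arguments spelled out.
by_name <- mean(x, trim = 10, na.rm = TRUE)

# What was probably intended: the plain mean with missing values removed.
intended <- mean(x, na.rm = TRUE)

print(c(by_position = by_position, by_name = by_name, intended = intended))
```

Naming trim and na.rm explicitly is a cheap guard against this kind of silent positional matching.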
[R] Cannot Sum with DDPLY
Hi All,

Kindly see the code I have used below:

maxorder <- ddply(test, ~ ORIGIN, summarize, Weight = sum(CHG_WT))

Here I have written the code to summarize values based on origin and total weight; however, I am getting the error below:

Error: 'sum' not meaningful for factors

Please advise. I need the CHG_WT total for each state in the ORIGIN column.
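The error message says that CHG_WT is a factor, most likely because entries such as "3,730" (seen in the earlier snapshot) prevent it from being read as a number. A sketch of one way around this in base R follows; the ORIGIN codes and values are made up, and aggregate() is used here instead of ddply() so that the example does not depend on plyr.

```r
# Hypothetical data: CHG_WT arrives as a factor because of the "3,730" entry.
test <- data.frame(
  ORIGIN = c("DL", "DL", "MH", "MH", "KA"),
  CHG_WT = factor(c("770", "3,730", "70", "10", "500"))
)

# Go through as.character() (as.numeric() applied directly to a factor would
# return the internal level codes) and drop the thousands separator.
test$CHG_WT <- as.numeric(gsub(",", "", as.character(test$CHG_WT)))

# Total weight per origin.
maxorder <- aggregate(CHG_WT ~ ORIGIN, data = test, FUN = sum)
names(maxorder)[2] <- "Weight"
print(maxorder)
```

Once the column is numeric, the ddply() call from the post should no longer hit the "not meaningful for factors" error.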
Re: [R] source code for dbeta
On 07/06/2015 11:05 PM, Varun Sinha wrote:

Hi, Thanks a lot. I downloaded the tar.gz file and I found the C code. I would really appreciate it if you could field another question: I have to use SQL, and I have to perform various statistical calculations, like integrate, dbeta etc. SQL does not have these functions, plus they are very difficult to code. Would it be possible to use the C code, compile it and deploy it in SQL? Is that feasible, or even permitted?

It is permitted without conditions for local use only. If you want to deploy the application, your application must be licensed under the GPL.

As to the practicality: see the Writing R Extensions manual, section 6.16, which describes how to link many math functions (I forget if dbeta is included) into your own C code. Linking those to your database system will strongly depend on which database system you're using, and I think for all of them the question would be off topic here. You need to ask on their help forum.

Duncan Murdoch

Thanks once again, I'm very grateful.
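As a side note to the question of how hard these functions are to code: for the central case (ncp = 0) the beta density has a closed form that needs only a log-gamma function, f(x) = x^(a-1) (1-x)^(b-1) / B(a, b). The sketch below is not what Duncan proposes and not R's own C implementation; `dbeta_manual` is a hypothetical helper, valid for 0 < x < 1, compared against the built-in dbeta().

```r
# Central (ncp = 0) beta density written out from its closed form, on the log
# scale for numerical stability. 'dbeta_manual' is a hypothetical helper name.
dbeta_manual <- function(x, shape1, shape2) {
  log_beta <- lgamma(shape1) + lgamma(shape2) - lgamma(shape1 + shape2)
  exp((shape1 - 1) * log(x) + (shape2 - 1) * log1p(-x) - log_beta)
}

x <- c(0.1, 0.25, 0.5, 0.9)
print(cbind(x, manual = dbeta_manual(x, 2, 3), builtin = dbeta(x, 2, 3)))
```

Whether a given SQL dialect offers a log-gamma function is a separate question for that database's documentation.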
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline
David,

I appreciate your suggestion, but it won't work for me. I need the space to be replaced at the time the data are read, not afterward. My variable names have periods I want to keep; if I use your suggestion I will replace the periods inserted when the data are read as well as the periods that I want to keep.

Thank you,
John

John David Sorkin M.D., Ph.D. Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing)

David L Carlson dcarl...@tamu.edu 06/08/15 10:21 AM

You can use gsub() to change the names:

dat <- data.frame("Var 1" = rnorm(5, 10), "Var 2" = rnorm(5, 15))
dat
      Var.1    Var.2
1  9.627122 14.15376
2 10.741617 16.92937
3  8.492926 15.23767
4 12.226146 15.19834
5  8.829982 14.46957
names(dat) <- gsub("\\.", "_", names(dat))
dat
      Var_1    Var_2
1  9.627122 14.15376
2 10.741617 16.92937
3  8.492926 15.23767
4 12.226146 15.19834
5  8.829982 14.46957

-
David L Carlson Department of Anthropology Texas A&M University College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
Sent: Monday, June 8, 2015 9:16 AM
Cc: r-help@r-project.org
Subject: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

I am reading a csv file. The column headers have spaces in them. The spaces are replaced by a period. I want to replace the space by another character (e.g. the underline) rather than the period. Can someone tell me how to accomplish this?

Thank you,
John
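One way to meet John's requirement (replace spaces but leave existing periods alone) is to stop read.csv() from rewriting the header in the first place, via its check.names argument, and then replace only the spaces. This is a sketch rather than an answer given in the thread; the file contents and column names are made up.

```r
# Small CSV written to a temporary file; header names contain both spaces and
# periods (hypothetical column names).
path <- tempfile(fileext = ".csv")
writeLines(c("Var 1,Dose.mg,Body Weight.kg", "1,10,70", "2,20,80"), path)

# check.names = FALSE keeps the header text exactly as it is in the file.
dat <- read.csv(path, check.names = FALSE)

# Replace only the spaces; periods that were already in the names survive.
names(dat) <- gsub(" ", "_", names(dat), fixed = TRUE)
print(names(dat))
```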
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline
John,

I like using stringr or stringi for this type of thing. stringi is written in C and is faster, so I now typically use it. You can also use base functions. The main trick is the handy names() function.

    example <- data.frame("Col 1 A" = 1:3, "Col 1 B" = letters[1:3])
    example
      Col.1.A Col.1.B
    1       1       a
    2       2       b
    3       3       c
    library(stringi)
    names(example) <- stri_replace_all_fixed(names(example), ".", "_")
    example
      Col_1_A Col_1_B
    1       1       a
    2       2       b
    3       3       c

R. Mark Sharp, Ph.D.
Director of Primate Records Database
Southwest National Primate Research Center
Texas Biomedical Research Institute
P.O. Box 760549
San Antonio, TX 78245-0549
Telephone: (210)258-9476
e-mail: msh...@txbiomed.org

On Jun 8, 2015, at 9:15 AM, John Sorkin jsor...@grecc.umaryland.edu wrote:
> I am reading a csv file. The column headers have spaces in them. The spaces are replaced by a period. I want to replace the space by another character (e.g. the underline) rather than the period. Can someone tell me how to accomplish this?
> Thank you,
> John
Re: [R-es] Can a column of a data.table be a data.frame?
Hello,

Yes, I know that [] is used to work with a data.table, but in this line I am creating the data.table; that is, data.table is the function I am calling and PesosParam is the data.table I am creating.

Regards,
MªLuz Morales

On 8 June 2015 at 15:17, Carlos J. Gil Bellosta c...@datanalytics.com wrote:
> Hello, how are you?
>
> data.table works with square brackets, not parentheses. Have you read the vignette/tutorial?
>
> Regards,
> Carlos J. Gil Bellosta
> http://www.datanaytics.com
>
> On 8 June 2015 at 15:14, MªLuz Morales mlzm...@gmail.com wrote:
>> Hello, I want to build a data.table in which one column (Parametros) holds character strings and another holds the result of the function information.gain, which returns a data.frame. This is the code I used, but it gives an error:
>>
>>     PesosParam <- data.table(,.(Parametros, Peso:= information.gain(In.hospital_death~., ParamCol)))
>>
>> Is what I describe possible, or must I convert the data.frame to a data.table explicitly? I have also tried that with this code:
>>
>>     # Conversion from data.frame to data.table
>>     setattr(PesosParam, "class", c("data.table", "data.frame"))
>>     data.table:::settruelength(PesosParam, 0L)
>>     invisible(alloc.col(PesosParam))
>>
>> but settruelength is not found.
>>
>> Thanks. Regards,
>> MªLuz Morales
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline
You can use gsub() to change the names:

    dat <- data.frame("Var 1"=rnorm(5, 10), "Var 2"=rnorm(5, 15))
    dat
          Var.1    Var.2
    1  9.627122 14.15376
    2 10.741617 16.92937
    3  8.492926 15.23767
    4 12.226146 15.19834
    5  8.829982 14.46957
    names(dat) <- gsub("\\.", "_", names(dat))
    dat
          Var_1    Var_2
    1  9.627122 14.15376
    2 10.741617 16.92937
    3  8.492926 15.23767
    4 12.226146 15.19834
    5  8.829982 14.46957

David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352

-----Original Message-----
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
Sent: Monday, June 8, 2015 9:16 AM
Cc: r-help@r-project.org
Subject: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

> I am reading a csv file. The column headers have spaces in them. The spaces are replaced by a period. I want to replace the space by another character (e.g. the underline) rather than the period. Can someone tell me how to accomplish this?
> Thank you,
> John
Re: [R] Summarizing data based on Date
Hi

> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Shivi82
> Sent: Monday, June 08, 2015 11:23 AM
> To: r-help@r-project.org
> Subject: Re: [R] Summarizing data based on Date
>
> Hi Petr,
> Thanks for the explanation below. I tried the code you supplied; however, it seems that my date is a factor, hence it is not working.

So you need to change your factor to a date. See ?as.Date

    as.Date(as.factor(Sys.Date()))

Cheers
Petr

> The error I got from the code was:
> Error: unexpected symbol in:
> "final<-aggregate(test$CHG_WT,list(format(test$CR_DT,"%d"),sum)
> final"
> str(test$CR_DT) gives: Factor with 31 levels
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Summarizing-data-based-on-Date-tp4708328p4708333.html
> Sent from the R help mailing list archive at Nabble.com.
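A minimal sketch of the fix Petr points to, assuming CR_DT holds dates stored as a factor in year-month-day form. The sample values and the format string are illustrative assumptions; the real `test` data frame is not shown in the thread.

```r
# Hypothetical sample data standing in for the poster's 'test' data frame.
test <- data.frame(
  CR_DT  = factor(c("2015-06-01", "2015-06-01", "2015-06-02")),
  CHG_WT = c(10, 5, 7)
)

# Convert the factor to Date through its character labels.
test$CR_DT <- as.Date(as.character(test$CR_DT), format = "%Y-%m-%d")

# Sum CHG_WT by day of month; note the quoted "%d" and the balanced parentheses,
# whose absence caused the "unexpected symbol" error quoted above.
final <- aggregate(test$CHG_WT, list(day = format(test$CR_DT, "%d")), sum)
print(final)
```

If the dates are stored in another layout, only the `format` argument of as.Date() should need to change.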
[R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline
I am reading a csv file. The column headers have spaces in them. The spaces are replaced by a period. I want to replace the space by another character (e.g. the underline) rather than the period. Can someone tell me how to accomplish this?

Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing)
Re: [R] Cannot Sum with DDPLY
Hi

> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Shivi82
> Sent: Monday, June 08, 2015 12:48 PM
> To: r-help@r-project.org
> Subject: [R] Cannot Sum with DDPLY
>
> Hi All,
> Kindly see the below code I have used:
>
>     maxorder<-ddply(test, ~ ORIGIN,summarize,Weight=sum(CHG_WT))
>
> Here I have written the code to summarize values based on origin and total weight; however, I am getting the error below:
> Error: 'sum' not meaningful for factors

I consider this error message ***extremely*** informative. Your CHG_WT is actually a factor object, and sum is not meaningful for factors. Maybe it is time for you to check the R-Intro document about objects and their differences before you continue posting.

If CHG_WT should be numeric, you probably did not read it in correctly for whatever reason.

Cheers
Petr

> Please advise. I need the CHG_WT total for each state in the Origin column.
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Cannot-Sum-with-DDPLY-tp4708338.html
> Sent from the R help mailing list archive at Nabble.com.
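A minimal sketch of one way around the error, assuming CHG_WT was read as a factor whose labels are numbers. The data are hypothetical, and base R's aggregate() is used here in place of plyr's ddply() so that the example has no package dependency.

```r
# Hypothetical data: CHG_WT arrived as a factor instead of a numeric column.
test <- data.frame(
  ORIGIN = c("MH", "MH", "DL"),
  CHG_WT = factor(c("10.5", "4.5", "7")),
  stringsAsFactors = FALSE
)

# as.numeric() on a factor returns the internal level codes,
# so convert through character to recover the real values.
test$CHG_WT <- as.numeric(as.character(test$CHG_WT))

# Total CHG_WT for each ORIGIN.
maxorder <- aggregate(CHG_WT ~ ORIGIN, data = test, FUN = sum)
print(maxorder)
```

Once the column is numeric, the original ddply() call should work unchanged; checking str(test) after import is a quick way to spot columns that were read as factors.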
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline
Easiest? Use sub() to replace the periods after the fact.

You can also use the check.names or the col.names arguments to read.table() to customize your import.

Sarah

On Mon, Jun 8, 2015 at 10:15 AM, John Sorkin jsor...@grecc.umaryland.edu wrote:
> I am reading a csv file. The column headers have spaces in them. The spaces are replaced by a period. I want to replace the space by another character (e.g. the underline) rather than the period. Can someone tell me how to accomplish this?
> Thank you,
> John

--
Sarah Goslee
http://www.functionaldiversity.org
Re: [R] Mean error message missing
On 08/06/2015 6:04 AM, Christian Brandstätter wrote:
> Thank you for the explanation.
> But if you take for instance plot.default(), being another generic function, it would not work like that: plot(1,2,3,4), only plot(1,2) is accepted.
>
> From R-help (Usage):
>
>     ## Default S3 method:
>     mean(x, trim = 0, na.rm = FALSE, ...)
>
> What is puzzling is that apparently na.rm (and trim, which is indicated in the help) accepts numeric values.
>
>     mean(c(1,NA,10),10,TRUE)
>     mean(c(1,NA,10),10,FALSE)
>
> This should give at least a warning in my opinion.

It is a common idiom in R programming to treat non-zero values as TRUE, and zero as FALSE. If every use of a number where a logical is needed generated a warning, you'd be swamped with them.

Duncan Murdoch

>     mean(c(1,NA,10),10,200)
>
> On 08/06/2015 09:27, Achim Zeileis wrote:
>> On Mon, 8 Jun 2015, Christian Brandstätter wrote:
>>> Dear list,
>>>
>>> I found an odd behavior of the mean function; it is allowed to do something that you probably shouldn't: if you calculate mean() of a sequence of numbers (without declaring them as a vector), mean() then just computes mean() of the first element. Is there a reason why there is no warning, like in sd for example?
>>
>> mean() - unlike sd() - is a generic function that has a '...' argument that is passed on to its methods. The default method which is called in your example also has a '...' argument (because the generic has it) but doesn't use it.
>>
>>> Example code:
>>>
>>>     mean(1,2,3,4)
>>>     sd(1,2,3,4)
>>>
>>> Best regards
>>> Christian
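The argument matching discussed in this thread can be checked directly. The sketch below uses only base R; the variable names are illustrative.

```r
# Positional matching against mean.default(x, trim = 0, na.rm = FALSE, ...):
# x = 1, trim = 2, na.rm = 3, and 4 is swallowed by '...'.
only_first <- mean(1, 2, 3, 4)       # 1

# Wrapping the numbers in c() passes them all as x.
whole_vec <- mean(c(1, 2, 3, 4))     # 2.5

# na.rm = FALSE with a positive trim and an NA in x gives NA.
with_na <- mean(c(1, NA, 10), 10, FALSE)

# na.rm = 200 is treated as TRUE; trim >= 0.5 returns the median of c(1, 10).
numeric_flag <- mean(c(1, NA, 10), 10, 200)   # 5.5

# sd() has no '...' argument, so the extra values are an error.
sd_result <- tryCatch(sd(1, 2, 3, 4), error = function(e) e)

print(list(only_first, whole_vec, with_na, numeric_flag, conditionMessage(sd_result)))
```

Naming the arguments (for example `na.rm = TRUE`) avoids relying on positional matching altogether.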
[R-es] Can a column of a data.table be a data.frame?
Hello,

I want to build a data.table in which one column (Parametros) holds character strings and another holds the result of the function information.gain, which returns a data.frame. This is the code I used, but it gives an error:

    PesosParam <- data.table(,.(Parametros, Peso:= information.gain(In.hospital_death~., ParamCol)))

Is what I describe possible, or must I convert the data.frame to a data.table explicitly? I have also tried that with this code:

    # Conversion from data.frame to data.table
    setattr(PesosParam, "class", c("data.table", "data.frame"))
    data.table:::settruelength(PesosParam, 0L)
    invisible(alloc.col(PesosParam))

but settruelength is not found.

Thanks. Regards,
MªLuz Morales
Re: [R] Mean error message missing
Thank you very much, I didn't know that.

> On 08/06/2015 6:04 AM, Christian Brandstätter wrote:
>> Thank you for the explanation.
>> But if you take for instance plot.default(), being another generic function, it would not work like that: plot(1,2,3,4), only plot(1,2) is accepted.
>>
>> From R-help (Usage):
>>
>>     ## Default S3 method:
>>     mean(x, trim = 0, na.rm = FALSE, ...)
>>
>> What is puzzling is that apparently na.rm (and trim, which is indicated in the help) accepts numeric values.
>>
>>     mean(c(1,NA,10),10,TRUE)
>>     mean(c(1,NA,10),10,FALSE)
>>
>> This should give at least a warning in my opinion.
>
> It is a common idiom in R programming to treat non-zero values as TRUE, and zero as FALSE. If every use of a number where a logical is needed generated a warning, you'd be swamped with them.
>
> Duncan Murdoch
>
>>     mean(c(1,NA,10),10,200)
>>
>> On 08/06/2015 09:27, Achim Zeileis wrote:
>>> On Mon, 8 Jun 2015, Christian Brandstätter wrote:
>>>> Dear list,
>>>>
>>>> I found an odd behavior of the mean function; it is allowed to do something that you probably shouldn't: if you calculate mean() of a sequence of numbers (without declaring them as a vector), mean() then just computes mean() of the first element. Is there a reason why there is no warning, like in sd for example?
>>>
>>> mean() - unlike sd() - is a generic function that has a '...' argument that is passed on to its methods. The default method which is called in your example also has a '...' argument (because the generic has it) but doesn't use it.
>>>
>>>> Example code:
>>>>
>>>>     mean(1,2,3,4)
>>>>     sd(1,2,3,4)
>>>>
>>>> Best regards
>>>> Christian
Re: [R-es] Can a column of a data.table be a data.frame?
Hello, how are you?

data.table works with square brackets, not parentheses. Have you read the vignette/tutorial?

Regards,
Carlos J. Gil Bellosta
http://www.datanaytics.com

On 8 June 2015 at 15:14, MªLuz Morales mlzm...@gmail.com wrote:
> Hello, I want to build a data.table in which one column (Parametros) holds character strings and another holds the result of the function information.gain, which returns a data.frame. This is the code I used, but it gives an error:
>
>     PesosParam <- data.table(,.(Parametros, Peso:= information.gain(In.hospital_death~., ParamCol)))
>
> Is what I describe possible, or must I convert the data.frame to a data.table explicitly? I have also tried that with this code:
>
>     # Conversion from data.frame to data.table
>     setattr(PesosParam, "class", c("data.table", "data.frame"))
>     data.table:::settruelength(PesosParam, 0L)
>     invisible(alloc.col(PesosParam))
>
> but settruelength is not found.
>
> Thanks. Regards,
> MªLuz Morales
Re: [R-es] Can a column of a data.table be a data.frame?
Solved.

It was indeed a notation problem. This works:

    PesosParam <- data.table(Param = Parametros, Peso = information.gain(In.hospital_death~., ParamCol))

Note: Parametros is a character vector.

Thanks. Regards,
MªLuz

On 8 June 2015 at 16:08, Carlos Ortega c...@qualityexcellence.es wrote:
> Hello Mª Luz,
>
> When you create the data.table, you do it the same way as a data.frame. Look at the example in the help for data.table():
>
>     DF = data.frame(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
>     DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
>
> When you create the data.table, the .() option you used does not apply. It applies once you already have the data.table and want to create a new variable.
>
> So the code to create the data.table would be something like this:
>
>     PesosParam <- data.table(Parametros, Peso= information.gain(In.hospital_death~., ParamCol))
>
> If the above still gives you an error, it is because you have to say explicitly where Parametros, In.hospital_death and ParamCol come from. So you may have to define these variables beforehand:
>
>     Parametros <- my_data_table$Parametros
>
> And for information.gain(), I understand that besides the variables you will have to indicate the data.frame (the data) where those variables are...
>
> Regards,
> Carlos Ortega
> www.qualityexcellence.es
>
> On 8 June 2015 at 15:14, MªLuz Morales mlzm...@gmail.com wrote:
>> Hello, I want to build a data.table in which one column (Parametros) holds character strings and another holds the result of the function information.gain, which returns a data.frame. This is the code I used, but it gives an error:
>>
>>     PesosParam <- data.table(,.(Parametros, Peso:= information.gain(In.hospital_death~., ParamCol)))
>>
>> Is what I describe possible, or must I convert the data.frame to a data.table explicitly? I have also tried that with this code:
>>
>>     # Conversion from data.frame to data.table
>>     setattr(PesosParam, "class", c("data.table", "data.frame"))
>>     data.table:::settruelength(PesosParam, 0L)
>>     invisible(alloc.col(PesosParam))
>>
>> but settruelength is not found.
>>
>> Thanks. Regards,
>> MªLuz Morales
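A minimal sketch of the distinction Carlos Ortega draws between creating a data.table and adding a column to an existing one. It assumes the data.table package is installed. The parameter names, the weights, and the `attr_importance` column are hypothetical stand-ins for the real Parametros vector and the information.gain() result, neither of which is shown in the thread.

```r
library(data.table)

# Hypothetical stand-ins for the poster's objects.
Parametros <- c("Age", "HR", "Temp")
pesos_df <- data.frame(attr_importance = c(0.12, 0.30, 0.05),
                       row.names = Parametros)

# Creation: data.table() takes name = value pairs, just like data.frame().
# Extracting the column keeps Peso a plain numeric column.
PesosParam <- data.table(Param = Parametros, Peso = pesos_df$attr_importance)

# Modification: := is used inside [ ] on a data.table that already exists.
PesosParam[, Rank := rank(-Peso)]
print(PesosParam)
```

Extracting the column explicitly is a design choice made here for clarity; passing the whole data.frame, as in the solution posted above, also worked for the poster.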
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline
Then, using Sarah's suggestion, something like this?

    dat <- read.table(text="
    'Var 1' Var.2
    1 6
    2 7
    3 8
    4 9
    5 10", header=TRUE, col.names=c("Var_1", "Var.2"))
    dat
      Var_1 Var.2
    1     1     6
    2     2     7
    3     3     8
    4     4     9
    5     5    10

David C

From: John Sorkin [mailto:jsor...@grecc.umaryland.edu]
Sent: Monday, June 8, 2015 9:25 AM
To: David L Carlson
Cc: r-help@r-project.org
Subject: RE: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

> David,
> I appreciate your suggestion, but it won't work for me. I need the spaces to be replaced at the time the data are read, not afterward. My variable names contain periods that I want to keep; if I use your suggestion, I will replace the periods inserted when the data are read as well as the periods that I want to keep.
> Thank you,
> John
>
> David L Carlson dcarl...@tamu.edu 06/08/15 10:21 AM:
>> You can use gsub() to change the names:
>>
>>     dat <- data.frame("Var 1"=rnorm(5, 10), "Var 2"=rnorm(5, 15))
>>     dat
>>           Var.1    Var.2
>>     1  9.627122 14.15376
>>     2 10.741617 16.92937
>>     3  8.492926 15.23767
>>     4 12.226146 15.19834
>>     5  8.829982 14.46957
>>     names(dat) <- gsub("\\.", "_", names(dat))
>>     dat
>>           Var_1    Var_2
>>     1  9.627122 14.15376
>>     2 10.741617 16.92937
>>     3  8.492926 15.23767
>>     4 12.226146 15.19834
>>     5  8.829982 14.46957
>>
>> David L Carlson
>> Department of Anthropology
>> Texas A&M University
>> College Station, TX 77840-4352
>>
>> -----Original Message-----
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
>> Sent: Monday, June 8, 2015 9:16 AM
>> Cc: r-help@r-project.org
>> Subject: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline
>>
>> I am reading a csv file. The column headers have spaces in them. The spaces are replaced by a period. I want to replace the space by another character (e.g. the underline) rather than the period. Can someone tell me how to accomplish this?
>> Thank you,
>> John
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline
I've taken the liberty of copying this back to the list, so that others can participate in or benefit from the discussion.

On Mon, Jun 8, 2015 at 10:49 AM, John Sorkin jsor...@grecc.umaryland.edu wrote:
> Sarah,
> I am not sure how I use check.names to replace every space in the names of my variables with an underline. Can you show me how to do this? My current code is as follows:

check.names = FALSE just tells R not to reformat your column names. If they aren't already what you want, you'll need to do something else.

>     data <- read.csv("C:\\Users\\john\\Dropbox (Personal)\\HanlonMatt\\fullgenus3.csv")
>
> The problem I have is that my column names are not unique, e.g., I have multiple columns whose column names are (in CSV format):
>     X Y, X Y, X Y, X Y
> R reads the names as follows:
>     X.Y, X.Y.1, X.Y.2, X.Y.3
> I need to have the names look like:
>     X_Y, X_Y.1, X_Y.2, X_Y.3

You've been saying that you want to replace every space with an underscore, but that's not what your example shows. Instead, you want to let R import the names and add the identifying number (though if you do it yourself you can get the number to match the column number, which is neater), then change the FIRST underscore to a period.

I'd import them with check.names=FALSE, then modify them explicitly:

    mynames <- c("x y", "x y", "x y", "x y")
    mynames
    [1] "x y" "x y" "x y" "x y"
    mynames <- sub(" ", ".", mynames)
    mynames
    [1] "x.y" "x.y" "x.y" "x.y"
    mynames <- paste(mynames, seq_along(mynames), sep="_")
    mynames
    [1] "x.y_1" "x.y_2" "x.y_3" "x.y_4"

You could also let R modify them, then use sub() to change the first underscore to a period and leave the rest alone.

Sarah
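For the exact target John lists (X_Y, X_Y.1, X_Y.2, X_Y.3), one possible variant swaps the roles of the two separators: spaces become underscores, and base R's make.unique() appends the numeric suffix with a period. This is a sketch with an illustrative vector of names, not code from the thread.

```r
# Duplicate header names as they would arrive with check.names = FALSE.
mynames <- c("X Y", "X Y", "X Y", "X Y")

# Replace every space with an underscore; periods already present are untouched.
mynames <- gsub(" ", "_", mynames, fixed = TRUE)

# make.unique() leaves the first occurrence alone and appends .1, .2, ... to repeats.
mynames <- make.unique(mynames)
print(mynames)
```

The result can be assigned back with `names(data) <- mynames` after the import.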
[R-es] help: awk and shells in R
Hello,

I sometimes use Unix shells from R. Is there any way to use these shells, or the awk language, from Windows? The idea is always to do it from R; perhaps it is possible by invoking Cygwin from Windows, but it is not clear to me.

Best regards and thanks in advance,
Javier

    #_
    # EXAMPLE: what would have to go in place of ¿¿???,
    # assuming that I have Cygwin installed
    #_
    # An example would be changing some MËG to MEG, since fread does not read the Ë correctly
    file.rename(from = "Data/data.csv", to = "Data/data_2.csv")
    switch(OS,
           WIN = system( ¿¿???),
           MAC = system(command = "awk '{gsub(\"M.?G\",\"MEG\"); print}' Data/data_2.csv > Data/data_2_2.csv")
    )
    file.rename(from = "Data/data.csv", to = "Data/data_2.csv")
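A hedged sketch of how the same awk call could be issued on both platforms from R. The Cygwin location `C:/cygwin/bin/awk` is an assumption based on a default install, and the helper names `awk_path` and `fix_meg` are hypothetical. system2() is used instead of system() so that R itself redirects the output to a file, which avoids depending on a shell for the `>` operator.

```r
# Choose the awk executable. The Cygwin path is an assumption and must be
# adjusted if Cygwin is installed somewhere else.
awk_path <- function(os_type = .Platform$OS.type) {
  if (identical(os_type, "windows")) "C:/cygwin/bin/awk" else "awk"
}

# awk program: replace "M", an optional single character, "G" with "MEG" on every line.
awk_program <- "{gsub(/M.?G/, \"MEG\"); print}"

# Run awk on 'infile' and let system2() write its standard output to 'outfile'.
fix_meg <- function(infile, outfile, awk = awk_path()) {
  system2(awk, args = c(shQuote(awk_program), shQuote(infile)), stdout = outfile)
}

# Example call (hypothetical file names, not run here):
# fix_meg("Data/data_2.csv", "Data/data_2_2.csv")
```

Quoting rules differ between Windows and Unix shells, so the command is worth testing on a small file first.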
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline
On 08/06/2015 10:23 AM, Sarah Goslee wrote:
> Easiest? Use sub() to replace the periods after the fact.
>
> You can also use the check.names or the col.names arguments to read.table() to customize your import.

Yes, check.names is the right idea. Use check.names = FALSE, then use sub() or gsub() to replace the spaces with underscores.

Duncan Murdoch

> Sarah
>
> On Mon, Jun 8, 2015 at 10:15 AM, John Sorkin jsor...@grecc.umaryland.edu wrote:
>> I am reading a csv file. The column headers have spaces in them. The spaces are replaced by a period. I want to replace the space by another character (e.g. the underline) rather than the period. Can someone tell me how to accomplish this?
>> Thank you,
>> John
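A minimal sketch of the approach Duncan describes, using an inline CSV string in place of a real file (the column names and values are illustrative). Because the names are never converted to periods, periods that were already in the header survive.

```r
# Inline CSV standing in for the file; one header already contains a period.
csv_text <- "Var 1,Var.2,Var 3\n1,6,11\n2,7,12"

# check.names = FALSE keeps the header exactly as written, spaces included.
dat <- read.csv(text = csv_text, check.names = FALSE)

# Replace only the spaces; "Var.2" is left alone.
names(dat) <- gsub(" ", "_", names(dat), fixed = TRUE)
print(names(dat))
```

For a file on disk, the `text =` argument would be replaced by the file path.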
[R] [R-pkgs] New package stepR: fitting step-functions
Dear R users,

It is my pleasure to announce the availability of package stepR (1.0-2) on CRAN.

The main purpose of the package is to fit piecewise constant functions (a.k.a. step-functions or block signals) to serial data in a fully data-driven manner under certain (Gaussian or non-Gaussian) distributional assumptions. It mainly implements the algorithms described in the references below - in a (hopefully) user-friendly fashion.

Try

library(stepR)
example(smuceR)   # for [1] and [2]
example(jsmurf)   # for [3]
example(stepsel)  # for [4]

to get an idea about what it can do, and how to use it.

We hope it proves useful; community feedback is therefore very welcome!

Best regards
Thomas Hotz
TU Ilmenau, Institute of Mathematics

References:
[1] Frick, K., Munk, A., and Sieling, H. (2014). Multiscale Change-Point Inference. With discussion and rejoinder by the authors. Journal of the Royal Statistical Society, Series B, 76(3), 495-580.
[2] Futschik, A., Hotz, T., Munk, A., and Sieling, H. (2014). Multiresolution DNA partitioning: statistical evidence for segments. Bioinformatics, 30(16), 2255-2262.
[3] Hotz, T., Schütte, O., Sieling, H., Polupanow, T., Diederichsen, U., Steinem, C., and Munk, A. (2013). Idealizing Ion Channel Recordings by a Jump Segmentation Multiresolution Filter. IEEE Transactions on NanoBioscience, 12(4), 376-386.
[4] Boysen, L., Kempe, A., Liebscher, V., Munk, A., and Wittich, O. (2009). Consistencies and rates of convergence of jump-penalized least squares estimators. The Annals of Statistics, 37(1), 157-183.

___
R-packages mailing list r-packa...@r-project.org https://stat.ethz.ch/mailman/listinfo/r-packages
[R] [R-pkgs] New R package kwb.hantush (0.2.1): calculation of groundwater mounding beneath an (stormwater) infiltration basin
Dear R users,

It is a pleasure for me to announce the availability of the new package kwb.hantush (0.2.1) on CRAN.

Its objective is the calculation of groundwater mounding beneath a (stormwater) infiltration basin by solving the Hantush (1967) equation. To check that the algorithm is implemented correctly, the R modelling results were cross-checked against the alternative models assessed in Carleton (2010), using the same model parameterisation.

References:
[1] Carleton, G.B., 2010, Simulation of groundwater mounding beneath hypothetical stormwater infiltration basins: U.S. Geological Survey Scientific Investigations Report 2010-5102, 64 p.; http://pubs.usgs.gov/sir/2010/5102/
[2] Hantush, M.S. (1967): Growth and decay of groundwater-mounds in response to uniform percolation, Water Resources Research (March, 1967); http://doi.org/10.1029/WR003i001p00227

How to get started?
* First, install the package from CRAN by using:
  install.packages("kwb.hantush")
* Second, visit http://kwb-r.github.io/kwb.hantush for a short tutorial, which answers the following questions:
  - How do I perform model runs?
  - How accurate is the solution of the Hantush equation implemented in R compared to a reference model (e.g. the U.S. Geological Survey EXCEL spreadsheet solution, http://pubs.usgs.gov/sir/2010/5102/support/Hantush_USGS_SIR_2010-5102-1110.xlsm)?

I hope it proves to be useful; community feedback is therefore very welcome!

Best Regards,
Michael Rustler

Dipl.-Geoök. Michael Rustler
KompetenzZentrum Wasser Berlin gGmbH
Cicerostr. 24
D-10709 Berlin
Tel. +49 (0)30 53653 825
Fax +49 (0)30 53653 888
Email: michael.rust...@kompetenz-wasser.de
Homepage: http://www.kompetenz-wasser.de
Geschäftsführer: Dipl.-Ing. Andreas Hartmann
Sitz der Gesellschaft: Berlin
Amtsgericht Charlottenburg HRB 84461
[R] [R-pkgs] New version of wikipediatrend
Dear UseRs,

wikipediatrend - a package to retrieve Wikipedia page access statistics - has jumped from version 0.2 to 1.1.3 and is now more streamlined, richer in features, better tested, and comes with a vignette as well as a lot of fun.

package information: http://cran.rstudio.com/web/packages/wikipediatrend
vignette: http://cran.rstudio.com/web/packages/wikipediatrend/vignettes/using-wikipediatrend.html
project page: https://github.com/petermeissner/wikipediatrend

Best, Peter

NEWS wikipediatrend
===================

version 1.1.3 // 2015-06-04 ...
--------------------------------
- modifying vignette to comply with CRAN policies (dropping lines installing packages if not present)

version 1.1.2 // 2015-05-23 ...
--------------------------------
- modifying caching to comply with CRAN policies
- changing default folder of cache file from temp (basename(tempdir())) to Rtemp ( tempdir() )

version 1.1.1 // 2015-05-23 ...
--------------------------------
- adding ghrr as additional repo to comply with CRAN policies
- changing default folder of cache file from home (~) to temp (basename(tempdir()))

version 1.1.0 // 2015-05-21 ...
--------------------------------
- feature: caching has been overhauled
- feature: wp_trend() now tries to guess whether the page was supplied as a title with possible special characters or as a (url-encoded) URL part, and takes care of further processing
- bug-fix: special character support of the package was lousy and prevented the usage of articles of non-standard languages (especially on Windows)
  * introduction of the wp_df class to allow for a print.wp_df that a) shortens long strings on print and b) does not use format() (format() causes UTF-8 characters to be replaced by <U+...> strings (probably only))
  * using a package specific write_utf8_csv() and read_utf8_csv() to be able to store and cache data for articles with special character names (even under Windows, write.csv() does not allow enforcing a specific encoding)
- bug-fix / backward compatibility: with version 1.0.0 old parameters for wp_trend() were causing errors
- bug-fix: wp_cache_reset() would stop with an error if called twice in a row - fixed

version 1.0.0 // 2015-04-01 ...
--------------------------------
- api-change: option userAgent deleted: the default is to send information on versions of R, wikipediatrend, curl as well as RCurl
- api-change: option requestFrom deleted: the default is to not send the header
- feature: wp_trend() now by default caches data retrievals in a temporary file
- feature: wp_trend(file="save.csv") now allows specifying a file where retrievals are stored (this will always add to the already existing data)
- feature: wp_trend() now allows specifying more than one page and/or language at a time; data will then be retrieved for every combination of page-language and date
- feature: caching system is persistent: wp_cache_file() will report the file used for caching; wp_cache_reset() will reset the cache; wp_cache_load() will return its content as data.frame()
- feature: while wp_trend() now (invisibly) returns only data from the current request at hand, the new function wp_cache() will retrieve data from cache files (by default / if no file name is specified it retrieves data from .wp_trend_cache)
- api-change: the data returned by wp_trend(), cached in the cache file and retrieved by wp_cache(), now consists of more variables: date, count, project, title, rank, month
- feature: testthat tests now check base functionality of the package
- bug-fix: non-existing page views for a month led to an error, fixed
- bug-fix: wp_trend() now checks date inputs better for logical inconsistencies

version 0.2.0 // 2014-11-01 ...
--------------------------------
- first publication on CRAN
Re: [R] mismatch between match and unique causing ecdf (well, approxfun) to fail
> Aehm, adding on this: I incorrectly *assumed* without testing that rounding would help; it doesn't:
>
> ecdf(round(test2,0)) # a rounding that is way too rough for my application...
> #Error in xy.coords(x, y) : 'x' and 'y' lengths differ
>
> Digging deeper: The initially mentioned call to unique() is not very helpful, as test2 is a data frame, so I get what I deserve, an unchanged data frame with 1 row. Still, the issue remains and can even be simplified further:
>
> ecdf(data.frame(a=3, b=4))
> Empirical CDF
> Call: ecdf(data.frame(a = 3, b = 4))
>  x[1:2] = 3, 4
>
> works ok, but
>
> ecdf(data.frame(a=3, b=3))
> Error in xy.coords(x, y) : 'x' and 'y' lengths differ
>
> doesn't (same for a=b=1 or 2, so likely the same for any a=b). Instead,
>
> ecdf(c(a=3, b=3))
> Empirical CDF
> Call: ecdf(c(a = 3, b = 3))
>  x[1:1] = 3
>
> does the trick. From ?ecdf, I get that x should be a numeric vector - apparently, my misuse of the function by applying it to a row of a data frame (i.e. a data frame with one row). In all my other (dozens of) cases that worked ok, though, but not for this particular one. A simple unlist() helps:

You were lucky. To use a one-row data frame instead of a numerical vector will typically *not* work unless ... well, you are lucky.

No, do *not* pass data frame rows instead of numeric vectors.

> ecdf(unlist(data.frame(a=3, b=3)))
> Empirical CDF
> Call: ecdf(unlist(data.frame(a = 3, b = 3)))
>  x[1:1] = 3
>
> Yet, I'm even more confused than before: in my other data, there were also duplicated values in the vector (1-row-data frame), and it never caused any issue. For this particular example, it does. I must be missing something fundamental...

well.. I'm confused about why you are confused, but if you are thinking about passing rows of data frames as numeric vectors, this means you are sure that your data frame only contains classical numbers (no factors, no 'Date's, no...).

In such a case, transform your data frame to a numerical matrix *once*, preferably using data.matrix(d.fr) instead of just as.matrix(d.fr), but in this case it should not matter. Then *check* the result and then work with that matrix from then on.

All other things probably will continue to leave you confused .. ;-)

Martin Maechler, ETH Zurich
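The recommendation above (convert once with data.matrix(), check the result, then work with matrix rows) might look like the following sketch. The data frame d.fr and its values are made up for illustration; only base R and the stats package are used.

```r
# Hypothetical all-numeric data frame; the first row contains a duplicated value.
d.fr <- data.frame(a = c(3, 1), b = c(3, 4), c = c(5, 4))

# Convert once to a numeric matrix and check the result before going on.
m <- data.matrix(d.fr)
stopifnot(is.matrix(m), is.numeric(m))

# A matrix row is a plain numeric vector, which is what ecdf() expects.
Fn <- ecdf(m[1, ])
print(knots(Fn))  # the distinct values of the first row
print(Fn(3))      # proportion of values <= 3 in the first row

# unlist() on a one-row data frame gives the same numeric vector.
Fn2 <- ecdf(unlist(d.fr[1, ]))
```

Working from the matrix avoids repeating the data-frame-to-vector conversion for every row.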
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blanks with an underline
On 08 Jun 2015, at 17:03, Sarah Goslee sarah.gos...@gmail.com wrote:

> I'd import them with check.names=FALSE, then modify them explicitly:
>
> mynames <- c("x y", "x y", "x y", "x y")
> mynames
> [1] "x y" "x y" "x y" "x y"
> mynames <- sub(" ", ".", mynames)
> mynames
> [1] "x.y" "x.y" "x.y" "x.y"
> mynames <- paste(mynames, seq_along(mynames), sep="_")
> mynames
> [1] "x.y_1" "x.y_2" "x.y_3" "x.y_4"

Didn't he want x_y.1, not x.y_1? Obviously, just switch . and _ for that.

A potential improvement (in case not all columns are "x y") is to replace the last bit with make.unique(mynames).

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk Priv: pda...@gmail.com
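The make.unique() suggestion can be sketched as below, with . and _ switched so that the result has the x_y.1 form asked for. The example names are taken from the thread plus one made-up mixed case; make.unique() is base R and appends .1, .2, ... to repeated entries only.

```r
# Names as they come out of read.csv(..., check.names = FALSE).
mynames <- c("x y", "x y", "x y", "x y")

# Replace the space with an underscore, then number the duplicates.
mynames <- gsub(" ", "_", mynames, fixed = TRUE)
mynames <- make.unique(mynames)
print(mynames)
# [1] "x_y"   "x_y.1" "x_y.2" "x_y.3"

# With mixed names only the repeated ones get a suffix.
mixed <- make.unique(c("x_y", "a", "x_y"))
print(mixed)
# [1] "x_y"   "a"     "x_y.1"
```

Unlike paste() with seq_along(), this leaves names that occur only once untouched.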
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blanks with an underline
Sarah,
Many, many thanks.
John

On Jun 8, 2015, at 11:04 AM, Sarah Goslee sarah.gos...@gmail.com wrote:

> I've taken the liberty of copying this back to the list, so that others can participate in or benefit from the discussion.
>
> On Mon, Jun 8, 2015 at 10:49 AM, John Sorkin jsor...@grecc.umaryland.edu wrote:
>> Sarah,
>> I am not sure how I use check.names to replace every space in the names of my variables with an underline. Can you show me how to do this? My current code is as follows:
>
> check.names just tells R not to reformat your column names. If they aren't already what you want, you'll need to do something else.
>
>> data <- read.csv("C:\\Users\\john\\Dropbox (Personal)\\HanlonMatt\\fullgenus3.csv")
>>
>> The problem I have is that my column names are not unique, e.g., I have multiple columns whose column names are (in CSV format):
>> "X Y", "X Y", "X Y", "X Y"
>> R reads the names as follows:
>> X.Y, X.Y.1, X.Y.2, X.Y.3
>> I need to have the names look like:
>> X_Y, X_Y.1, X_Y.2, X_Y.3
>
> You've been saying that you want to replace every space with an underscore, but that's not what your example shows. Instead, you want to let R import the names and add the identifying number (though if you do it yourself you can get the number to match the column number, which is neater), then change the FIRST underscore to a period.
>
> I'd import them with check.names=FALSE, then modify them explicitly:
>
> mynames <- c("x y", "x y", "x y", "x y")
> mynames
> [1] "x y" "x y" "x y" "x y"
> mynames <- sub(" ", ".", mynames)
> mynames
> [1] "x.y" "x.y" "x.y" "x.y"
> mynames <- paste(mynames, seq_along(mynames), sep="_")
> mynames
> [1] "x.y_1" "x.y_2" "x.y_3" "x.y_4"
>
> You could also let R modify them, then use sub() to change the first underscore to a period and leave the rest alone.
>
> Sarah
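The second route mentioned above, letting R build the names and then fixing them with sub(), can be sketched as follows, in the direction that the requested names X_Y, X_Y.1, ... require (first period to underscore). make.names(unique = TRUE) is used here as a stand-in for what read.csv() does to headers by default; the vector raw_names is an assumption standing in for the real header row.

```r
# Headers as they appear in the CSV file (stand-in for the real header row).
raw_names <- c("X Y", "X Y", "X Y", "X Y")

# What read.csv() produces with the default check.names = TRUE.
r_names <- make.names(raw_names, unique = TRUE)
print(r_names)
# [1] "X.Y"   "X.Y.1" "X.Y.2" "X.Y.3"

# sub() changes only the first match; fixed = TRUE treats "." literally
# instead of as the regular-expression wildcard.
new_names <- sub(".", "_", r_names, fixed = TRUE)
print(new_names)
# [1] "X_Y"   "X_Y.1" "X_Y.2" "X_Y.3"
```

This assumes each original header contains exactly one space; headers with several spaces would need gsub() on the original names instead.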
Re: [R-es] help awk y shells en R
Hello,

Take a look at this:
http://stackoverflow.com/questions/18603984/using-system-with-windows

Regards,
Carlos Ortega
www.qualityexcellence.es
Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blanks with an underline
> mynames
> [1] "x.y" "x.y" "x.y" "x.y"
> mynames <- paste(mynames, seq_along(mynames), sep="_")

In addition, if there were a variety of names in mynames and you wanted to number each unique name separately you could use ave():

origNames <- c("X", "Y", "Y", "X", "Z", "X")
ave(origNames, origNames, FUN=function(x)paste0(x, "_", seq_along(x)))
[1] "X_1" "Y_1" "Y_2" "X_2" "Z_1" "X_3"
ave(origNames, origNames, FUN=function(x)if(length(x)==1) x else paste0(x, "_", seq_along(x)))
[1] "X_1" "Y_1" "Y_2" "X_2" "Z"   "X_3"

Bill Dunlap
TIBCO Software
wdunlap tibco.com