date:20150608

Re: [R-es] help awk y shells en R

2015-06-08 Thread javier.ruben.marcuzzi

Dear Javier Villacampa González,


It has been a long time since I last used awk or gawk, but I remember Cygwin, and 
personally I had no problems with awk.


I don't know what state that technology is in today, but I once evaluated using R 
together with awk; there is an integration for that at 
http://www.inside-r.org/packages/cran/Kmisc/docs/awk


Javier Rubén Marcuzzi
Dairy Industry Technician
Veterinarian


From: Javier Villacampa González
Sent: Monday, 8 June 2015, 03:05 p.m.
To: Carlos Ortega
CC: R-help-es@r-project.org





___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es

Re: [R] web scraping image

2015-06-08 Thread Curtis DeGasperi

Thanks to Jim's prompting, I think I came up with a fairly painless way to
parse the HTML without having to write any parsing code myself using the
function getHTMLExternalFiles in the XML package. A working version of the
code follows:

## Code to process USGS peak flow data

require(dataRetrieval)
require(XML)

## Need to start with list of gauge ids to process

siteno <- c('12142000','12134500','12149000')

lstas <- length(siteno) # length of locator list

print(paste('Processing...', siteno[1], sep = ""))

datall <- readNWISpeak(siteno[1])

for (a in 2:lstas) {
  # Print station being processed
  print(paste('Processing...', siteno[a], sep = ""))

  dat <- readNWISpeak(siteno[a])

  datall <- rbind(datall, dat)

}

write.csv(datall, file = "usgs_peaks.csv")

# Retrieve ascii text files and graphics
for (a in 1:lstas) {

  print(paste('Processing...', siteno[a], sep = ""))

  # URL of the page that embeds the peak-flow graphic
  graphic.url <-
    paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=', siteno[a],
          '&agency_cd=USGS&format=img', sep = "")
  # Path of the image referenced by that page, then the full image URL
  usgs.img <- getHTMLExternalFiles(graphic.url)
  graphic.img <- paste('http://nwis.waterdata.usgs.gov', usgs.img, sep = "")

  peakfq.url <-
    paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=', siteno[a],
          '&agency_cd=USGS&format=hn2', sep = "")
  tab.url <-
    paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=', siteno[a],
          '&agency_cd=USGS&format=rdb', sep = "")

  graphic.fn <- paste('graphic_', siteno[a], '.gif', sep = "")
  peakfq.fn <- paste('peakfq_', siteno[a], '.txt', sep = "")
  tab.fn <- paste('tab_', siteno[a], '.txt', sep = "")
  download.file(graphic.img, graphic.fn, mode = 'wb')
  download.file(peakfq.url, peakfq.fn)
  download.file(tab.url, tab.fn)
}

 --

 Message: 34
 Date: Fri, 5 Jun 2015 08:59:04 +1000
 From: Jim Lemon drjimle...@gmail.com
 To: Curtis DeGasperi curtis.degasp...@gmail.com
 Cc: r-help mailing list r-help@r-project.org
 Subject: Re: [R] web scraping image

 Hi Chris,
 I don't have the packages you are using, but tracing this indicates
 that the page source contains the relative path of the graphic, in
 this case:

 /nwisweb/data/img/USGS.12144500.19581112.20140309..0.peak.pres.gif

 and you already have the server URL:

 nwis.waterdata.usgs.gov

 getting the path out of the page source isn't difficult, just split
 the text at double quotes and get the token following img src=. If I
 understand the arguments of download.file correctly, the path is the
 graphic.fn argument and the server URL is the graphic.url argument. I
 would paste them together and display the result to make sure that it
 matches the image you want. When I did this, the correct image
 appeared in my browser. I'm using Google Chrome, so I don't have to
 prepend the http://
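
Jim's suggestion above (take the token that follows img src= and prepend the server URL) could be sketched in base R roughly as follows. This is only an illustration, not the code used in this thread: extract_img_src is a hypothetical helper, the page string is a made-up stand-in for the downloaded page source, and a regular expression is used in place of a literal split at double quotes.

```r
# Hypothetical helper: return the src attribute of every <img> tag in a
# string of HTML (simple regular expression, not a full HTML parser).
extract_img_src <- function(html) {
  tags <- regmatches(html, gregexpr('<img[^>]*src="[^"]*"', html,
                                    ignore.case = TRUE))[[1]]
  sub('.*src="([^"]*)"$', "\\1", tags, ignore.case = TRUE)
}

# Made-up page source containing the relative path quoted in the message.
page <- paste0('<html><body><img alt="peak" ',
               'src="/nwisweb/data/img/USGS.12144500.19581112.20140309..0.peak.pres.gif">',
               '</body></html>')

# Prepend the server URL to obtain something download.file() can fetch.
img.path <- extract_img_src(page)
img.url <- paste0("http://nwis.waterdata.usgs.gov", img.path)
```

The resulting img.url could then be passed to download.file() with mode = 'wb', since the target is a binary image.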

 Jim

 On Fri, Jun 5, 2015 at 2:31 AM, Curtis DeGasperi
 curtis.degasp...@gmail.com wrote:
 I'm working on a script that downloads data from the USGS NWIS server.
 dataRetrieval makes it easy to quickly get the data in a neat tabular
 format, but I was also interested in getting the tabular text files -
 also fairly easy for me using download.file.

 However, I'm not skilled enough to work out how to download the nice
 graphic files that can be produced dynamically from the USGS NWIS
 server (for example:

http://nwis.waterdata.usgs.gov/nwis/peak?site_no=12144500&agency_cd=USGS&format=img
)

 My question is how do I get the image from this web page and save it
 to a local directory? scrapeR returns the information from the page
 and I suspect this is a possible solution path, but I don't know what
 the next step is.

 My code provided below works from a list I've created of USGS flow
 gauging stations.

 Curtis

 ## Code to process USGS daily flow data for high and low flow analysis
 ## Need to start with list of gauge ids to process
 ## Can't figure out how to automate download of images

 require(dataRetrieval)
 require(data.table)
 require(scrapeR)

 df <- read.csv("usgs_stations.csv", header=TRUE)

 lstas <- length(df$siteno) # length of locator list

 print(paste('Processing...', df$name[1], ' ', df$siteno[1], sep = ""))

 datall <- readNWISpeak(df$siteno[1])

 for (a in 2:lstas) {
   # Print station being processed
   print(paste('Processing...', df$name[a], ' ', df$siteno[a], sep = ""))

   dat <- readNWISpeak(df$siteno[a])

   datall <- rbind(datall, dat)

 }

 write.csv(datall, file = "usgs_peaks.csv")

 # Retrieve ascii text files and graphics

 for (a in 1:lstas) {

   # Print station being processed (index a, not 1)
   print(paste('Processing...', df$name[a], ' ', df$siteno[a], sep = ""))

   graphic.url <-
     paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=',
           df$siteno[a], '&agency_cd=USGS&format=img', sep = "")
   peakfq.url <-
     paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=',
           df$siteno[a], '&agency_cd=USGS&format=hn2', sep = "")
   tab.url <-
     paste('http://nwis.waterdata.usgs.gov/nwis/peak?site_no=',
           df$siteno[a], '&agency_cd=USGS&format=rdb', sep = "")

   graphic.fn <-

Re: [R-es] help awk y shells en R

2015-06-08 Thread Javier Villacampa González

In the end it turned out to be easier than expected. You need to install Cygwin and
call the commands as follows:

system('C:/cygwin/bin/wc -l var_risco_2012.csv')
In principle, this works.
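
A slightly more portable variant of the call above, as a minimal sketch only. The helper names unix_tool and count_lines are hypothetical, the default Cygwin directory is simply the one used in the message, and system2() is swapped in for system() so that the tool and its arguments are passed separately.

```r
# Hypothetical helper: on Windows, prefix the tool with the Cygwin bin
# directory (path taken from the message above); elsewhere use it as is.
unix_tool <- function(tool, cygwin_bin = "C:/cygwin/bin") {
  if (.Platform$OS.type == "windows") file.path(cygwin_bin, tool) else tool
}

# Count the lines of a file with `wc -l` and return the count as an integer.
count_lines <- function(path) {
  out <- system2(unix_tool("wc"), args = c("-l", shQuote(path)), stdout = TRUE)
  as.integer(strsplit(trimws(out), "[[:space:]]+")[[1]][1])
}

# Small demonstration on a temporary three-line file.
tmp <- tempfile(fileext = ".csv")
writeLines(c("a,b", "1,2", "3,4"), tmp)
n_lines <- count_lines(tmp)
```

The same unix_tool("awk") pattern would presumably cover the awk case from the original question, provided awk is installed in the Cygwin bin directory.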

On 8 June 2015, 17:41, Carlos Ortega c...@qualityexcellence.es
wrote:

 Hello,

 Take a look at this:

 http://stackoverflow.com/questions/18603984/using-system-with-windows

 Regards,
 Carlos Ortega
 www.qualityexcellence.es

 On 8 June 2015, 17:14, Javier Villacampa González 
 javier.villacampa.gonza...@gmail.com wrote:

 Hello,

 I sometimes call Unix shell commands from R. Is there a way to use these
 shell commands, or the awk language, from Windows?

 The idea is to always do it from R; perhaps it is possible by invoking Cygwin
 from Windows, but it is not clear to me how.

 Best regards, and thanks in advance

 Javier
 #_
 # EXAMPLE: what should go in place of
 # ¿¿???
 # assuming that I have Cygwin installed?
 #_

 # An example would be replacing MËG with MEG, since fread does not read
 # the Ë correctly

 file.rename(from = "Data/data.csv", to = "Data/data_2.csv")
 switch(OS,
   WIN = system( ¿¿??? ),
   MAC = system(command = "awk \'{gsub(\"M.?G\",\"MEG\"); print}\' Data/data_2.csv > Data/data_2_2.csv")
 )
 file.rename(from = "Data/data.csv", to = "Data/data_2.csv")


[R] subsetting a dataframe

2015-06-08 Thread Bogdan Tanasa

Dear all,

would appreciate your suggestions on subsetting a dataframe : please let's
consider an example dataframe df:

dd <- c(1,2,3)
rows <- c("A1","A2","A3")
columns <- c("B1","B2","B3")
numbers <- c(400, 500, 600)
df <- data.frame(dd, rows, columns, numbers)

and a vector : test_rows <- c("A1","A3") ;

how could I subset the data frame df as a function of the vector test_rows, in
such a way that only the rows of df whose df$rows value matches an element
of test_rows (A1 and A3) are listed ?

thank you very much,

-- bogdan

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] load a very big .RData - error reading from connection

2015-06-08 Thread carol white via R-help

Hi,
How can I load a very big .RData file? It is so big that it can't be loaded, and 
the following error message is displayed:

load(".RData")
Error: error reading from connection
Thanks
Carol
 

[R] Looping Through List of .csv Files to Work with Subsets of the Data

2015-06-08 Thread Chad Danyluck

Hello,

I want to subset specific rows of data from 80 .csv files and write those
subsets into new .csv files. The data I want to subset starts on a
different row for each original .csv file. I've created variables that
identify which row the subset should start and end on, but I want to loop
through this process and I am not sure what to do. I've attempted to write
the loop below, albeit, much of it is pseudo code. If anyone can provide me
with some tips I'd appreciate it.

# This data file is used to create the variables where the subsetting
# starts and ends for each participant
mig.data <- read.csv("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data & Syntax/mig.data.csv")

# These are the variable names for the start and end of each subset of
# relevant data (baseline, audio, and free)
participant.ids <- mig.processed.data$participant.id
participant.baseline.start <- mig.processed.data$baseline.row.start
participant.baseline.end <- mig.processed.data$baseline.row.end
participant.audio.start <- mig.processed.data$audio.meditation.row.start
participant.audio.end <- mig.processed.data$audio.meditation.row.end
participant.free.start <- mig.processed.data$free.meditation.row.start
participant.free.end <- mig.processed.data$free.meditation.row.end

# read into a list the individual files from which to subset the data
participant.files <- list.files("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data & Syntax/MIG_RAW DATA & TXT Files/Plain Text Files")

# loop through each participant
for (i in 1:length(participant.files)) {

# get baseline rows
results.baseline <-
participant.files[participant.baseline.start[i]:participant.baseline.end[i],]

# get audio rows
results.audio <-
participant.files[participant.audio.start[i]:participant.audio.end[i],]

# get free rows
results.free <-
participant.files[participant.free.start[i]:participant.free.end[i],]

# write out participant relevant data
write.csv(results.baseline, file = "baseline[i].csv")
write.csv(results.audio, file = "audio[i].csv")
write.csv(results.free, file = "free[i].csv")

}

-- 
Chad M. Danyluck, MA
PhD Candidate, Psychology
University of Toronto



“There is nothing either good or bad but thinking makes it so.” - William
Shakespeare


Re: [R] subsetting a dataframe

2015-06-08 Thread William Dunlap

Use is.element(elements,set), or its equivalent, elements %in% set:

df <- data.frame(dd = c(1, 2, 3),
                 rows = c("A1", "A2", "A3"),
                 columns = c("B1", "B2", "B3"),
                 numbers = c(400, 500, 600))
test_rows <- c("A1", "A3")
df[ is.element(df$rows, test_rows), ]
#  dd rows columns numbers
#1  1   A1      B1     400
#3  3   A3      B3     600
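
For comparison, the %in% form mentioned above can be written as in the following small sketch, which reuses the example data from the question. The subset() line is an additional base R alternative that was not part of the original answer.

```r
# Example data from the question.
df <- data.frame(dd = c(1, 2, 3),
                 rows = c("A1", "A2", "A3"),
                 columns = c("B1", "B2", "B3"),
                 numbers = c(400, 500, 600))
test_rows <- c("A1", "A3")

# Logical index: TRUE for each row whose 'rows' value occurs in test_rows.
keep <- df$rows %in% test_rows

# Two ways of selecting the same rows.
by_in <- df[keep, ]
by_subset <- subset(df, rows %in% test_rows)
```

Both by_in and by_subset keep rows 1 and 3 of df.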


Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: [R] Looping Through List of .csv Files to Work with Subsets of the Data

2015-06-08 Thread Chad Danyluck

Thank you Don.

I've incorporated your suggestions which have helped me to understand how
loops work better than previously. However, the loop gets stuck trying to
read the current file:

mig.processed.data <- read.csv("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data & Syntax/mig.log.data.addition.csv")

## ASSUMPTION: Starting with augmented processedbook and correct
## free.meditation.end
# Read in all data files and Loop through to create new data files
# segmented by the rows identified before

# get required data
participant.ids <- mig.processed.data$participant.id
participant.baseline.start <- mig.processed.data$baseline.row.start
participant.baseline.end <- mig.processed.data$baseline.row.end
participant.audio.start <- mig.processed.data$audio.meditation.row.start
participant.audio.end <- mig.processed.data$audio.meditation.row.end
participant.free.start <- mig.processed.data$free.meditation.row.start
participant.free.end <- mig.processed.data$free.meditation.row.end

participant.files <- list.files("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data & Syntax/MIG_RAW DATA & TXT Files/Plain Text Files")

for (i in 1:length(participant.files)) {

  id <- participant.files[i]

  ## if id is numeric, e.g., 1, 2, 3 ... 80 then I would do this
  ## to ensure that the files sort properly when viewed by the operating
  ## system
  idc <- formatC(id, width=3, flag='0')

  # current file
  crnt.file[i] <- read.csv( participant.files[i] )

  ## base
  tmp.base <-
    crnt.file[participant.baseline.start:participant.baseline.end, ]
  write.csv(tmp.base, file=paste0('baseline',idc,'.csv'))


  ## audio
  tmp.audio <- crnt.file[participant.audio.start:participant.audio.end, ]
  write.csv(tmp.audio, file=paste0('audio',idc,'.csv'))



  ## free
  tmp.free <- crnt.file[participant.free.start:participant.free.end, ]
  write.csv(tmp.free, file=paste0('free',idc,'.csv'))

}

The error message reads:

Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") : cannot open file '103.csv': No such file or directory

So it seems to be calling the first file in the list but getting stuck. Any
suggestions?

Best,

Chad


Re: [R] Looping Through List of .csv Files to Work with Subsets of the Data

2015-06-08 Thread William Dunlap

   participant.files <- list.files("/Users/cdanyluck/Documents/Studies/MIG - Dissertation/Data & Syntax/MIG_RAW DATA & TXT Files/Plain Text Files")

Try adding the argument full.names=TRUE to that call to list.files().
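
The effect of that argument can be seen in a small self-contained sketch. Everything here is illustrative: the temporary directory, the file names 101.csv and 103.csv, the columns and the row range 2:4 are made up and stand in for the real participant files.

```r
# Build a throwaway directory with two made-up participant files.
data_dir <- tempfile("plain_text_files_")
dir.create(data_dir)
for (id in c(101, 103)) {
  write.csv(data.frame(row = 1:10, value = id + 1:10),
            file.path(data_dir, paste0(id, ".csv")), row.names = FALSE)
}

# Default: bare file names, which read.csv() can only open when data_dir
# happens to be the working directory.
short_names <- list.files(data_dir, pattern = "\\.csv$")

# full.names = TRUE: complete paths that can be opened from anywhere.
full_paths <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

for (i in seq_along(full_paths)) {
  crnt.file <- read.csv(full_paths[i])
  # Participant id recovered from the file name, e.g. "103".
  id <- sub("\\.csv$", "", basename(full_paths[i]))
  tmp.base <- crnt.file[2:4, ]  # illustrative row range
  write.csv(tmp.base, file.path(data_dir, paste0("baseline", id, ".csv")),
            row.names = FALSE)
}
```

With the bare names in short_names, read.csv() looks in the current working directory, which matches the "cannot open file '103.csv'" message reported in this thread.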

Bill Dunlap
TIBCO Software
wdunlap tibco.com


Re: [R] Looping Through List of .csv Files to Work with Subsets of the Data

2015-06-08 Thread MacQueen, Don

So you have 80 files, one for each participant?

It appears that from each of the 80 files you want to extract three
subsets of rows,
  one set for baseline
  one set for audio
  one set for free

What I think I would do, if the above is correct, is create one master
file. This file will have eight columns:
(I'll show an example column name, followed by a description)
  id  participant id
  fn   file name for that participant
  srb  start row for baseline
  erb  end row for baseline
  sra  start row for audio
  era  end row for audio
  srf  start row for free
  erf  end row for free

This may be fairly close to what you already have, but I'm not sure.

I would then load the master file into R
  mstf <- read.csv( {the master file} )

Then loop through its rows, and since each row has all the information
necessary to read the participant's individual file and identify which
rows to subset, a loop like this should work.

for (irow in seq_len(nrow(mstf))) {

  id <- mstf$id[irow]
  ## if id is numeric, e.g., 1, 2, 3 ... 80 then I would do this
  ## to ensure that the files sort properly when viewed by the operating
  ## system
  idc <- formatC(id, width=2, flag='0')

  ## as.character() in case the file names were read in as a factor
  crnt.file <- read.csv( as.character(mstf$fn[irow]) )

  ## base
  tmp.base <- crnt.file[ mstf$srb[irow]:mstf$erb[irow] , ]
  write.csv(tmp.base, file=paste0('baseline',idc,'.csv'))


  ## audio
  tmp.audio <- crnt.file[ mstf$sra[irow]:mstf$era[irow] , ]
  write.csv(tmp.audio, file=paste0('audio',idc,'.csv'))



  ## free
  tmp.free <- crnt.file[ mstf$srf[irow]:mstf$erf[irow] , ]
  write.csv(tmp.free, file=paste0('free',idc,'.csv'))

}


Obviously, I can't test this. And there may be (likely are!) some typos in
it.

Note that it's not necessary to create variables that identify which row
the subset should start and end on; these are just looked up from the
master file when needed. Similarly, the three respective subsets are
stored in temporary data frames, because they are not (I presume) needed
when the whole thing is done. (if they were needed, then a different
strategy would be more appropriate)

There are different ways to index the loop. I just picked one.
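
As an illustration of two such ways of indexing, here is a minimal sketch on toy data. The data frames below are made-up stand-ins that reuse the column names from the master-file description; only the row counts of each subset are kept so the two approaches can be compared.

```r
# Toy stand-in for one participant's data file (10 rows).
crnt.file <- data.frame(sample = 1:10, value = (1:10)^2)

# Toy master table with start/end rows for baseline and audio.
mstf <- data.frame(id = c(1, 2),
                   srb = c(1, 2), erb = c(3, 4),
                   sra = c(4, 5), era = c(6, 7))

# Option 1: loop over row numbers; seq_len() also copes with zero rows.
n_base <- integer(0)
for (irow in seq_len(nrow(mstf))) {
  tmp.base <- crnt.file[mstf$srb[irow]:mstf$erb[irow], ]
  n_base[irow] <- nrow(tmp.base)
}

# Option 2: walk over the start/end pairs directly with Map().
n_audio <- unlist(Map(function(s, e) nrow(crnt.file[s:e, ]),
                      mstf$sra, mstf$era))
```

Either form visits each master-file row once; which one reads better is largely a matter of taste.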

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062






[R] Summarizing data based on Date

2015-06-08 Thread Shivi82

Hi All,

I have a data set with 11000 rows & 19 columns. 
I need to summarize the data on 2 columns: Date & Weight.
A snapshot:

Date        CHG_WT
13/03/2015  0
31/03/2015  0
15/03/2015  0
17/03/2015  770
17/03/2015  3,730
11/3/2015   70
11/3/2015   10
19/03/2015  500

I need to summarize this data as a day-wise trend of weight. I have tried
splitting and truncating the date, and I have seen several options on the web
(the zoo package, ISO week, etc.), but I am not sure how to arrive at this
analysis.
Could you experts please suggest how to achieve this?
Thanks, Shivi
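
One possible approach, sketched in base R with the snapshot values above. It assumes that the dates are stored as day/month/year text, that CHG_WT may contain thousands separators, that the two snapshot columns pair up row by row, and that "day-wise trend" means the total weight per calendar day; the object names dat and daily are made up.

```r
# Snapshot values from the message, paired row by row (an assumption).
dat <- data.frame(
  Date = c("13/03/2015", "31/03/2015", "15/03/2015", "17/03/2015",
           "17/03/2015", "11/3/2015", "11/3/2015", "19/03/2015"),
  CHG_WT = c("0", "0", "0", "770", "3,730", "70", "10", "500"),
  stringsAsFactors = FALSE
)

# Convert the text columns: day/month/year dates, and weights without commas.
dat$Date <- as.Date(dat$Date, format = "%d/%m/%Y")
dat$CHG_WT <- as.numeric(gsub(",", "", dat$CHG_WT))

# Total weight per day, one row per distinct date, sorted by date.
daily <- aggregate(CHG_WT ~ Date, data = dat, FUN = sum)
```

The same aggregate() call with format(Date, "%Y-%U") or a similar grouping column would give weekly totals instead.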













--
View this message in context: 
http://r.789695.n4.nabble.com/Summarizing-data-based-on-Date-tp4708328.html
Sent from the R help mailing list archive at Nabble.com.


[R] Mean error message missing

2015-06-08 Thread Christian Brandstätter


Dear list,

I found an odd behavior of the mean function; it is allowed to do 
something that you probably shouldn't:
If you calculate mean() of a sequence of numbers (without declaring them 
as vector), mean() then just computes mean() of the first element. Is 
there a reason why there is no warning, like in sd for example?


Example code:
mean(1,2,3,4)
sd(1,2,3,4)

Best regards
Christian


[R] Text Mining - Remove punctuation not removing quotes and dashes

2015-06-08 Thread Anindya Sankar Dey

Hi,

I have been doing some text mining. I created the DTM matrix using the
following steps.

corpus1 <- VCorpus(VectorSource(resume1$Dat1))

corpus1 <- tm_map(corpus1, content_transformer(tolower))

dtm <- DocumentTermMatrix(corpus1,
                          control = list(removePunctuation = TRUE,
                                         removeNumbers = TRUE,
                                         removeSparseTerms = TRUE,
                                         stopwords = TRUE))


After running all of this I am still getting terms with dashes and quotation
marks attached (for example -quotation, fun, model, etc.).

What can I do about it? I do not need these dashes and extra quotation marks.

-- 
Anindya Sankar Dey
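
One possible cause, offered as an assumption because the resume text is not shown: typographic ("curly") quotes and en/em dashes may survive removePunctuation when they are not treated as punctuation. A minimal base-R sketch that strips them explicitly; the helper name strip_fancy_punct and the sample string are hypothetical:

```r
# Hypothetical helper: remove typographic quotes/dashes, then other punctuation.
strip_fancy_punct <- function(x) {
  # curly single/double quotes, en dash, em dash -> space
  x <- gsub("[\u2018\u2019\u201C\u201D\u2013\u2014]", " ", x)
  # remaining punctuation -> space
  x <- gsub("[[:punct:]]+", " ", x)
  # collapse runs of whitespace and trim both ends
  trimws(gsub("\\s+", " ", x))
}

txt <- "\u201Cquotation\u201D \u2013 fun, \u2018model\u2019"
clean <- strip_fancy_punct(txt)
print(clean)
```

With tm, such a function could be applied before building the matrix, for example via tm_map(corpus1, content_transformer(strip_fancy_punct)).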


Re: [R] Mean error message missing

2015-06-08 Thread Achim Zeileis


On Mon, 8 Jun 2015, Christian Brandstätter wrote:


 Dear list,
 
 I found an odd behavior of the mean function; it is allowed to do something 
 that you probably shouldn't:
 If you calculate mean() of a sequence of numbers (without declaring them as 
 vector), mean() then just computes mean() of the first element. Is there a 
 reason why there is no warning, like in sd for example?


mean() - unlike sd() - is a generic function that has a '...' argument 
that is passed on to its methods. The default method which is called in 
your example also has a '...' argument (because the generic has it) but 
doesn't use it.
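
A small sketch of the argument matching described above: in mean(1,2,3,4) only the first value is taken as x, and the remaining values are matched positionally to trim, na.rm and '...'. The variable names are illustrative only.

```r
# Only the first argument is x; 2 -> trim, 3 -> na.rm, 4 -> '...'
m_wrong <- mean(1, 2, 3, 4)

# Wrapping the values in c() passes them as one vector
m_right <- mean(c(1, 2, 3, 4))

# sd() has no '...' argument, so the extra values raise an error
sd_msg <- tryCatch(sd(1, 2, 3, 4), error = function(e) conditionMessage(e))

print(m_wrong)   # 1
print(m_right)   # 2.5
print(sd_msg)
```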



 Example code:
 mean(1,2,3,4)
 sd(1,2,3,4)
 
 Best regards
 Christian


Re: [R] problems editing R console

2015-06-08 Thread Rosa Oliveira

Thanks to all of you ;)

I think I got it.

I was saving the workspace and loading it, but after that I wasn’t calling my 
data ;)


Really naive of me.

Thanks very much.

best
RO

Atenciosamente,
Rosa Oliveira

-- 



Rosa Celeste dos Santos Oliveira, 

E-mail: rosit...@gmail.com
Tlm: +351 939355143 
Linkedin: https://pt.linkedin.com/in/rosacsoliveira

Many admire, few know
Hippocrates

 On 08 Jun 2015, at 01:30, Mark Sharp msh...@txbiomed.org wrote:
 
 Rosa,
 
 See save() and load() functions for background. However, I suspect you will 
 want to do something as described in the article in this link 
 http://www.fromthebottomoftheheap.net/2012/04/01/saving-and-loading-r-objects/
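
A minimal sketch of the save-and-reuse idea, assuming the simulated values are kept in a list and written with saveRDS() instead of as text. The file name, the object names and the small sample size are hypothetical:

```r
set.seed(1000)
n.sample <- 10                      # small size, for illustration only
x    <- rnorm(n.sample, 5, 2)
erro <- rnorm(n.sample, 0, 0.01)

# Store the raw ingredients, not only the final x.erro
sim <- list(x = x, erro = erro, x.erro = x + erro)
rds_file <- file.path(tempdir(), "setting_DPx2.rds")   # hypothetical name
saveRDS(sim, rds_file)

# Later (even in a new R session): read back and apply a correction
old <- readRDS(rds_file)
x.erro.corrected <- 1.4 * old$x + old$erro
```

Because readRDS() returns the object itself, each setting can live in its own file and be recomputed or summarised differently without rerunning the simulation.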
 
 
 Mark
 
 
 
 R. Mark Sharp, Ph.D.
 Director of Primate Records Database
 Southwest National Primate Research Center
 Texas Biomedical Research Institute
 P.O. Box 760549
 San Antonio, TX 78245-0549
 Telephone: (210)258-9476
 e-mail: msh...@txbiomed.org
 
 
 
 On Jun 7, 2015, at 5:58 PM, Rosa Oliveira rosit...@gmail.com wrote:
 
 Dear Mark,
 
 
 I’ll try to explain better.
 
 Imagine I write:
 
 library(foreign)
 library(nlme)
 
 set.seed(1000)
 n.sample <- 1 # sample size
 M <- 5
 DP_x <- 2
 x <- rnorm(n.sample, M, DP_x)
 p <- pnorm(-3 + x)
 y <- rbinom(n.sample, 1, p)
 dp_erro <- 0.01
 erro <- rnorm(n.sample, 0, dp_erro)
 x.erro <- x + erro
 
 but inside a function, with 2000 simulations. 
 I save my “output” and get x.erro in a .txt file (a plain text file).
 
 I do another setting with DP_x=3 and save, and so on.
 
 For some reason I realize I’ve done my simulation the wrong way and I have 
 to apply a correction, for example:
 
 x.erro = 1.4*x + erro, i.e. in truth I could reuse my first x and erro values 
 in each setting, but as they are in a .txt file I can’t use them any more. Is 
 there a way to save the results in a format that lets me reuse the values, so 
 that I can just apply my corrections without running the 2000 simulations 
 for each setting again?
 
 My problem is that the function I use takes 3 days to run, for just 500 
 simulations :(
 
 Best,
 RO
 
 
 
 On 07 Jun 2015, at 23:03, Mark Sharp msh...@txbiomed.org wrote:
 
 I cannot understand your request as stated. Can you provide a small example?
 
 Mark
 
 R. Mark Sharp, Ph.D.
 msh...@txbiomed.org
 
 On Jun 7, 2015, at 2:49 PM, Rosa Oliveira rosit...@gmail.com wrote:
 
 Dear all,
 
 I’m running simulations in R, and as my code is changed and improved I 
 sometimes need to go back and work on finished simulations, i.e.:
 
 After one simulation is over I need to run another setting.
 The problem is that I then need to get back to the previous result.
 
 When I save the result it is saved as txt, so I can’t edit that result any 
 more.
 
 Imagine I run a setting and save the mean; then, in another setting, the 
 mean has problems, so I have to ask for the median instead.
 
 As I need the same statistics for all settings, at the moment I have to 
 run my first setting again.
 
 My advisor told me that I could save the results another way so that I can 
 “edit” my first result. Is that possible?
 
 I tried to save my workspace, … but afterwards I don’t know what to do 
 with it.
 
 Can you please help me?
 I know it is a naive question, but I have to go through this every 3 days 
 (each simulation takes that long), and my work is being delayed :(
 
 
 Best,
 RO
 
 
 

Re: [R] source code for dbeta

2015-06-08 Thread Varun Sinha

Hi,

Thanks a lot. I downloaded the tar.gz file and I found the C code.

I would really appreciate it if you could field another question:
I have to use SQL, and I have to perform various statistical calculations,
such as integrate, dbeta etc. SQL does not have these functions, and they are
very difficult to code. Would it be possible to take the C code, compile it
and deploy it in SQL? Is that feasible, or even permitted?

Thanks once again, I'm very grateful.


On Mon, Jun 8, 2015 at 2:06 AM, Duncan Murdoch murdoch.dun...@gmail.com
wrote:

 On 07/06/2015 6:11 PM, Mark Sharp wrote:
  Varun,
 
  If you type dbeta at the command line you get the R source, which in
 this case tells you that the code is calling a compiled source. This is
 indicated by the line <bytecode: 0x7fc3bb1b84e0>

 No, that says that the R code (what is shown) is compiled.  What
 indicates that this is C code is the use of .Call.  The C_dbeta and
 C_dnbeta objects are NativeSymbolInfo objects that hold the pointers
 to the C entry points.

 Since it is in a base package (stats), the source is in the R sources,
 somewhere in  https://svn.r-project.org/R/trunk/src/library/stats/src.
 You can search through those files for the dbeta or dnbeta functions.
 The C_ prefix is conventionally used in the R sources to indicate that
 it is C code; generally you replace it with do_ in the actual C code.
  This particular function is actually not really in the package source;
 it's in the main part of the R sources, in file

 https://svn.r-project.org/R/trunk/src/nmath/dbeta.c

 (though it takes a few steps to get there, starting in the stats package
 function do_dbeta).

 Duncan Murdoch
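
The points above can be checked from the R prompt. This is only a sketch: the exact printed class of the C_dbeta object may vary between R versions, and the numeric check uses the closed-form Beta(2, 3) density, which is not part of the thread.

```r
# The R-level function is a thin wrapper around .Call()
print(body(stats::dbeta))

# C_dbeta lives in the stats namespace and describes the native entry point
sym <- get("C_dbeta", envir = asNamespace("stats"))
print(class(sym))

# Quick numeric check: x^(a-1) * (1-x)^(b-1) / B(a, b) at x = 0.5, a = 2, b = 3
val <- dbeta(0.5, 2, 3)
print(val)
```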
 
  See the following.
  dbeta
  function (x, shape1, shape2, ncp = 0, log = FALSE)
  {
      if (missing(ncp))
          .Call(C_dbeta, x, shape1, shape2, log)
      else .Call(C_dnbeta, x, shape1, shape2, ncp, log)
  }
  <bytecode: 0x7fc3bb1b84e0>
  <environment: namespace:stats>
 
  Compiled code in a package
 
  If you want to view compiled code in a package, you will need to
 download/unpack the package source. The installed binaries are not
 sufficient. A package's source code is available from the same CRAN (or
 CRAN compatible) repository that the package was originally installed from.
 The download.packages() function can get the package source for you.
 
  Extracted from
 http://stackoverflow.com/questions/19226816/how-can-i-view-the-source-code-for-a-function
 
  Mark
 
 
  R. Mark Sharp, Ph.D.
  msh...@txbiomed.org
 
 
  On Jun 7, 2015, at 4:31 AM, Varun Sinha sinha.varun...@gmail.com
 wrote:
 
  Hi,
 
  I am trying to find the source code for dbeta function.
 
  I tried edit(dbeta) and this is what I got:
  edit(dbeta)
  function (x, shape1, shape2, ncp = 0, log = FALSE)
  {
 if (missing(ncp))
 .Call(C_dbeta, x, shape1, shape2, log)
 else .Call(C_dnbeta, x, shape1, shape2, ncp, log)
  }
  <environment: namespace:stats>
 
  It looks like it is calling C_dbeta, but I'm not sure. If it does,
  how do I find its source code?
 
  Thank you!
  Varun
 

Re: [R] Summarizing data based on Date

2015-06-08 Thread PIKAL Petr

Hi

Is your Date really a Date or is it character? What is the result of

str(Date)

If you want to get summaries by date you can use

?aggregate

However, in this case I strongly recommend showing us your data with

dput(yourdata)

and explaining, using that example, what summary you want.

I can be completely wrong but maybe

aggregate(CHG_WT, list(format(Date, "%d")), sum)

can get you required values.
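
A minimal sketch of the aggregate() idea, using the snapshot from the question. It assumes (this is not confirmed at this point in the thread) that both columns were read as factors, with a thousands separator in CHG_WT; the data frame name test is illustrative.

```r
# Snapshot from the question; the factor storage is an assumption
test <- data.frame(
  Date   = c("13/03/2015", "31/03/2015", "15/03/2015", "17/03/2015",
             "17/03/2015", "11/3/2015", "11/3/2015", "19/03/2015"),
  CHG_WT = c("0", "0", "0", "770", "3,730", "70", "10", "500"),
  stringsAsFactors = TRUE
)

# Factor -> character -> Date (day/month/year)
test$Date <- as.Date(as.character(test$Date), format = "%d/%m/%Y")
# Factor -> character -> numeric, dropping the thousands separator
test$CHG_WT <- as.numeric(gsub(",", "", as.character(test$CHG_WT)))

# One row per day with the summed weight
daily <- aggregate(CHG_WT ~ Date, data = test, FUN = sum)
print(daily)
```

Grouping on the full Date rather than on format(Date, "%d") keeps days from different months apart; which of the two is wanted depends on the data.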

Cheers
Petr

 -Original Message-
 From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Shivi82
 Sent: Monday, June 08, 2015 10:08 AM
 To: r-help@r-project.org
 Subject: [R] Summarizing data based on Date

 Hi All,

 I have a data set with 11000 rows & 19 columns.
 I have 2 columns on which I need to summarize the data: Date & Weight.
 Snapshot is:

 Date        CHG_WT
 13/03/2015  0
 31/03/2015  0
 15/03/2015  0
 17/03/2015  770
 17/03/2015  3,730
 11/3/2015   70
 11/3/2015   10
 19/03/2015  500
 Now I need to summarize this data as a day-wise trend of weight. I have
 tried splitting and truncating the date and have seen several options on
 the web (the zoo package, ISO weeks, etc.), but I am not sure how to get to
 this analysis.
 Could you experts please suggest how to achieve this?
 Thanks, Shivi



[R] R to HTML problem

2015-06-08 Thread Pijush Das

Dear Jim Lemon,


Thank you very much Jim for your help.


Regards,
Pijush


Re: [R] mismatch between match and unique causing ecdf (well, approxfun) to fail

2015-06-08 Thread Meyners, Michael

Aehm, adding on this: I incorrectly *assumed* without testing that rounding 
would help; it doesn't:

ecdf(round(test2,0))# a rounding that is way too rough for my application...
#Error in xy.coords(x, y) : 'x' and 'y' lengths differ

Digging deeper: The initially mentioned call to unique() is not very helpful, 
as test2 is a data frame, so I get what I deserve, an unchanged data frame with 
1 row. Still, the issue remains and can even be simplified further:

 ecdf(data.frame(a=3, b=4))
Empirical CDF 
Call: ecdf(data.frame(a = 3, b = 4))
 x[1:2] =  3,  4

works ok, but

 ecdf(data.frame(a=3, b=3))
Error in xy.coords(x, y) : 'x' and 'y' lengths differ

doesn't (same for a=b=1 or 2, so likely the same for any a=b). Instead, 

 ecdf(c(a=3, b=3))
Empirical CDF 
Call: ecdf(c(a = 3, b = 3))
 x[1:1] =  3

does the trick. From ?ecdf, I get that x should be a numeric vector, so this is 
apparently my misuse of the function: I applied it to a row of a data frame 
(i.e. a data frame with one row). That worked fine in all my other (dozens of) 
cases, but not in this particular one. A simple unlist() helps:

 ecdf(unlist(data.frame(a=3, b=3)))
Empirical CDF 
Call: ecdf(unlist(data.frame(a = 3, b = 3)))
 x[1:1] =  3

Yet, I'm even more confused than before: in my other data, there were also 
duplicated values in the vector (1-row-data frame), and it never caused any 
issue. For this particular example, it does. I must be missing something 
fundamental...
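
One reading of the behaviour, not confirmed in the thread: unique() on a data frame removes duplicated rows, so a one-row data frame keeps both columns, whereas match() compares the columns as list elements and finds a single distinct value. The two results then differ in length. A short sketch with the simplified example from above:

```r
df <- data.frame(a = 3, b = 3)

# unique() on a data frame drops duplicated ROWS, so both columns survive
u <- unique(df)
print(dim(u))

# ecdf() documents a numeric vector; unlist() turns the one-row data frame
# into one, after which duplicated values are handled as expected
Fn <- ecdf(unlist(df))
print(knots(Fn))
print(Fn(c(2.9, 3)))
```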
 
Michael


Re: [R] Summarizing data based on Date

2015-06-08 Thread Shivi82

Hi Petr,
Thanks for the explanation below. 
I tried the code you supplied however it seems as my date is a factor hence
it is not working.
The error I got from the code was :

Error: unexpected symbol in:
final<-aggregate(test$CHG_WT,list(format(test$CR_DT,"%d"),sum)
final

str(test$CR_DT)- gives Factor with 31 levels 
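
"unexpected symbol" is a syntax error, so it is raised before R looks at the data; the factor storage is a separate matter. The sketch below only parses the two versions of the call and never evaluates them, so it runs without the data. The string names bad and good are illustrative.

```r
# The call as posted: one ")" is missing after format(..., "%d")
bad  <- 'final<-aggregate(test$CHG_WT,list(format(test$CR_DT,"%d"),sum)\nfinal'
good <- 'final<-aggregate(test$CHG_WT,list(format(test$CR_DT,"%d")),sum)\nfinal'

# parse() only checks syntax, so no data are needed
bad_msg <- tryCatch({ parse(text = bad); "ok" },
                    error = function(e) conditionMessage(e))
good_ok <- tryCatch({ parse(text = good); TRUE },
                    error = function(e) FALSE)

cat(bad_msg, "\n")
print(good_ok)
```

Once the parenthesis is balanced, the factor still needs converting, for example with as.Date(as.character(test$CR_DT), format = "%d/%m/%Y"), before format(..., "%d") can extract the day.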




[R] mismatch between match and unique causing ecdf (well, approxfun) to fail

2015-06-08 Thread Meyners, Michael

All,

I encountered the following issue with ecdf which was originally on a vector of 
length 10,000, but I have been able to reduce it to a minimal reproducible 
example (just to avoid questions why I'd want to do this for a vector of length 
2...):

test2 = structure(list(X817 = 3.39824670255344, X4789 = 3.39824670255344), 
.Names = c("X817", "X4789"), row.names = 74L, class = "data.frame")
ecdf(test2) 

# Error in xy.coords(x, y) : 'x' and 'y' lengths differ

In an attempt to track this down, it turns out that 

unique(test2)
#       X817    X4789
#74 3.398247 3.398247

while 

match(test2, unique(test2))
#[1] 1 1

matches both values to the first one. This causes a hiccup in the call to ecdf, 
as this uses (an equivalent to) a call to approxfun with x = test2 and y = 
cumsum(tabulate(match(test2, unique(test2)))), the latter now containing one 
entry less than the former, so xy.coords fails.

I understand that the issue should be somehow related to FAQ 7.31, but I would 
have hoped that unique and match would use the same precision, so that either 
both or neither would consider the two values identical, rather than match 
treating them as equal while unique does not. 

Last but not least, it doesn't really cause an issue on my end (other than 
breaking my code and hence out of a loop at first place...); rounding will help 
w/o noteworthy changes to the outcome, so no need to propose a workaround :-) 
I'd rather like to raise the issue and learn whether there is a purpose for 
this behavior, and/or whether there is a generic fix to this, or whether I am 
completely missing something.

Version info (under Windows 7): 
R version 3.2.0 (2015-04-16) -- "Full of Ingredients"
Platform: x86_64-w64-mingw32/x64 (64-bit)

Cheers, Michael 


Re: [R-es] using Selenium for web scraping

2015-06-08 Thread Carlos Ortega

Hi,

I don't know whether this Stack Overflow answer can help you:

http://stackoverflow.com/questions/26938118/check-for-dialog-box-in-rselenium

Regards,
Carlos Ortega
www.qualityexcellence.es

On 8 June 2015, 9:09, José Luis Cañadas Reche canadasre...@gmail.com
 wrote:

  Thanks Javier and Carlos.

  The thing is that relenium gives me an error when starting Firefox and
 closes it. On the package's GitHub page
 https://github.com/LluisRamon/relenium they say that it is being
 discontinued because of the appearance of another package, RSelenium. And
 this is where I get lost: I have not worked out how to access the values of
 a combo box using RSelenium.

 Regards.

 On 05/06/15 at 15:40, javier.ruben.marcu...@gmail.com wrote:

  Dear José Luis Cañadas

  Personally, I found Gregorio's work that Carlos cites very helpful. The only
 thing is that RSelenium behaves somewhat strangely. My problems are of two
 kinds: the first is examples that no longer work (something changed), but
 the important one concerns my own work: after hours of web scraping it
 gives an error for some reason. It happens while looping over all the
 options of a combo box (there must be about 200); halfway through it reports
 an error about finding the HTML id it has to loop over (even though it has
 already done so several times). I never managed to solve this error. If you
 do not need to fill in HTML forms, rvest is usually faster.

  Javier Rubén Marcuzzi
 Dairy Industries Technician
 Veterinarian

   *From:* Carlos Ortega c...@qualityexcellence.es
 *Sent:* Friday, 5 June 2015 08:49 a.m.
 *To:* jose luis cañadas canadasre...@gmail.com
 *CC:* R-help-es@r-project.org r-help-es@r-project.org

  Hi José Luis,

 Besides what he put on his blog, Gregorio gave a very clear presentation
 on how to use RSelenium at the Madrid R user group. The video of his talk
 is here:

 https://vimeo.com/96023824

 In case you find the key there.

 Regards,
 Carlos Ortega
 www.qualityexcellence.es


 On 5 June 2015, 13:28, José Luis Cañadas Reche 
 canadasre...@gmail.com wrote:

  Hi.
 
  I have to download several tables from the INE and I need to interact with
  the browser. I saw the fantastic post written by Gregorio Serrano (may the
  earth rest lightly on him) at
 
 http://www.grserrano.net/wp/2014/01/relenium-el-siguiente-nivel-de-web-scraping-con-r/
  and I am trying to reproduce it to learn how relenium works.
 
  But relenium gives me an error after
 
  if(!require(relenium)) install.packages("relenium")
 
  precios <- "http://www.ine.es/jaxi/tabla.do?path=/t38/bme2/t07/a081/l0/&file=1300010.px&type=pcaxis&L=0"
 
  firefox <- firefoxClass$new()
 
  Error in exceptionTable[, 1] : subscript out of bounds
 
  So I have started tinkering with RSelenium, and I manage to select the
  combobox element, but I don't know how to get the values it shows or how
  to select them. Any ideas?
 
 
 
  library(RSelenium)
  checkForServer()
  startServer()
 
  remDr <- remoteDriver(remoteServerAddr = "localhost"
   , port = 
   , browserName = "firefox"
  )
 
  remDr$open()
 
 
  remDr$navigate(precios)
 
  # find by id
  webElem1 <- remDr$findElement(using = 'id', value = 'cri1')
 

Re: [R] Mean error message missing

2015-06-08 Thread Christian Brandstätter

Thank you for the explanation.
But if you take for instance plot(), which is another generic function, 
it does not behave like that:
plot(1,2,3,4) fails; only plot(1,2) is accepted.


From the R help page (Usage):
## Default S3 method:
mean(x, trim = 0, na.rm = FALSE, ...)

What is puzzling, is that apparently na.rm (and trim, which is indicated in the 
help) is accepting numeric values.
mean(c(1,NA,10),10,TRUE)
mean(c(1,NA,10),10,FALSE)

This should give at least a warning in my opinion.

mean(c(1,NA,10),10,200)
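
A short sketch of why no error appears: the second and third positional arguments are matched to trim and na.rm, and ?mean states that trim values outside its range are taken as the nearest endpoint, so a trim of 10 acts like 0.5 and yields the median. The vector below is a different, illustrative one, chosen so that mean and median differ.

```r
x <- c(1, NA, 10, 100)

# Positional call: 10 is matched to 'trim', TRUE to 'na.rm'
a <- mean(x, 10, TRUE)

# Naming the arguments makes the intent explicit
b <- mean(x, na.rm = TRUE)              # ordinary mean of 1, 10 and 100
d <- mean(x, trim = 0.5, na.rm = TRUE)  # a trim of 0.5 gives the median

print(c(a, b, d))
```

Here a and d agree (the median), while b is the mean most callers would expect, which is a reason to name trim and na.rm explicitly.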




[R] Cannot Sum with DDPLY

2015-06-08 Thread Shivi82

Hi All,
Kindly see the below code I have used:
maxorder <- ddply(test, ~ ORIGIN, summarize, Weight = sum(CHG_WT))

Here I have written the code to summarize values by origin and total
weight; however, I am getting the error below:
Error: ‘sum’ not meaningful for factors

Please advise. I need the CHG_WT total for each state in the Origin column.
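
The error suggests that CHG_WT was read as a factor, possibly because of entries such as "3,730" with a thousands separator (an assumption; the data are not shown). A minimal sketch with hypothetical ORIGIN codes, using base aggregate() so that it runs without plyr:

```r
# Hypothetical data: CHG_WT stored as a factor because of the "3,730" entry
test <- data.frame(
  ORIGIN = c("DL", "DL", "MH", "MH", "KA"),
  CHG_WT = factor(c("770", "3,730", "70", "10", "500"))
)

# Convert via character; as.numeric() on the factor itself would return level codes
test$CHG_WT <- as.numeric(gsub(",", "", as.character(test$CHG_WT)))

# Total weight per origin
totals <- aggregate(CHG_WT ~ ORIGIN, data = test, FUN = sum)
print(totals)
```

After the conversion the ddply() call from the question should no longer raise this error, assuming plyr is loaded.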





Re: [R] source code for dbeta

2015-06-08 Thread Duncan Murdoch

On 07/06/2015 11:05 PM, Varun Sinha wrote:
 Hi,
 
 Thanks a lot. I downloaded the tar.gz file and I found the C code.
 
 I would really appreciate it if you could field another question:
 I have to use sql, and I have to perform various statistical
 calculations - like integrate, dbeta etc. Sql does not have these
 functions, plus they are very difficult to code. Would it be possible to
 use the C code, compile it and deploy it in sql? Is that feasible, or
 even permitted?

It is permitted for local use only without conditions.  If you want to
deploy the application, your application must be licensed under the GPL.

As to the practicality:  see the Writing R Extensions manual, section
6.16, which describes how to link many math functions (I forget if dbeta
is included) into your own C code.  Linking those to your database
system will strongly depend on which database system you're using, and I
think for all of them the question would be off topic here.  You need to
ask on their help forum.

Duncan Murdoch


 
 Thanks once again, I'm very grateful.
 
 
 On Mon, Jun 8, 2015 at 2:06 AM, Duncan Murdoch murdoch.dun...@gmail.com
 mailto:murdoch.dun...@gmail.com wrote:
 
 On 07/06/2015 6:11 PM, Mark Sharp wrote:
  Varun,
 
  If you type dbeta at the command line you get the R source, which in 
 this case tells you that the code is calling a compiled source. This is 
 indicated by the line bytecode: 0x7fc3bb1b84e0
 
 No, that says that the R code (what is shown) is compiled.  What
 indicates that this is C code is the use of .Call.  The C_dbeta and
 C_dnbeta objects are NativeSymbolInfo objects that hold the pointers
 to the C entry points.
 
 Since it is in a base package (stats), the source is in the R sources,
 somewhere in  https://svn.r-project.org/R/trunk/src/library/stats/src.
 You can search through those files for the dbeta or dnbeta functions.
 The C_ prefix is conventionally used in the R sources to indicate that
 it is C code; generally you replace it with do_ in the actual C code.
  This particular function is actually not really in the package source;
 it's in the main part of the R sources, in file
 
 https://svn.r-project.org/R/trunk/src/nmath/dbeta.c
 
 (though it takes a few steps to get there, starting in the stats package
 function do_dbeta).
 
 Duncan Murdoch
 
  See the following.
  dbeta
  function (x, shape1, shape2, ncp = 0, log = FALSE)
  {
  if (missing(ncp))
  .Call(C_dbeta, x, shape1, shape2, log)
  else .Call(C_dnbeta, x, shape1, shape2, ncp, log)
  }
  bytecode: 0x7fc3bb1b84e0
  environment: namespace:stats
 
  Compiled code in a package
 
  If you want to view compiled code in a package, you will need to
 download/unpack the package source. The installed binaries are not
 sufficient. A package's source code is available from the same CRAN
 (or CRAN compatible) repository that the package was originally
 installed from. The download.packages() function can get the package
 source for you.
 
  Extracted from
 
 http://stackoverflow.com/questions/19226816/how-can-i-view-the-source-code-for-a-function
 
  Mark
 
 
  R. Mark Sharp, Ph.D.
  msh...@txbiomed.org
 
 
  On Jun 7, 2015, at 4:31 AM, Varun Sinha sinha.varun...@gmail.com
 wrote:
 
  Hi,
 
  I am trying to find the source code for dbeta function.
 
  I tried edit(dbeta) and this is what I got:
  > edit(dbeta)
  function (x, shape1, shape2, ncp = 0, log = FALSE)
  {
      if (missing(ncp))
          .Call(C_dbeta, x, shape1, shape2, log)
      else .Call(C_dnbeta, x, shape1, shape2, ncp, log)
  }
  <environment: namespace:stats>
 
  It looks like it is calling C_dbeta, but I'm not sure. If it does,
  how do I find its source code?
 
  Thank you!
  Varun
 
   [[alternative HTML version deleted]]
 
  __
  R-help@r-project.org mailto:R-help@r-project.org mailing list
 -- To UNSUBSCRIBE and more, see
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.
 

Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread John Sorkin

David, I appreciate your suggestion, but it won't work for me. I need the 
space to be replaced at the time the data are read, not afterward. My 
variable names contain periods that I want to keep; if I use your suggestion I 
will replace the periods inserted when the data are read as well as the periods 
that I want to keep.
Thank you,
John 



John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 

 David L Carlson dcarl...@tamu.edu 06/08/15 10:21 AM 
You can use gsub() to change the names:

> dat <- data.frame("Var 1"=rnorm(5, 10), "Var 2"=rnorm(5, 15))
> dat
      Var.1    Var.2
1  9.627122 14.15376
2 10.741617 16.92937
3  8.492926 15.23767
4 12.226146 15.19834
5  8.829982 14.46957
> names(dat) <- gsub("\\.", "_", names(dat))
> dat
      Var_1    Var_2
1  9.627122 14.15376
2 10.741617 16.92937
3  8.492926 15.23767
4 12.226146 15.19834
5  8.829982 14.46957

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
Sent: Monday, June 8, 2015 9:16 AM
Cc: r-help@r-project.org
Subject: [R] Blank spaces are replaced by period in read.csv, I want to replace 
blacks with an underline

I am reading a csv file. The column headers have spaces in them. The spaces are 
replaced by a period. I want to replace the space by another character (e.g. 
the underline) rather than the period. Can someone tell me how to accomplish 
this? Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 


Confidentiality Statement:
This email message, including any attachments, is for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
Any unauthorized use, disclosure or distribution is prohibited. If you are not 
the intended recipient, please contact the sender by reply email and destroy 
all copies of the original message. 

Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread Mark Sharp

John,

I like using stringr or stringi for this type of thing. stringi is written in C 
and faster so I now typically use it. You can also use base functions. The main 
trick is the handy names() function.

> example <- data.frame("Col 1 A" = 1:3, "Col 1 B" = letters[1:3])
> example
  Col.1.A Col.1.B
1       1       a
2       2       b
3       3       c
> library(stringi)
> names(example) <- stri_replace_all_fixed(names(example), ".", "_")
> example
  Col_1_A Col_1_B
1       1       a
2       2       b
3       3       c

R. Mark Sharp, Ph.D.
Director of Primate Records Database
Southwest National Primate Research Center
Texas Biomedical Research Institute
P.O. Box 760549
San Antonio, TX 78245-0549
Telephone: (210)258-9476
e-mail: msh...@txbiomed.org







 On Jun 8, 2015, at 9:15 AM, John Sorkin jsor...@grecc.umaryland.edu wrote:
 
 I am reading a csv file. The column headers have spaces in them. The spaces 
 are replaced by a period. I want to replace the space by another character 
 (e.g. the underline) rather than the period. Can someone tell me how to 
 accomplish this? Thank you,
 John
 
 John David Sorkin M.D., Ph.D.
 Professor of Medicine
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology and 
 Geriatric Medicine
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing) 
 
 

Re: [R-es] Can a column of a data.table be a data.frame?

2015-06-08 Thread MªLuz Morales

Hello,

yes, I know that you work with a data.table using [], but in this line I am
creating the data.table; that is, data.table is the function I am calling and
PesosParam is the data.table I am creating.

Regards,
MªLuz Morales

On 8 June 2015, at 15:17, Carlos J. Gil Bellosta c...@datanalytics.com
wrote:

 Hello, how are you?

 data.table works with square brackets, not parentheses. Have you read the
 vignette/tutorial?

 Regards,

 Carlos J. Gil Bellosta
 http://www.datanaytics.com

 On 8 June 2015, at 15:14, MªLuz Morales mlzm...@gmail.com
 wrote:
  Hello,
 
  I want to build a data.table in which one column (Parametros) holds
  character strings and another holds the result of the information.gain
  function, which returns a data.frame. This is the code I used, but it
  gives an error:
 
  PesosParam <- data.table(,.(Parametros, Peso:=
  information.gain(In.hospital_death~., ParamCol)))
 
  Is it possible to do what I describe? Or do I have to convert the
  data.frame to a data.table explicitly? I have also tried that, with this
  code:
 
  # Conversion from data.frame to data.table
  setattr(PesosParam, "class", c("data.table", "data.frame"))
  data.table:::settruelength(PesosParam, 0L)
  invisible(alloc.col(PesosParam))
 
  but it cannot find settruelength
 
  Thanks
  Regards,
  MªLuz Morales
 

Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread David L Carlson

You can use gsub() to change the names:

> dat <- data.frame("Var 1"=rnorm(5, 10), "Var 2"=rnorm(5, 15))
> dat
      Var.1    Var.2
1  9.627122 14.15376
2 10.741617 16.92937
3  8.492926 15.23767
4 12.226146 15.19834
5  8.829982 14.46957
> names(dat) <- gsub("\\.", "_", names(dat))
> dat
      Var_1    Var_2
1  9.627122 14.15376
2 10.741617 16.92937
3  8.492926 15.23767
4 12.226146 15.19834
5  8.829982 14.46957

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
Sent: Monday, June 8, 2015 9:16 AM
Cc: r-help@r-project.org
Subject: [R] Blank spaces are replaced by period in read.csv, I want to replace 
blacks with an underline

I am reading a csv file. The column headers have spaces in them. The spaces are 
replaced by a period. I want to replace the space by another character (e.g. 
the underline) rather than the period. Can someone tell me how to accomplish 
this? Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 



Re: [R] Summarizing data based on Date

2015-06-08 Thread PIKAL Petr

Hi

 -Original Message-
 From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Shivi82
 Sent: Monday, June 08, 2015 11:23 AM
 To: r-help@r-project.org
 Subject: Re: [R] Summarizing data based on Date

 Hi Petr,
 Thanks for the explanation below.
 I tried the code you supplied however it seems as my date is a factor
 hence it is not working.

So you need to change your factor to date.

see
?as.Date

as.Date(as.factor(Sys.Date()))

Cheers
Petr
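
A minimal sketch of the conversion suggested above, using made-up data. The column names CR_DT and CHG_WT come from the thread; the values and the assumption that the dates are stored as "YYYY-MM-DD" strings are illustrative only, so the format argument would need adjusting for other layouts.

```r
# Toy data: the date column arrives as a factor, as in the question.
test <- data.frame(
  CR_DT  = factor(c("2015-06-01", "2015-06-01", "2015-06-02")),
  CHG_WT = c(10, 5, 7)
)

# Convert factor -> character -> Date (the format string is an assumption).
test$CR_DT <- as.Date(as.character(test$CR_DT), format = "%Y-%m-%d")

# Now format() can extract the day of the month, and aggregate() sums per day.
final <- aggregate(test$CHG_WT, list(day = format(test$CR_DT, "%d")), sum)
print(final)
```

Note that the call in the quoted error also lacks a closing parenthesis after the format() call, which on its own is enough to trigger "unexpected symbol".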

 The error I got from the code was :

 Error: unexpected symbol in:
 final<-aggregate(test$CHG_WT,list(format(test$CR_DT,"%d"),sum)
 final

 str(test$CR_DT)- gives Factor with 31 levels

 --
 View this message in context:
 http://r.789695.n4.nabble.com/Summarizing-data-based-on-Date-tp4708328p4708333.html
 Sent from the R help mailing list archive at Nabble.com.

This e-mail and any documents attached to it may be confidential and are 
intended only for its intended recipients.
If you received this e-mail by mistake, please immediately inform its sender. 
Delete the contents of this e-mail with all attachments and its copies from 
your system.
If you are not the intended recipient of this e-mail, you are not authorized to 
use, disseminate, copy or disclose this e-mail in any manner.
The sender of this e-mail shall not be liable for any possible damage caused by 
modifications of the e-mail or by delay with transfer of the email.

In case that this e-mail forms part of business dealings:
- the sender reserves the right to end negotiations about entering into a 
contract in any time, for any reason, and without stating any reasoning.
- if the e-mail contains an offer, the recipient is entitled to immediately 
accept such offer; The sender of this e-mail (offer) excludes any acceptance of 
the offer on the part of the recipient containing any amendment or variation.
- the sender insists on that the respective contract is concluded only upon an 
express mutual agreement on all its aspects.
- the sender of this e-mail informs that he/she is not authorized to enter into 
any contracts on behalf of the company except for cases in which he/she is 
expressly authorized to do so in writing, and such authorization or power of 
attorney is submitted to the recipient or the person represented by the 
recipient, or the existence of such authorization is known to the recipient of 
the person represented by the recipient.

[R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread John Sorkin

I am reading a csv file. The column headers have spaces in them. The spaces are 
replaced by a period. I want to replace the space by another character (e.g. 
the underline) rather than the period. Can someone tell me how to accomplish 
this? Thank you,
John

John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 



Re: [R] Cannot Sum with DDPLY

2015-06-08 Thread PIKAL Petr

Hi

 -Original Message-
 From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Shivi82
 Sent: Monday, June 08, 2015 12:48 PM
 To: r-help@r-project.org
 Subject: [R] Cannot Sum with DDPLY

 Hi All,
 Kindly see the below code I have used:
 maxorder <- ddply(test, ~ ORIGIN, summarize, Weight=sum(CHG_WT))

 Here I have written the code to summarize values based on origin and
 total weight however I am getting below error:
 Error: ‘sum’ not meaningful for factors

I consider this error message ***extremely*** informative. Your CHG_WT is 
actually a factor, and sum is not meaningful for factors.

Maybe it is time for you to check the R-Intro document about objects and their 
differences before you continue posting.

If CHG_WT should be numeric, it was probably not read in correctly, for 
whatever reason.

Cheers
Petr
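
As an illustration of the point above, here is a minimal sketch with made-up data. The column names ORIGIN and CHG_WT come from the thread, the values are hypothetical, and base aggregate() is used instead of ddply so the example runs without plyr; the same conversion should apply before calling ddply.

```r
# Toy data: CHG_WT was read in as a factor instead of a number.
test <- data.frame(
  ORIGIN = c("TX", "TX", "MD"),
  CHG_WT = factor(c("10.5", "4.5", "7"))
)

# Go through as.character(): as.numeric() on a factor alone would return
# the internal level codes, not the values.
test$CHG_WT <- as.numeric(as.character(test$CHG_WT))

# Total weight per origin.
maxorder <- aggregate(CHG_WT ~ ORIGIN, data = test, FUN = sum)
print(maxorder)
```

If the conversion produces NA values, the column probably contains non-numeric text (for example thousands separators), which is worth fixing at import time.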

 Please advise. I need CHG_WT total for each state in the Origin column.

 --
 View this message in context:
 http://r.789695.n4.nabble.com/Cannot-Sum-with-DDPLY-tp4708338.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread Sarah Goslee

Easiest? Use sub() to replace the periods after the fact.

You can also use the check.names or the col.names arguments to
read.table() to customize your import.

Sarah
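
A minimal sketch of the check.names route mentioned above, assuming a small inline CSV in place of the real file (the header names "Var 1" and "Var.2" are made up). With check.names = FALSE the headers are kept as written, so only the original spaces are replaced and periods that were already in the names survive.

```r
# Inline stand-in for the CSV file; one header has a space, one has a period.
csv_text <- "Var 1,Var.2\n1,6\n2,7\n"

# check.names = FALSE stops read.csv from rewriting the header names.
dat <- read.csv(text = csv_text, check.names = FALSE)

# Replace only the literal spaces; existing periods are left untouched.
names(dat) <- gsub(" ", "_", names(dat), fixed = TRUE)
print(names(dat))
```

For a file on disk, the same call would take the file name as its first argument instead of text =.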

On Mon, Jun 8, 2015 at 10:15 AM, John Sorkin
jsor...@grecc.umaryland.edu wrote:
 I am reading a csv file. The column headers have spaces in them. The spaces 
 are replaced by a period. I want to replace the space by another character 
 (e.g. the underline) rather than the period. Can someone tell me how to 
 accomplish this? Thank you,
 John

 John David Sorkin M.D., Ph.D.
 Professor of Medicine
 Chief, Biostatistics and Informatics
 University of Maryland School of Medicine Division of Gerontology and 
 Geriatric Medicine
 Baltimore VA Medical Center
 10 North Greene Street
 GRECC (BT/18/GR)
 Baltimore, MD 21201-1524
 (Phone) 410-605-7119
 (Fax) 410-605-7913 (Please call phone number above prior to faxing)


-- 
Sarah Goslee
http://www.functionaldiversity.org


Re: [R] Mean error message missing

2015-06-08 Thread Duncan Murdoch

On 08/06/2015 6:04 AM, Christian Brandstätter wrote:
 Thank you for the explanation.
 But if you take for instance plot.default(), being another generic 
 function, it would not work like that:
 plot(1,2,3,4), only plot(1,2) is accepted.
 
 
  From R-help (Usage):
 ## Default S3 method:
 mean(x, trim = 0, na.rm = FALSE, ...)
 
 What is puzzling, is that apparently na.rm (and trim, which is indicated in 
 the help) is accepting numeric values.
 mean(c(1,NA,10),10,TRUE)
 mean(c(1,NA,10),10,FALSE)
 
 This should give at least a warning in my opinion.

It is a common idiom in R programming to treat non-zero values as TRUE,
and zero as FALSE.  If every use of a number where a logical is needed
generated a warning, you'd be swamped with them.

Duncan Murdoch
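
The behaviour discussed in this thread can be checked with a short sketch. The variable names are illustrative; the calls rely only on how base R's mean() matches its arguments (x, trim, na.rm, then '...').

```r
# Only the first argument is x; 2 becomes trim, 3 becomes na.rm, 4 goes to '...'.
first_only <- mean(1, 2, 3, 4)

# Wrapping the numbers in c() gives the mean of all four values.
whole_vec <- mean(c(1, 2, 3, 4))

# A non-zero number in the na.rm position is treated as TRUE.
na_as_num <- mean(c(1, NA, 10), 0, 200)

print(c(first_only, whole_vec, na_as_num))
print(as.logical(c(200, 0)))
```

Naming the arguments (trim = ..., na.rm = ...) avoids relying on positional matching and makes mistakes like the first call easier to spot.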

 
 mean(c(1,NA,10),10,200)
 
 
 
 On 08/06/2015 09:27, Achim Zeileis wrote:
 
 On Mon, 8 Jun 2015, Christian Brandstätter wrote:

 Dear list,

 I found an odd behavior of the mean function; it is allowed to do 
 something that you probably shouldn't:
 If you calculate mean() of a sequence of numbers (without declaring 
 them as vector), mean() then just computes mean() of the first 
 element. Is there a reason why there is no warning, like in sd for 
 example?

 mean() - unlike sd() - is a generic function that has a '...' argument 
 that is passed on to its methods. The default method which is called 
 in your example also has a '...' argument (because the generic has it) 
 but doesn't use it.

 Example code:
 mean(1,2,3,4)
 sd(1,2,3,4)

 Best regards
 Christian


[R-es] Can a column of a data.table be a data.frame?

2015-06-08 Thread MªLuz Morales

Hello,

I want to build a data.table in which one column (Parametros) holds
character strings and another holds the result of the information.gain
function, which returns a data.frame. This is the code I used, but it gives
an error:

PesosParam <- data.table(,.(Parametros, Peso:=
information.gain(In.hospital_death~., ParamCol)))

Is it possible to do what I describe? Or do I have to convert the
data.frame to a data.table explicitly? I have also tried that, with this
code:

# Conversion from data.frame to data.table
setattr(PesosParam, "class", c("data.table", "data.frame"))
data.table:::settruelength(PesosParam, 0L)
invisible(alloc.col(PesosParam))

but it cannot find settruelength

Thanks
Regards,
MªLuz Morales


Re: [R] Mean error message missing

2015-06-08 Thread Christian Brandstätter


Thank you very much, I didn't know that.


On 08/06/2015 6:04 AM, Christian Brandstätter wrote:

Thank you for the explanation.
But if you take for instance plot.default(), being another generic
function, it would not work like that:
plot(1,2,3,4), only plot(1,2) is accepted.


  From R-help (Usage):
## Default S3 method:
mean(x, trim = 0, na.rm = FALSE, ...)

What is puzzling, is that apparently na.rm (and trim, which is indicated in the 
help) is accepting numeric values.
mean(c(1,NA,10),10,TRUE)
mean(c(1,NA,10),10,FALSE)

This should give at least a warning in my opinion.

It is a common idiom in R programming to treat non-zero values as TRUE,
and zero as FALSE.  If every use of a number where a logical is needed
generated a warning, you'd be swamped with them.

Duncan Murdoch


mean(c(1,NA,10),10,200)



On 08/06/2015 09:27, Achim Zeileis wrote:


On Mon, 8 Jun 2015, Christian Brandstätter wrote:


Dear list,

I found an odd behavior of the mean function; it is allowed to do
something that you probably shouldn't:
If you calculate mean() of a sequence of numbers (without declaring
them as vector), mean() then just computes mean() of the first
element. Is there a reason why there is no warning, like in sd for
example?

mean() - unlike sd() - is a generic function that has a '...' argument
that is passed on to its methods. The default method which is called
in your example also has a '...' argument (because the generic has it)
but doesn't use it.


Example code:
mean(1,2,3,4)
sd(1,2,3,4)

Best regards
Christian


Re: [R-es] Can a column of a data.table be a data.frame?

2015-06-08 Thread Carlos J. Gil Bellosta

Hello, how are you?

data.table works with square brackets, not parentheses. Have you read the
vignette/tutorial?

Regards,

Carlos J. Gil Bellosta
http://www.datanaytics.com

On 8 June 2015, at 15:14, MªLuz Morales mlzm...@gmail.com wrote:
 Hello,

 I want to build a data.table in which one column (Parametros) holds
 character strings and another holds the result of the information.gain
 function, which returns a data.frame. This is the code I used, but it gives
 an error:

 PesosParam <- data.table(,.(Parametros, Peso:=
 information.gain(In.hospital_death~., ParamCol)))

 Is it possible to do what I describe? Or do I have to convert the
 data.frame to a data.table explicitly? I have also tried that, with this
 code:

 # Conversion from data.frame to data.table
 setattr(PesosParam, "class", c("data.table", "data.frame"))
 data.table:::settruelength(PesosParam, 0L)
 invisible(alloc.col(PesosParam))

 but it cannot find settruelength

 Thanks
 Regards,
 MªLuz Morales


Re: [R-es] Can a column of a data.table be a data.frame?

2015-06-08 Thread MªLuz Morales

Solved.
It was indeed a notation problem. This works:
PesosParam <- data.table(Param = Parametros, Peso =
information.gain(In.hospital_death~., ParamCol))
Note: Parametros is a character vector.

Thanks
Regards,
MªLuz
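
A minimal sketch of the working pattern, assuming the data.table package is installed. Because information.gain() and the original data are not available here, pesos_df is a hypothetical stand-in: a one-column data.frame with one row per parameter, and the parameter names and weights are made up.

```r
library(data.table)

# Hypothetical parameter names (a character vector, as in the thread).
Parametros <- c("Age", "HR", "Temp")

# Hypothetical stand-in for the data.frame returned by information.gain().
pesos_df <- data.frame(attr_importance = c(0.12, 0.05, 0.30),
                       row.names = Parametros)

# Build the data.table the same way one would build a data.frame:
# plain name = value arguments, no .() and no := at creation time.
PesosParam <- data.table(Param = Parametros, Peso = pesos_df)
print(PesosParam)
```

The data.frame passed as Peso is expanded into ordinary columns of the new data.table, so no explicit class conversion with setattr() is needed.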

On 8 June 2015, at 16:08, Carlos Ortega c...@qualityexcellence.es
wrote:

 Hello Mª Luz,

 When you create the data.table, you do it the same way as a data.frame.
 Look at the example in the help for data.table():

 DF = data.frame(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)
 DT = data.table(x=rep(c("a","b","c"),each=3), y=c(1,3,6), v=1:9)

 When you create the data.table, the .() option you used does not apply.
 It applies once you already have the data.table and want to create a new
 variable.

 So the code to create the data.table would be something like this:

 PesosParam <- data.table(Parametros, Peso=
 information.gain(In.hospital_death~., ParamCol))

 If the above still gives you an error, it is because you have to say
 explicitly where Parametros, In.hospital_death and ParamCol come from.
 So you may have to define these variables first:

 Parametros <- my_data_table$Parametros

 And for information.gain(), I understand that besides the variables you
 will have to indicate the data.frame (the data) that contains them...


 Regards,
 Carlos Ortega
 www.qualityexcellence.es

 On 8 June 2015, at 15:14, MªLuz Morales mlzm...@gmail.com wrote:

 Hello,

 I want to build a data.table in which one column (Parametros) holds
 character strings and another holds the result of the information.gain
 function, which returns a data.frame. This is the code I used, but it gives
 an error:

 PesosParam <- data.table(,.(Parametros, Peso:=
 information.gain(In.hospital_death~., ParamCol)))

 Is it possible to do what I describe? Or do I have to convert the
 data.frame to a data.table explicitly? I have also tried that, with this
 code:

 # Conversion from data.frame to data.table
 setattr(PesosParam, "class", c("data.table", "data.frame"))
 data.table:::settruelength(PesosParam, 0L)
 invisible(alloc.col(PesosParam))

 but it cannot find settruelength

 Thanks
 Regards,
 MªLuz Morales


Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread David L Carlson

Then using Sarah's suggestion something like?

> dat <- read.table(text="
+ 'Var 1' Var.2
+ 1 6
+ 2 7
+ 3 8
+ 4 9
+ 5 10", header=TRUE, col.names=c("Var_1", "Var.2"))
> dat
  Var_1 Var.2
1     1     6
2     2     7
3     3     8
4     4     9
5     5    10

David C

From: John Sorkin [mailto:jsor...@grecc.umaryland.edu] 
Sent: Monday, June 8, 2015 9:25 AM
To: David L Carlson
Cc: r-help@r-project.org
Subject: RE: [R] Blank spaces are replaced by period in read.csv, I want to 
replace blacks with an underline

David,
I appreciate your suggestion, but it won't work for me. I need the space to be 
replaced at the time the data are read, not afterward. My variable names have 
periods I want to keep; if I use your suggestion I will replace the periods 
inserted when the data are read, as well as the periods that I want to keep.
Thank you,
John 


John David Sorkin M.D., Ph.D.
Professor of Medicine
Chief, Biostatistics and Informatics
University of Maryland School of Medicine Division of Gerontology and Geriatric 
Medicine
Baltimore VA Medical Center
10 North Greene Street
GRECC (BT/18/GR)
Baltimore, MD 21201-1524
(Phone) 410-605-7119
(Fax) 410-605-7913 (Please call phone number above prior to faxing) 

 David L Carlson dcarl...@tamu.edu 06/08/15 10:21 AM 
You can use gsub() to change the names:

> dat <- data.frame("Var 1"=rnorm(5, 10), "Var 2"=rnorm(5, 15))
> dat
      Var.1    Var.2
1  9.627122 14.15376
2 10.741617 16.92937
3  8.492926 15.23767
4 12.226146 15.19834
5  8.829982 14.46957
> names(dat) <- gsub("\\.", "_", names(dat))
> dat
      Var_1    Var_2
1  9.627122 14.15376
2 10.741617 16.92937
3  8.492926 15.23767
4 12.226146 15.19834
5  8.829982 14.46957

-
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352



-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of John Sorkin
Sent: Monday, June 8, 2015 9:16 AM
Cc: r-help@r-project.org
Subject: [R] Blank spaces are replaced by period in read.csv, I want to replace 
blacks with an underline

I am reading a csv file. The column headers have spaces in them. The spaces are 
replaced by a period. I want to replace the space by another character (e.g. 
the underline) rather than the period. Can someone tell me how to accomplish 
this? Thank you,
John


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread Sarah Goslee

I've taken the liberty of copying this back to the list, so that others can
participate in or benefit from the discussion.

On Mon, Jun 8, 2015 at 10:49 AM, John Sorkin jsor...@grecc.umaryland.edu
wrote:

 Sarah,
 I am not sure how I use check.names to replace every space in the names of
 my variables with an underline. Can you show me how to do this? My current
 code is as follows:


check.names just tells R not to reformat your column names. If they aren't
already what you want, you'll need to do something else.


 data <- read.csv("C:\\Users\\john\\Dropbox (Personal)\\HanlonMatt\\fullgenus3.csv")

 The problem I have is that my column names are not unique, e.g., I have
 multiple columns whose column names are (in CSV format):
 X Y, X Y, X Y, X Y
 R reads the names as follows:
 X.Y, X.Y.1, X.Y.2, X.Y.3
 I need to have the names look like:
 X_Y, X_Y.1, X_Y.2, X_Y.3


You've been saying that you want to replace every space with an underscore,
but that's not what your example shows. Instead, you want to let R import
the names and add the identifying number (though if you do it yourself you
can get the number to match the column number, which is neater), then
change the FIRST underscore to a period.

I'd import them with check.names=FALSE, then modify them explicitly:


> mynames <- c("x y", "x y", "x y", "x y")
> mynames
[1] "x y" "x y" "x y" "x y"
> mynames <- sub(" ", ".", mynames)
> mynames
[1] "x.y" "x.y" "x.y" "x.y"
> mynames <- paste(mynames, seq_along(mynames), sep="_")
> mynames
[1] "x.y_1" "x.y_2" "x.y_3" "x.y_4"


You could also let R modify them, then use sub() to change the first
underscore to a period and leave the rest alone.
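
The "let R modify them, then use sub()" route might look like the sketch
below. Read against the target names in John's message (X_Y, X_Y.1, ...),
it swaps the first period of each imported name for an underscore. The vector
nm stands in for the names read.csv() produced; the sketch assumes every
original header contained exactly one space and no period before it.

```r
# Names as R produced them on import (taken from the message above)
nm <- c("X.Y", "X.Y.1", "X.Y.2", "X.Y.3")

# sub() replaces only the first match in each string;
# "\\." matches a literal period rather than "any character".
nm2 <- sub("\\.", "_", nm)
nm2
# [1] "X_Y"   "X_Y.1" "X_Y.2" "X_Y.3"
```

Because only the first period is touched, the numeric suffix that R added
keeps its period.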

Sarah


[R-es] help awk y shells en R

2015-06-08 Thread Javier Villacampa González

Hello,

I sometimes call Unix shell commands from R. Is there a way to use these
shell commands, or the awk language, from Windows?

The idea is to always do it from within R; perhaps it is possible by
invoking cygwin from Windows, but it is not clear to me how.

Best regards, and thanks in advance

Javier
#_
# EXAMPLE: what would have to go in place of
# ¿¿???
# assuming I have cygwin installed
#_

# One example would be changing some MËG to MEG, since fread does not read
# the Ë correctly

file.rename(from = "Data/data.csv", to = "Data/data_2.csv")
switch(OS,
   WIN = system( ¿¿???),
   MAC = system( command = "awk '{gsub(\"M.?G\",\"MEG\"); print}' Data/data_2.csv > Data/data_2_2.csv")
)
file.rename(from = "Data/data.csv", to = "Data/data_2.csv")
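
For this particular substitution, one way to sidestep the operating-system
switch is to do the replacement in R itself. The sketch below is not the
poster's method: it swaps awk for base R's gsub(), and replace_in_file() is a
hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper (not from the thread): rewrite a text file with every
# regex match of `pattern` replaced, using only base R.
replace_in_file <- function(infile, outfile, pattern, replacement) {
  lines <- readLines(infile, warn = FALSE)
  writeLines(gsub(pattern, replacement, lines), outfile)
  invisible(outfile)
}

# Self-contained demonstration on temporary files
infile  <- tempfile(fileext = ".csv")
outfile <- tempfile(fileext = ".csv")
writeLines(c("id,label", "1,MXG", "2,MEG", "3,MG"), infile)

# Same pattern as the awk call: "M", at most one arbitrary character, "G"
replace_in_file(infile, outfile, pattern = "M.?G", replacement = "MEG")
result <- readLines(outfile)
result
```

The same code runs unchanged on Windows and Mac. For files containing
non-ASCII characters such as Ë, the file encoding would probably also have
to be declared when reading (for example through the encoding argument of
readLines()); that part depends on the actual file and is left out here.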


Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread Duncan Murdoch

On 08/06/2015 10:23 AM, Sarah Goslee wrote:
 Easiest? Use sub() to replace the periods after the fact.
 
 You can also use the check.names or the col.names arguments to
 read.table() to customize your import.

Yes, check.names is the right idea.  Use check.names = FALSE, then use
sub() or gsub() to replace the spaces with underscores.
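
A minimal sketch of that two-step approach, assuming a small inline CSV in
place of the real file (the header names and values are made up for
illustration):

```r
# Inline stand-in for the CSV file: two headers with spaces, one with a period
csv_text <- "X Y,X Y,Var.1\n1,2,3\n4,5,6"

# check.names = FALSE keeps the headers exactly as they appear in the file
dat <- read.csv(text = csv_text, check.names = FALSE)
names(dat)
# [1] "X Y"   "X Y"   "Var.1"

# Replace every space with an underscore; existing periods are left alone
names(dat) <- gsub(" ", "_", names(dat), fixed = TRUE)
names(dat)
# [1] "X_Y"   "X_Y"   "Var.1"
```

Note that with check.names = FALSE duplicated headers stay duplicated, so
they still need to be made unique in a separate step if that matters.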

Duncan Murdoch

 Sarah
 
 On Mon, Jun 8, 2015 at 10:15 AM, John Sorkin
 jsor...@grecc.umaryland.edu wrote:
 I am reading a csv file. The column headers have spaces in them. The spaces 
 are replaced by a period. I want to replace the space by another character 
 (e.g. the underline) rather than the period. Can someone tell me how to 
 accomplish this? Thank you,
 John


[R] [R-pkgs] New package stepR: fitting step-functions

2015-06-08 Thread Thomas Hotz


Dear R users,

It is my pleasure to announce the availability of package stepR (1.0-2) 
on CRAN.


The main purpose of the package is to fit piecewise constant functions 
(a.k.a. step-functions or block signals) to serial data in a fully 
data-driven manner under certain (Gaussian or non-Gaussian) 
distributional assumptions.


It mainly implements the algorithms described in the references below - 
in a (hopefully) user-friendly fashion.


Try

   library(stepR)
   example(smuceR) # for [1] and [2]
   example(jsmurf) # for [3]
   example(stepsel) # for [4]

to get an idea about what it can do, and how to use it.

We hope it proves useful; community feedback is therefore very welcome!

Best regards

Thomas Hotz
TU Ilmenau, Institute of Mathematics

References:

[1] Frick, K., Munk, A., and Sieling, H. (2014). Multiscale Change-Point 
Inference. With discussion and rejoinder by the authors. Journal of the 
Royal Statistical Society, Series B, 76(3), 495-580.
[2] Futschik, A., Hotz, T., Munk, A., and Sieling, H. (2014). Multiresolution 
DNA partitioning: statistical evidence for segments. Bioinformatics, 
30(16), 2255-2262.
[3] Hotz, T., Schütte, O., Sieling, H., Polupanow, T., Diederichsen, U., 
Steinem, C., and Munk, A. (2013). Idealizing Ion Channel Recordings by a 
Jump Segmentation Multiresolution Filter. IEEE Transactions on 
NanoBioscience, 12(4), 376-386.
[4] Boysen, L., Kempe, A., Liebscher, V., Munk, A., Wittich, O. (2009). 
Consistencies and rates of convergence of jump-penalized least squares 
estimators. The Annals of Statistics, 37(1), 157-183.


___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages


[R] [R-pkgs] New R package kwb.hantush (0.2.1): calculation of groundwater mounding beneath an (stormwater) infiltration basin

2015-06-08 Thread Michael Rustler

Dear R users,

It is a pleasure for me to announce the availability of the new package 
kwb.hantush (0.2.1)  on CRAN. Its objective is the calculation of groundwater 
mounding beneath an (stormwater) infiltration basin by solving the Hantush 
(1967) equation. For checking the correct implementation of the algorithm the R 
modelling results were cross-checked against alternative models assessed in 
Carleton (2010) by using the same model parameterisation. 

References: 
[1] Carleton, G.B., 2010, Simulation of groundwater mounding beneath 
hypothetical stormwater infiltration basins: U.S. Geological Survey Scientific 
Investigations Report 2010-5102, 64 p.; http://pubs.usgs.gov/sir/2010/5102/ 

[2] Hantush, M.S. (1967): Growth and decay of groundwater-mounds in response to 
uniform percolation, Water Resources Research (March, 1967); 
http://doi.org/10.1029/WR003i001p00227 


How to get started ? 

* Firstly install the package from CRAN by using:
install.packages("kwb.hantush")

* Secondly visit http://kwb-r.github.io/kwb.hantush for a short
tutorial, which answers the following questions:
- How do I perform model runs?

- How accurate is the solution of the Hantush equation implemented in R
compared to a reference model
  (e.g. U.S. Geological Survey EXCEL spreadsheet solution,
http://pubs.usgs.gov/sir/2010/5102/support/Hantush_USGS_SIR_2010-5102-1110.xlsm)?

I hope it proves to be useful and community feedback is therefore very welcome!

Best Regards, 
Michael Rustler


Dipl.-Geoök. Michael Rustler
KompetenzZentrum Wasser Berlin gGmbH
Cicerostr. 24
D-10709 Berlin
Tel. +49 (0)30 53653 825
Fax +49 (0)30 53653 888
Email: michael.rust...@kompetenz-wasser.de
Homepage: www.kompetenz-wasser.de http://www.kompetenz-wasser.de


Geschäftsführer:
Dipl.-Ing. Andreas Hartmann
Sitz der Gesellschaft: Berlin
Amtsgericht Charlottenburg
HRB 84461


[R] [R-pkgs] New version of wikipediatrend

2015-06-08 Thread Peter Meissner



Dear UseRs,


wikipediatrend - a package to retrieve Wikipedia page access statistics -  
has jumped from version 0.2 to 1.1.3 and now is more streamlined, feature  
richer, more tested and comes with a vignette as well as a lot of fun.



package information: http://cran.rstudio.com/web/packages/wikipediatrend

vignette:
http://cran.rstudio.com/web/packages/wikipediatrend/vignettes/using-wikipediatrend.html


project page:   https://github.com/petermeissner/wikipediatrend


Best, Peter




NEWS wikipediatrend
==

version 1.1.3 // 2015-06-04 ...
--

- modifying vignette to comply with CRAN policies (dropping lines  
installing packages if not present)



version 1.1.2 // 2015-05-23 ...
--

- modifying caching to comply with CRAN policies

- changing default folder of cache file from temp (basename(tempdir())) to  
Rtemp ( tempdir() )



version 1.1.1 // 2015-05-23 ...
--

- adding ghrr as additional repo to comply with CRAN policies

- changing default folder of cache file from home (~) to temp  
(basename(tempdir()))



version 1.1.0 // 2015-05-21 ...
--

- feature: caching has been overhauled

- feature: wp_trend() now tries to guess if page was supplied as title
with possible special characters or as (url-encoded) URL part and takes
care of further processing

- bug-fix: special character support of the package was lousy and
prevented the usage of articles of non-standard languages (especially on
Windows)
  * introduction of the wp_df class to allow for a print.wp_df that
    a) shortens long strings on print
    b) does not use format() (format() causes UTF-8 characters to be
       replaced by U+ strings (probably only))
  * using a package specific write_utf8_csv() and read_utf8_csv() to be
    able to store and cache data for articles with special character names
    (even under Windows, write.csv() does not allow enforcing a specific
    encoding)

- bug-fix / backward compatibility: with version 1.0.0 old parameters for
wp_trend() were causing errors

- bug-fix: wp_cache_reset() would stop with an error if called twice in a
row - fixed



version 1.0.0 // 2015-04-01 ...
--

- api-change: option userAgent deleted: the default is to send information
on versions of R, wikipediatrend, curl as well as RCurl

- api-change: option requestFrom deleted: the default is to not send the
header

- feature: wp_trend() now by default caches data retrievals in a temporary
file

- feature: wp_trend(file="save.csv") now allows to specify a file where
retrievals are stored (this will always add to the already existing data)

- feature: wp_trend() now allows to specify more than one page and/or
language at a time. Data will then be retrieved for every combination of
page-language and date

- feature: caching system is persistent; wp_cache_file() will report the
file used for caching; wp_cache_reset() will reset the cache;
wp_cache_load() will return its content as data.frame()

- feature: while wp_trend() now (invisibly) returns only data from the
current request at hand, the new function wp_cache() will retrieve data
from cache files (by default / if no file name is specified it retrieves
data from .wp_trend_cache)

- api-change: the data returned by wp_trend(), cached in cache-file,
retrieved by wp_cache() does consist of more variables: date, count,
project, title, rank, month

- feature: testthat tests now check base functionality of the package

- bug-fix: non-existing page views for a month have led to an error, fixed.

- bug-fix: wp_trend() now checks date inputs better for logical
inconsistencies



version 0.2.0 // 2014-11-01 ...
--

- first publication on CRAN


Re: [R] mismatch between match and unique causing ecdf (well, approxfun) to fail

2015-06-08 Thread Martin Maechler


 Aehm, adding on this: I incorrectly *assumed* without testing that rounding 
 would help; it doesn't:
 ecdf(round(test2,0))  # a rounding that is way too rough for my application...
 #Error in xy.coords(x, y) : 'x' and 'y' lengths differ
 
 Digging deeper: The initially mentioned call to unique() is not very helpful, 
 as test2 is a data frame, so I get what I deserve, an unchanged data frame 
 with 1 row. Still, the issue remains and can even be simplified further:
 
  ecdf(data.frame(a=3, b=4))
 Empirical CDF 
 Call: ecdf(data.frame(a = 3, b = 4))
  x[1:2] =  3,  4
 
 works ok, but
 
  ecdf(data.frame(a=3, b=3))
 Error in xy.coords(x, y) : 'x' and 'y' lengths differ
 
 doesn't (same for a=b=1 or 2, so likely the same for any a=b). Instead, 
 
  ecdf(c(a=3, b=3))
 Empirical CDF 
 Call: ecdf(c(a = 3, b = 3))
  x[1:1] =  3
 
 does the trick. From ?ecdf, I get that x should be a numeric vector, so 
 apparently I misused the function by applying it to a row of a data frame 
 (i.e. a data frame with one row). In all my other (dozens of) cases that 
 worked ok, though, but not for this particular one. A simple unlist() helps:

You were lucky.   To use a one-row data frame instead of a
numerical vector will typically *not* work unless ... well, you
are lucky.

No, do *not*  pass data frame rows instead of numeric vectors.

 
  ecdf(unlist(data.frame(a=3, b=3)))
 Empirical CDF 
 Call: ecdf(unlist(data.frame(a = 3, b = 3)))
  x[1:1] =  3
 
 Yet, I'm even more confused than before: in my other data, there were also 
 duplicated values in the vector (1-row-data frame), and it never caused any 
 issue. For this particular example, it does. I must be missing something 
 fundamental...
  

well.. I'm confused about why you are confused,
but if you are thinking about passing rows of data frames as
numeric vectors, this means you are sure that your data frame
only contains classical numbers (no factors, no 'Date's,
no...).

In such a case, transform your data frame to a numerical matrix
*once* preferably using  data.matrix(d.fr) instead of just  as.matrix(d.fr)
but in this case it should not matter.
Then *check* the result and then work with that matrix from then on.
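
A minimal sketch of that workflow, assuming a small all-numeric data frame
(the name d.fr follows the text above; the values are made up for
illustration):

```r
# An all-numeric data frame; row 1 repeats the duplicated-value case (3, 3)
d.fr <- data.frame(a = c(3, 1), b = c(3, 4))

# Convert once to a numeric matrix, then check the result
m <- data.matrix(d.fr)
stopifnot(is.matrix(m), is.numeric(m))

# A matrix row is a plain (named) numeric vector, which is what ecdf() expects
Fn1 <- ecdf(m[1, ])   # values 3, 3
Fn2 <- ecdf(m[2, ])   # values 1, 4

Fn1(3)   # 1:   both values are <= 3
Fn2(1)   # 0.5: one of the two values is <= 1
```

Indexing the matrix with m[i, ] drops to a vector, so no unlist() is needed
at each call.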

All other things probably will continue to leave you confused ..
;-)

Martin Maechler, 
ETH Zurich


Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread peter dalgaard


On 08 Jun 2015, at 17:03 , Sarah Goslee sarah.gos...@gmail.com wrote:

 
 I'd import them with check.names=FALSE, then modify them explicitly:
 
 
 > mynames <- c("x y", "x y", "x y", "x y")
 > mynames
 [1] "x y" "x y" "x y" "x y"
 > mynames <- sub(" ", ".", mynames)
 > mynames
 [1] "x.y" "x.y" "x.y" "x.y"
 > mynames <- paste(mynames, seq_along(mynames), sep="_")
 > mynames
 [1] "x.y_1" "x.y_2" "x.y_3" "x.y_4"

Didn't he want x_y.1, not x.y_1? Obviously, just switch . and _ for that.

A potential improvement (in case not all columns are x y) is to replace the 
last bit with make.unique(mynames).
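
A short sketch of both suggestions combined, with "." and "_" switched and
make.unique() in place of the paste() step. The fourth name is changed to
"z w" here (an assumption for illustration) to show the case where not all
columns are "x y":

```r
mynames <- c("x y", "x y", "x y", "z w")

# Replace the space with an underscore (fixed = TRUE: literal match, no regex)
mynames <- sub(" ", "_", mynames, fixed = TRUE)

# make.unique() appends .1, .2, ... only to repeated names
unique_names <- make.unique(mynames)
unique_names
# [1] "x_y"   "x_y.1" "x_y.2" "z_w"
```

Names that occur only once, like "z_w", are left without a suffix.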

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com


Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread John Sorkin

Sarah, 
Many, many thanks.
John




Confidentiality Statement:
This email message, including any attachments, is for the sole use of the 
intended recipient(s) and may contain confidential and privileged information. 
Any unauthorized use, disclosure or distribution is prohibited. If you are not 
the intended recipient, please contact the sender by reply email and destroy 
all copies of the original message. 

Re: [R-es] help awk y shells en R

2015-06-08 Thread Carlos Ortega

Hello,

Have a look at this:

http://stackoverflow.com/questions/18603984/using-system-with-windows

Regards,
Carlos Ortega
www.qualityexcellence.es


Re: [R] Blank spaces are replaced by period in read.csv, I want to replace blacks with an underline

2015-06-08 Thread William Dunlap

> mynames
   [1] "x.y" "x.y" "x.y" "x.y"
> mynames <- paste(mynames, seq_along(mynames), sep="_")

In addition, if there were a variety of names in mynames and you
wanted to number each unique name separately you could use ave():

> origNames <- c("X", "Y", "Y", "X", "Z", "X")
> ave(origNames, origNames, FUN=function(x)paste0(x, "_", seq_along(x)))
[1] "X_1" "Y_1" "Y_2" "X_2" "Z_1" "X_3"
> ave(origNames, origNames,
+     FUN=function(x)if(length(x)==1) x else paste0(x, "_", seq_along(x)))
[1] "X_1" "Y_1" "Y_2" "X_2" "Z"   "X_3"



Bill Dunlap
TIBCO Software
wdunlap tibco.com


