Re: [R] Reading a bunch of csv files into R

HJ YAN Mon, 28 May 2012 10:12:49 -0700

Dear Bryan

Thank you so much for your prompt reply!


Please see my responses below, under the ===== lines in your reply.

Many thanks again!

HJ

On Mon, May 28, 2012 at 4:45 PM, Bryan Hanson <han...@depauw.edu> wrote:

> OK, a couple of things (I only looked through quickly):
>
> 1.  R doesn't allow variable names to begin with a number.  Be sure you
> don't try that.
>
============
    Yes, I understand this. Some of my csv file names begin with a number,
so I put 'Data' in front of them using 'NAME <- paste("Data", data_names,
sep=".")', as shown in my last email.
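
As a small illustration of the naming point, here is a sketch using two file
stems from this thread. It compares the "Data." prefix with base R's
make.names(), which also repairs names that start with a digit; the helper
is_valid() is a made-up name used only for this example.

```r
# File stems copied from the examples in this thread
data_names <- c("512180_20120523150757", "513687_20120523181947")

# Prefixing with "Data." yields syntactically valid R names
NAME <- paste("Data", data_names, sep = ".")

# make.names() is base R's built-in repair; it prepends "X" to such names
auto <- make.names(data_names)

# Hypothetical helper: a name is valid if make.names() leaves it unchanged
is_valid <- function(x) x == make.names(x)

print(is_valid(data_names))
print(is_valid(NAME))
print(auto)
```

Either form works as an object name or as a list name that can be used
with `$`.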


>  2.  What's the overall goal here?  Read them in, change the name, then
> write them out?  Let us know and it will be easier to help you.
>
=============
    The overall goal: for my current study I receive hundreds of csv files
every two weeks, and I need to read them into R for further analysis. The
data are recorded at 10-minute intervals and are collected every two weeks
from a few hundred monitors.

     So I want to know how to do these jobs more efficiently:

(i) Read them into R, put the data from the same monitor together, check for
missing values, and manipulate the data in the way we need, e.g. according
to region or monitoring type, which involves aggregating the whole group (or
a subgroup) of the data, etc.;

(ii) Edit the names, because sometimes we want to map names in one format
to another, e.g. 512180_20120523150757 == London_2012_May_23rd_15:07:57
(i.e. Location name_Year_Month_Day_Hour_Minute_Second);

(iii) If (i) and (ii) can be done, I would think writing them out as csv
would not be too difficult. We mainly do the analysis in R and have had no
need for csv output so far...
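
For the renaming in (ii), one possible sketch splits each file stem at the
underscore, parses the second part as a timestamp, and looks the first part
up in a table of locations. The `location` table is an assumption: only
512180 = London is given above, and "SiteB" is a made-up placeholder. The
ordinal suffix ("23rd") is left out to keep the example short.

```r
# File stems copied from the examples in this thread
stems <- c("512180_20120523150757", "513687_20120523181947")

# Split "ID_timestamp" into its two parts
parts      <- strsplit(stems, "_", fixed = TRUE)
monitor_id <- vapply(parts, `[`, character(1), 1)
raw_stamp  <- vapply(parts, `[`, character(1), 2)

# Parse the 14-digit timestamp (YYYYMMDDHHMMSS); UTC is an assumption
stamp <- as.POSIXct(raw_stamp, format = "%Y%m%d%H%M%S", tz = "UTC")

# Hypothetical lookup from monitor ID to location name
location <- c("512180" = "London", "513687" = "SiteB")

# Build Location_Year_Month_Day_Hour:Minute:Second; month.abb avoids
# depending on the locale for the month name
new_names <- paste(
  unname(location[monitor_id]),
  format(stamp, "%Y"),
  month.abb[as.integer(format(stamp, "%m"))],
  format(stamp, "%d_%H:%M:%S"),
  sep = "_"
)
print(new_names)
```

Keeping the parsed `stamp` as a date-time, not just a string, should also
make later sorting and aggregating by time easier.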




>  3.  Regardless of your goal, I think you are "over thinking" the
> solution.  Let us know what you want to accomplish and we can shorten it up
> I'm sure.
>
=====================
    I am trying to read the data into a list, which might be easier, but I
am not sure whether another data type has an advantage over that...
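
On the question of a list versus another data type, a rough sketch: a named
list keeps each file separate for per-file checks, and it can be stacked
into one data frame when the files share the same columns. The two small
data frames and their columns (`time`, `value`) are made up and stand in for
the results of read.csv().

```r
# Made-up data frames standing in for two imported csv files
dfs <- list(
  Data.512180_20120523150757 = data.frame(time = 1:2, value = c(0.5, NA)),
  Data.513687_20120523181947 = data.frame(time = 1:3, value = c(1.2, 1.4, 1.1))
)

# Per-file check: count missing values in each data frame
n_missing <- sapply(dfs, function(d) sum(is.na(d$value)))

# Tag every row with the name of the file it came from, then stack
tagged <- Map(
  function(d, nm) data.frame(source = nm, d, stringsAsFactors = FALSE),
  dfs, names(dfs)
)
combined <- do.call(rbind, tagged)
rownames(combined) <- NULL

print(n_missing)
print(combined)
```

The stacked form suits aggregation across monitors, while the list form
suits anything done one file at a time.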


Data1<-list( NAME)

[1] NAME
 "Data.512180_20120523150757" "Data.513687_20120523181947"
"Data.513690_20120524112111" "Data.521858_20120524091428"
"Data.523215_20120523123419"

for(i in 1:length(filenames)) {Data1[[i]]<-read.csv(filenames[i])}

But when I try to access the components of this list 'Data1', only the
first of the three methods shown below works, and I think the other two
would be more useful for me. Any ideas??

(1) Data1[[1]]
     *** this one works
(2) Data1[["Data.512180_20120523150757"]]
     *** this one doesn't work
(3)  Data1$Data.512180_20120523150757
      *** this one doesn't work
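
A likely explanation, sketched under the assumption that the code is exactly
as shown: list(NAME) makes a list of length one that holds the character
vector, so the list elements themselves never get names, and lookups by name
find nothing. Applying read.csv over the file names themselves (rather than
over seq_len(filenames), which produced the error quoted further down) and
then setting names() should make all three forms work. The temporary
directory and the file contents here are made up so the example can run
anywhere.

```r
# Write two tiny csv files to a temporary directory; contents are made up,
# file names are taken from the thread
myDir <- file.path(tempdir(), "csv_demo")
dir.create(myDir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2),
          file.path(myDir, "512180_20120523150757.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:5),
          file.path(myDir, "513687_20120523181947.csv"), row.names = FALSE)

filenames <- list.files(myDir, pattern = "[.]csv$")
NAME <- paste("Data", sub("[.]csv$", "", filenames), sep = ".")

# list(NAME) is a one-element list with no names attached
print(length(list(NAME)))
print(names(list(NAME)))

# Read every file, then attach the names to the list elements
Data1 <- lapply(file.path(myDir, filenames), read.csv)
names(Data1) <- NAME

# All three access forms now refer to the same data frame
print(Data1[[1]])
print(Data1[["Data.512180_20120523150757"]])
print(Data1$Data.512180_20120523150757)
```

With the names in place, there is no need for assign() or for hundreds of
separate objects in the workspace.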

Hope I have made myself clear here.

Thanks!
HJ


>
> Bryan
>
>  On May 28, 2012, at 11:20 AM, HJ YAN wrote:
>
>   Dear Rui, Kevin, Bryan and Nutter
>
>
> Thank you so much for your very helpful hints!
>
> Now I have extracted all the file names and managed to edit them using the
> code (1)-(4) below and obtained the name format as I wanted
>
> (1) files<-list.files(path = "myworking directory", pattern = NULL,
> all.files = FALSE,
>            full.names = FALSE, recursive = FALSE,ignore.case = FALSE,
> include.dirs = FALSE)
>
> (2) filenames <- files[grep("[.]csv", files)]
>
> [1] "512180_20120523150757.csv"
> "513687_20120523181947.csv"
> "513690_20120524112111.csv"
>  "521858_20120524091428.csv"
>  "523215_20120523123419.csv"
> ...(a few hundred more...)
>
>
> (3) data_names <- gsub("[.]csv", "", filenames)
>
> (4) NAME<- paste("Data",data_names, sep=".")
>
>
> Up to here I got NAME containing all the names I'm going to use..
>
> > NAME
> [1] "Data.512180_20120523150757"
> "Data.513687_20120523181947"
> "Data.513690_20120524112111"
>  "Data.521858_20120524091428"
>  "Data.523215_20120523123419"
> ....
>
>
>  But I still haven't successfully read the whole bunch of csv files into R
> and named them as expected, e.g. I want to read "512180_20120523150757.csv"
> into R and name it "Data.512180_20120523150757" and so on...
> For a single file we can just write
>
> Data.512180_20120523150757<-read.csv("512180_20120523150757.csv")
>
> If any of the following commands (as you suggested) works, then my
> question is sorted out. But I got error messages for every attempt...
> (i)
> > df.list <- lapply(seq_len(filenames), read.csv)
>
> Error in seq_len(filenames) :
>   argument must be coercible to non-negative integer
> In addition: Warning message:
> In is.vector(X) : NAs introduced by coercion
>
> > filenames
> [1] "512180_20120523150757.csv" "513687_20120523181947.csv"
> "513690_20120524112111.csv" "521858_20120524091428.csv"
> [5] "523215_20120523123419.csv"...
>
>
> (ii) None of the following code works...
>
> myDir="myworking directory"
>
> #for(i in 1:length(filenames)){assign(NAME[i], read.csv(file.path(myDir,
> filenames[i])))}
> #for(i in 1:5){assign(NAME[i], read.csv(file.path=myDir, filenames[i]))}
>
> setwd("myworking directory")
> #for(i in 1:5){assign(NAME[i], read.csv( filenames[i]))}
>
>
>
> Warning messages:
> 1: In N[i] <- read.csv(filenames[i]) :
>   number of items to replace is not a multiple of replacement length
> 2: In N[i] <- read.csv(filenames[i]) :
>   number of items to replace is not a multiple of replacement length
> 3: In N[i] <- read.csv(filenames[i]) :
>   number of items to replace is not a multiple of replacement length
> 4: In N[i] <- read.csv(filenames[i]) :
>   number of items to replace is not a multiple of replacement length
> 5: In N[i] <- read.csv(filenames[i]) :
>   number of items to replace is not a multiple of replacement length
>
>
> Seems I am getting there, but could you spot where my code went wrong
> please??
>
> Many thanks again!
>
> HJ
>
>
>
>
>
> On Fri, May 25, 2012 at 8:36 PM, Rui barradas <rui1...@sapo.pt> wrote:
>
>> Hello,
>>
>> Or maybe put the data frames in a list
>>
>> df.list <- lapply(seq_len(filenames), read.csv, ...) # '...' are other
>> options you might want to pass (like header=TRUE)
>> names(df.list) <- data_names
>>
>> Now access the data frames by number in the list or by name in data_names.
>>
>> Hope this helps,
>>
>> Rui Barradas
>> Em 25-05-2012 20:08, Nutter, Benjamin escreveu:
>>
>>>  For example:
>>>
>>> myDir<- "some file path"
>>> filenames<- list.files(myDir)
>>> filenames<- filenames[grep("[.]csv", filenames)]
>>>
>>> data_names<- gsub("[.]csv", "", filenames)
>>>
>>> for(i in 1:length(filenames)) assign(data_names[i],
>>> read.csv(file.path(myDir, filenames[i])))
>>>
>>>
>>>  Benjamin Nutter |  Biostatistician     |  Quantitative Health Sciences
>>>   Cleveland Clinic    |  9500 Euclid Ave.  |  Cleveland, OH 44195  | (216)
>>> 445-1365
>>>
>>>
>>> -----Original Message-----
>>> From: r-help-boun...@r-project.org On Behalf Of Kevin Wright
>>> Sent: Friday, May 25, 2012 2:55 PM
>>> To: HJ YAN
>>> Cc: r-help@r-project.org
>>> Subject: Re: [R] Reading a bunch of csv files into R
>>>
>>> See ?dir
>>>
>>> Assign the value to a vector and loop over the elements of the vector.
>>>
>>> Kevin
>>>
>>>
>>> On Fri, May 25, 2012 at 12:16 PM, HJ YAN<yhj...@googlemail.com>  wrote:
>>>
>>>> Dear R users
>>>>
>>>>
>>>> I am struggling with a data import issue:
>>>>
>>>> I have some hundreds of csv files that need to be read into R for further
>>>> analysis. All those csv files are named in one of three formats:
>>>>
>>>> (1) strings: e.g. London_Oxford street
>>>> (2) Integer: e.g. 1234_5678
>>>> (3) combined: e.g. London_1234
>>>>
>>>> I intend to use read.csv("xxxx_xxx.csv"), but I have only dealt with
>>>> single files before, and when there are no more than 20 files I do not
>>>> bother to look for a more efficient way.
>>>>
>>>>
>>>> Is there any clever way to avoid typing in all these hundreds of
>>>> names by hand, maybe using an R package or writing some code in
>>>> some other language, if it is not too difficult to learn?
>>>>
>>>> Any thoughts/hints please??
>>>>
>>>> Many thanks in advance!
>>>>
>>>> HJ
>>>>
>>>> ______________________________________________
>>>> R-help@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>> --
>>> Kevin Wright
>>>
>>
>
>
>
