On Wed, Mar 19, 2003 at 12:40:20PM -0500, [EMAIL PROTECTED] wrote:
> On 19 Mar, [EMAIL PROTECTED] wrote:
> > Have you tried:
> >       data <- read.table("data.dat", header=TRUE, sep="|", as.is=TRUE)
> > 
> 
> Yes I did. However, it takes a LOT more time because of the date/time
> string. The result looks like this:
> 
> 
> str(data)
> `data.frame': 317437 obs. of  8 variables:
>  $ phone   : num  1.52e+10 1.42e+10 1.82e+10 1.65e+10 1.65e+10 ...
>  $ state   : int  3 3 3 3 3 3 3 3 3 3 ...
>  $ code    : int  983 983 983 983 3000 983 983 983 983 5203 ...
>  $ amount  : int  1000 1000 2500 2500 2500 1000 1000 2500 2500 2500 ...
>  $ left    : int  260 0 0 25 0 1260 273 0 0 0 ...
>  $ channel : Factor w/ 5 levels "CSR","IN","IVR",..: 2 5 4 2 3 2 2 3 4 3 ...
>  $ time    : Factor w/ 312198 levels "2002-10-16 ..",..: 1 2 3 4 5 6 7 8 9 10 ...
>  $ mtd     : Factor w/ 2 levels "C","D": 1 1 1 1 1 1 1 1 1 1 ...
> 
> I think the 312198 factor level is wrong. Also, the phone column is  a string,
> not a number. I didn't see how to specify that with read.table(). (In my
> original post, I think I forgot to mention that I had over 300,000 entries in
> my file).

Check out the colClasses argument to read.table.  Something like...

library(methods) #necessary for colClasses

data <- read.table("data.dat", header=TRUE, sep="|",
                   colClasses=c("character", "integer", "integer",
                                "integer", "integer", "character",
                                "character", "character"))

You can convert the items you need to be factors after they're loaded,
like this...

data$mtd <- factor(data$mtd)
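
For a quick check without the full file, here is a minimal self-contained sketch of the same idea. The sample rows, the name "demo", and the timestamp format are made up for illustration and are not taken from your data, so adjust them to what the file really contains.

```r
# Hypothetical sample in the same pipe-separated layout; all values are invented.
sample_lines <- c(
  "phone|state|code|amount|left|channel|time|mtd",
  "15205551234|3|983|1000|260|IN|2002-10-16 08:15:00|C",
  "14205555678|3|983|1000|0|WEB|2002-10-16 08:16:30|C",
  "18205559012|3|3000|2500|0|IVR|2002-10-16 08:17:45|D"
)

# Read every column with an explicit class so nothing is guessed
# or turned into a factor on the way in.
demo <- read.table(text = sample_lines, header = TRUE, sep = "|",
                   colClasses = c("character", "integer", "integer",
                                  "integer", "integer", "character",
                                  "character", "character"))

# Convert only the columns that really are categorical.
demo$channel <- factor(demo$channel)
demo$mtd     <- factor(demo$mtd)

# Assumed timestamp format and time zone; change both to match the real file.
demo$time <- as.POSIXct(demo$time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

str(demo)
```

With phone read as character the digits are kept exactly, and time ends up as a date-time column instead of a factor with nearly one level per row.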

Hope it helps

Jason
-- 
Indigo Industrial Controls Ltd.
64-21-343-545
[EMAIL PROTECTED]

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
