Folks:
I am trying to read in a large file. Definition of large is:
Number of lines: 333,250
Size: 850 MB

The machine is a dual-core Intel with 4 GB RAM and nothing else running on it. 
I read the previous threads on read.fwf and did not see any conclusive 
statements on how to read such a file quickly. An example record and the R code 
are given below. I was hoping to purchase a better machine and do analysis with 
larger datasets, but these preliminary results do not look good.

Does anyone have any experience with large files (> 1GB) and using them with 
Revolution-R?


Thanks.

Satish

Example Code
# Widths and names of the 18 key fields
key_vec <- c(1,3,3,4,2,8,8,2,2,3,2,2,1,3,3,3,3,9)
key_names <- c("allgeo","area1","zone","dist","ccust1","whse","bindc","ccust2",
               "account","area2","ccust3","customer","allprod","cat","bu",
               "class","size","bdc")
key_info <- data.frame(key_vec,key_names)
# sas_time is not defined in this snippet; sas_time$week supplies the 209 week names
col_names <- c(key_names,sas_time$week)
# 209 weekly buckets, each 12 characters wide
num_buckets <- rep(12,209)
width_vec <- c(key_vec,num_buckets)
col_classes <- c(rep("factor",18),rep("numeric",209))
# Trial run on the first 100 records only (kept commented out)
#threewkoutstat <- read.fwf(file="3wkoutstatfcst_file02.dat",widths=width_vec,
#                           header=FALSE,colClasses=col_classes,n=100)
# Full read
threewkoutstat <- read.fwf(file="3wkoutstatfcst_file02.dat",widths=width_vec,
                           header=FALSE,colClasses=col_classes)
names(threewkoutstat) <- col_names
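
As a rough sanity check on the layout above, the widths can be used to estimate the record length, the file size, and the memory needed for the numeric columns. This is only a back-of-the-envelope sketch: it assumes a one-byte line terminator and 8 bytes per numeric value, and it ignores the factor columns and any temporary copies made while reading.

```r
# Widths copied from the example code above
key_vec <- c(1,3,3,4,2,8,8,2,2,3,2,2,1,3,3,3,3,9)
num_buckets <- rep(12, 209)
width_vec <- c(key_vec, num_buckets)
n_lines <- 333250

# Characters in one fixed-width record
record_chars <- sum(width_vec)

# Estimated file size, assuming one newline byte per record
file_bytes <- n_lines * (record_chars + 1)

# Memory for the 209 numeric columns at 8 bytes per value
numeric_bytes <- n_lines * length(num_buckets) * 8

cat("record width (chars):", record_chars, "\n")
cat("estimated file size (MB):", round(file_bytes / 1e6), "\n")
cat("numeric columns in memory (MB):", round(numeric_bytes / 1e6), "\n")
```

Under these assumptions the estimated file size comes out close to the 850 MB reported, and the numeric columns alone would take a bit over half a gigabyte once loaded.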

Example record (only one record pasted below; it is a single line in the file 
and appears wrapped here only because of the mail client)
A004001003799000049250000492599990049999A001002002015002015009        0.00      
  0.00        0.00        0.00        0.00        0.00        0.00        0.00  
      0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00
      0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.60        
0.60        0.60        0.70        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00
  0.00        0.00        0.00        0.00        0.00        0.00     
   0.00        0.00        0.00        0.00        0.00        0.00        0.00 
       0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00        0.00        0.00    
    0.00        0.00        0.00        0.00        0.00        0.00        
0.00        0.00        0.00        0.00        0.00
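
To make the problem reproducible without the original file, a small synthetic file in the same layout can be generated and timed. The sketch below is only an illustration: it reuses the 62-character key from the record above, fills every bucket with 0.00, writes 1000 identical records to a temporary file, and uses placeholder week names (wk001 to wk209) because sas_time is not shown.

```r
key_vec <- c(1,3,3,4,2,8,8,2,2,3,2,2,1,3,3,3,3,9)
key_names <- c("allgeo","area1","zone","dist","ccust1","whse","bindc","ccust2",
               "account","area2","ccust3","customer","allprod","cat","bu",
               "class","size","bdc")
width_vec <- c(key_vec, rep(12, 209))
col_classes <- c(rep("factor", 18), rep("numeric", 209))

# One synthetic record: the key from the example record, then 209 values,
# each right-justified in a 12-character field
key <- "A004001003799000049250000492599990049999A001002002015002015009"
values <- formatC(rep(0, 209), format = "f", digits = 2, width = 12)
record <- paste0(key, paste(values, collapse = ""))
stopifnot(nchar(record) == sum(width_vec))

# Write 1000 copies to a temporary file (the row count is arbitrary)
tmp <- tempfile(fileext = ".dat")
writeLines(rep(record, 1000), tmp)

# Time the same read.fwf call on the small file
timing <- system.time(
  dat <- read.fwf(file = tmp, widths = width_vec, header = FALSE,
                  colClasses = col_classes)
)

# Placeholder week names; the real ones come from sas_time$week
names(dat) <- c(key_names, sprintf("wk%03d", 1:209))

print(dim(dat))
print(timing)
unlink(tmp)
```

Increasing the number of synthetic records gives a rough idea of how the read time grows before trying the full file.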

