Re: [R] How to load a big txt file

2007-06-07 Thread ssls sddd
Dear Chung-hong Chan,

Thanks! Can you recommend a text editor for splitting? I tried UltraEdit
and TextPad but could not find a way to split files in either.

Sincerely,

Alex

On 6/6/07, Chung-hong Chan [EMAIL PROTECTED] wrote:

 An easy solution would be to split your big txt file with a text editor,

 e.g. into chunks of 5000 rows each,

 and then combine the resulting data frames into one.



 On 6/7/07, ssls sddd [EMAIL PROTECTED] wrote:
  Dear list,
 
  I need to read a big txt file (around 130 MB; 23800 rows and 49 columns)
  for downstream clustering analysis.

  I first used Tumor <- read.table("Tumor.txt", header = TRUE, sep = "\t"),
  but it took a long time and failed. However, it had no problem if I put in
  only 3 columns of data.

  Is there any way to load this big file?
 
  Thanks for any suggestions!
 
  Sincerely,
   Alex
 


Re: [R] How to load a big txt file

2007-06-07 Thread ssls sddd
Dear Michael,

It consists of 238305 rows and 50 columns, including the header row and the
row names.

Thanks!

Alex

On 6/7/07, michael watson (IAH-C) [EMAIL PROTECTED] wrote:

 Erm... Is that a typo? Are we really talking 23800 rows and 49 columns?
 Because that doesn't seem that many.



Re: [R] How to load a big txt file

2007-06-07 Thread ssls sddd
Dear Jim,

Thanks a lot! The size of the text file is 189,588,541 bytes.
It consists of 238305 rows (including the header) and
50 columns (the first column is for ID and the rest for 49 samples).

The first row looks like:

ID
AIRNS_p_Sty5_Mapping250K_Sty_A09_50156.cel
AIRNS_p_Sty5_Mapping250K_Sty_A11_50188.cel
AIRNS_p_Sty5_Mapping250K_Sty_A12_50204.cel
AIRNS_p_Sty5_Mapping250K_Sty_B09_50158.cel
AIRNS_p_Sty5_Mapping250K_Sty_C01_50032.cel
AIRNS_p_Sty5_Mapping250K_Sty_C12_50208.cel
AIRNS_p_Sty5_Mapping250K_Sty_D03_50066.cel
AIRNS_p_Sty5_Mapping250K_Sty_D08_50146.cel
AIRNS_p_Sty5_Mapping250K_Sty_F03_50070.cel
AIRNS_p_Sty5_Mapping250K_Sty_F12_50214.cel
AIRNS_p_Sty5_Mapping250K_Sty_G09_50168.cel
DOLCE_p_Sty7_Mapping250K_Sty_B04_53892.cel
DOLCE_p_Sty7_Mapping250K_Sty_B06_53924.cel
DOLCE_p_Sty7_Mapping250K_Sty_C05_53910.cel
DOLCE_p_Sty7_Mapping250K_Sty_C10_53990.cel
DOLCE_p_Sty7_Mapping250K_Sty_D05_53912.cel
DOLCE_p_Sty7_Mapping250K_Sty_E01_53850.cel
DOLCE_p_Sty7_Mapping250K_Sty_G12_54030.cel
DOLCE_p_Sty7_Mapping250K_Sty_H06_53936.cel
DOLCE_p_Sty7_Mapping250K_Sty_H08_53968.cel
DOLCE_p_Sty7_Mapping250K_Sty_H11_54016.cel
DOLCE_p_Sty7_Mapping250K_Sty_H12_54032.cel
GUSTO_p_Sty20_Mapping250K_Sty_C08_81736.cel
GUSTO_p_Sty20_Mapping250K_Sty_E03_81660.cel
GUSTO_p_Sty20_Mapping250K_Sty_H02_81650.cel
HEWED_p_250KSty_Plate_20060123_GOOD_B01_46246.cel
HEWED_p_250KSty_Plate_20060123_GOOD_C06_46328.cel
HEWED_p_250KSty_Plate_20060123_GOOD_F02_46270.cel
HEWED_p_250KSty_Plate_20060123_GOOD_G04_46304.cel
HOCUS_p_Sty4_Mapping250K_Sty_B05_55060.cel
HOCUS_p_Sty4_Mapping250K_Sty_B12_55172.cel
HOCUS_p_Sty4_Mapping250K_Sty_E05_55066.cel
SOARS_p_Sty23_Mapping250K_Sty_B07_89024.cel
SOARS_p_Sty23_Mapping250K_Sty_C01_88930.cel
SOARS_p_Sty23_Mapping250K_Sty_C11_89090.cel
SOARS_p_Sty23_Mapping250K_Sty_F07_89032.cel
SOARS_p_Sty23_Mapping250K_Sty_H08_89052.cel
SOARS_p_Sty23_Mapping250K_Sty_H10_89084.cel
VINOS_p_Sty8_Mapping250K_Sty_A04_54082.cel
VINOS_p_Sty8_Mapping250K_Sty_A07_54130.cel
VINOS_p_Sty8_Mapping250K_Sty_B08_54148.cel
VINOS_p_Sty8_Mapping250K_Sty_D01_54040.cel
VINOS_p_Sty8_Mapping250K_Sty_D05_54104.cel
VINOS_p_Sty8_Mapping250K_Sty_E04_54090.cel
VINOS_p_Sty8_Mapping250K_Sty_E12_54218.cel
VINOS_p_Sty8_Mapping250K_Sty_G01_54046.cel
VINOS_p_Sty8_Mapping250K_Sty_G12_54222.cel
VOLTS_p_Sty9_Mapping250K_Sty_G09_57916.cel
VOLTS_p_Sty9_Mapping250K_Sty_H12_57966.cel


and the second row looks like:

SNP_A-1780271  1.8564200401306  1.5095599889755  1.7315399646759
1.530769944191  1.6576000452042  1.474179983139  2.1564099788666  ...

[the full row is the SNP ID followed by 49 tab-separated numeric values; the
tab separators were lost when the row was pasted into this email, so the
remaining values ran together and are omitted here]

Thanks a lot!

Sincerely,

Alex


On 6/6/07, jim holtman [EMAIL PROTECTED] wrote:

 It would be useful if you could post the first couple of rows of the data
 so we can see what it looks like.




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem you are trying to solve?




Re: [R] How to load a big txt file

2007-06-07 Thread michael watson (IAH-C)
Erm... Is that a typo? Are we really talking 23800 rows and 49 columns?
Because that doesn't seem that many.

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of ssls sddd
Sent: 07 June 2007 10:48
To: r-help@stat.math.ethz.ch
Subject: Re: [R] How to load a big txt file

Dear Chung-hong Chan,

Thanks! Can you recommend a text editor for splitting? I tried UltraEdit
and TextPad but could not find a way to split files in either.

Sincerely,

Alex



Re: [R] How to load a big txt file

2007-06-07 Thread jim holtman
I took your data and duplicated the data line so I had 100,000 rows, and it
took 40 seconds to read in when specifying colClasses:

> system.time(x <- read.table('/tempxx.txt',
+     header = TRUE, colClasses = c('factor', rep('numeric', 49))))
   user  system elapsed
  40.98    0.46   42.39
> str(x)
'data.frame':   102272 obs. of  50 variables:
 $ ID                                               : Factor w/ 1 level "SNP_A-1780271": 1 1 1 1 1 1 1 1 1 1 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_A09_50156.cel   : num  1.86 1.86 1.86
1.86 1.86 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_A11_50188.cel   : num  1.51 1.51 1.51
1.51 1.51 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_A12_50204.cel   : num  1.73 1.73 1.73
1.73 1.73 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_B09_50158.cel   : num  1.53 1.53 1.53
1.53 1.53 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_C01_50032.cel   : num  1.66 1.66 1.66
1.66 1.66 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_C12_50208.cel   : num  1.47 1.47 1.47
1.47 1.47 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_D03_50066.cel   : num  2.16 2.16 2.16
2.16 2.16 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_D08_50146.cel   : num  1.78 1.78 1.78
1.78 1.78 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_F03_50070.cel   : num  1.60 1.60 1.60
1.60 1.60 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_F12_50214.cel   : num  2.16 2.16 2.16
2.16 2.16 ...
 $ AIRNS_p_Sty5_Mapping250K_Sty_G09_50168.cel   : num  1.98 1.98 1.98
1.98 1.98 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_B04_53892.cel   : num  2.18 2.18 2.18
2.18 2.18 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_B06_53924.cel   : num  1.88 1.88 1.88
1.88 1.88 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_C05_53910.cel   : num  2.15 2.15 2.15
2.15 2.15 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_C10_53990.cel   : num  1.53 1.53 1.53
1.53 1.53 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_D05_53912.cel   : num  1.72 1.72 1.72
1.72 1.72 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_E01_53850.cel   : num  2.23 2.23 2.23
2.23 2.23 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_G12_54030.cel   : num  1.94 1.94 1.94
1.94 1.94 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H06_53936.cel   : num  1.85 1.85 1.85
1.85 1.85 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H08_53968.cel   : num  2.16 2.16 2.16
2.16 2.16 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H11_54016.cel   : num  2.19 2.19 2.19
2.19 2.19 ...
 $ DOLCE_p_Sty7_Mapping250K_Sty_H12_54032.cel   : num  2.03 2.03 2.03
2.03 2.03 ...
 $ GUSTO_p_Sty20_Mapping250K_Sty_C08_81736.cel  : num  2.67 2.67 2.67
2.67 2.67 ...
 $ GUSTO_p_Sty20_Mapping250K_Sty_E03_81660.cel  : num  2.74 2.74 2.74
2.74 2.74 ...
 $ GUSTO_p_Sty20_Mapping250K_Sty_H02_81650.cel  : num  2.08 2.08 2.08
2.08 2.08 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_B01_46246.cel: num  3.21 3.21 3.21
3.21 3.21 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_C06_46328.cel: num  2.1 2.1 2.1 2.1
2.1 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_F02_46270.cel: num  2.15 2.15 2.15
2.15 2.15 ...
 $ HEWED_p_250KSty_Plate_20060123_GOOD_G04_46304.cel: num  3.52 3.52 3.52
3.52 3.52 ...
 $ HOCUS_p_Sty4_Mapping250K_Sty_B05_55060.cel   : num  1.37 1.37 1.37
1.37 1.37 ...
 $ HOCUS_p_Sty4_Mapping250K_Sty_B12_55172.cel   : num  1.66 1.66 1.66
1.66 1.66 ...
 $ HOCUS_p_Sty4_Mapping250K_Sty_E05_55066.cel   : num  3.16 3.16 3.16
3.16 3.16 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_B07_89024.cel  : num  2.09 2.09 2.09
2.09 2.09 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_C01_88930.cel  : num  1.87 1.87 1.87
1.87 1.87 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_C11_89090.cel  : num  1.90 1.90 1.90
1.90 1.90 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_F07_89032.cel  : num  1.81 1.81 1.81
1.81 1.81 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_H08_89052.cel  : num  1.82 1.82 1.82
1.82 1.82 ...
 $ SOARS_p_Sty23_Mapping250K_Sty_H10_89084.cel  : num  2.26 2.26 2.26
2.26 2.26 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_A04_54082.cel   : num  1.93 1.93 1.93
1.93 1.93 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_A07_54130.cel   : num  1.68 1.68 1.68
1.68 1.68 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_B08_54148.cel   : num  1.34 1.34 1.34
1.34 1.34 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_D01_54040.cel   : num  1.57 1.57 1.57
1.57 1.57 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_D05_54104.cel   : num  1.72 1.72 1.72
1.72 1.72 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_E04_54090.cel   : num  1.95 1.95 1.95
1.95 1.95 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_E12_54218.cel   : num  1.44 1.44 1.44
1.44 1.44 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_G01_54046.cel   : num  2.22 2.22 2.22
2.22 2.22 ...
 $ VINOS_p_Sty8_Mapping250K_Sty_G12_54222.cel   : num  1.76 1.76 1.76
1.76 1.76 ...
 $ VOLTS_p_Sty9_Mapping250K_Sty_G09_57916.cel   : num  2.05 2.05 2.05
2.05 2.05 ...
 $ VOLTS_p_Sty9_Mapping250K_Sty_H12_57966.cel   : num  2.64 2.64 2.64
2.64 2.64 ...
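
Applied to the actual file described above, the same approach would look
something like this (a sketch only: the file name "Tumor.txt" is assumed
from Alex's post, and nrows/comment.char are standard further speed-ups
for read.table):

## Sketch for Alex's file. colClasses avoids type-guessing; nrows = 238304
## (238305 lines minus the header) lets read.table preallocate its buffers;
## comment.char = "" turns off comment scanning.
Tumor <- read.table("Tumor.txt", header = TRUE, sep = "\t",
                    colClasses = c("factor", rep("numeric", 49)),
                    nrows = 238304, comment.char = "")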



On 6/7/07, ssls sddd [EMAIL PROTECTED] wrote:

 Dear Jim,

 Thanks a lot! The size of the text file is 189,588,541 bytes.
 It consists of 238305 rows (including the header) and
 50 columns (the first column is for ID and the rest for 49 samples).

 

Re: [R] How to load a big txt file

2007-06-07 Thread ssls sddd
Dear Jim,

It works great. I appreciate your help.

Sincerely,

Alex

On 6/7/07, jim holtman [EMAIL PROTECTED] wrote:

 I took your data and duplicated the data line so I had 100,000 rows, and it
 took 40 seconds to read in when specifying colClasses.


[R] How to load a big txt file

2007-06-06 Thread ssls sddd
Dear list,

I need to read a big txt file (around 130 MB; 23800 rows and 49 columns)
for downstream clustering analysis.

I first used Tumor <- read.table("Tumor.txt", header = TRUE, sep = "\t"),
but it took a long time and failed. However, it had no problem if I put in
only 3 columns of data.

Is there any way to load this big file?

Thanks for any suggestions!

Sincerely,
 Alex



Re: [R] How to load a big txt file

2007-06-06 Thread Chung-hong Chan
An easy solution would be to split your big txt file with a text editor,

e.g. into chunks of 5000 rows each,

and then combine the resulting data frames into one.
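
For those without an editor that splits files, the same chunking idea can
be done inside R itself; a rough sketch (the file name "Tumor.txt" and the
factor-plus-49-numeric column layout are assumed from the rest of this
thread):

## Rough sketch: read 5000 rows at a time via skip/nrows, then stack the
## chunks. File name and column types assumed from the thread.
cols  <- names(read.table("Tumor.txt", header = TRUE, sep = "\t", nrows = 1))
parts <- list()
skip  <- 1                                   # lines to skip (the header)
repeat {
  part <- tryCatch(
    read.table("Tumor.txt", header = FALSE, sep = "\t",
               skip = skip, nrows = 5000, col.names = cols,
               colClasses = c("factor", rep("numeric", 49))),
    error = function(e) NULL)                # no lines left to read
  if (is.null(part)) break
  parts[[length(parts) + 1]] <- part
  if (nrow(part) < 5000) break               # final, partial chunk
  skip <- skip + 5000
}
Tumor <- do.call(rbind, parts)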





-- 
The scientists of today think deeply instead of clearly. One must be
sane to think clearly, but one can think deeply and be quite insane.
Nikola Tesla
http://www.macgrass.com



Re: [R] How to load a big txt file

2007-06-06 Thread Charles C. Berry

Alex,

See

R Data Import/Export Version 2.5.0 (2007-04-23)

search for 'large' or 'scan'.

Usually, taking care with the arguments

nlines, what, quote, comment.char

should be enough to get scan() to cooperate.
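
For instance, a minimal sketch of such a scan() call for the layout
described in this thread (tab-separated, an ID column followed by 49
numeric columns; the file name "Tumor.txt" is assumed from the original
post):

## Sketch only: file name and column layout assumed from the thread.
## 'what' supplies one template entry per column so scan() need not guess
## types; quote and comment processing are switched off.
hdr <- scan("Tumor.txt", what = "", sep = "\t", nlines = 1, quote = "")
dat <- scan("Tumor.txt", sep = "\t", skip = 1, quote = "", comment.char = "",
            what = c(list(""), rep(list(0), 49)))
names(dat) <- hdr
## store the result as a numeric matrix (IDs as row names), as suggested below
m <- matrix(unlist(dat[-1]), ncol = 49,
            dimnames = list(dat[[1]], hdr[-1]))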

You will need around 1GB RAM to store the result, so if you are working on 
a machine with less, you will need to upgrade. Consider storing the result 
as a numeric matrix.

If any of those columns are long strings not needed in your computation, 
be sure to skip over them. Read the 'Details' of the help page for scan() 
carefully.

Chuck




Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]                  UC San Diego
http://biostat.ucsd.edu/~cberry/            La Jolla, San Diego 92093-0901



Re: [R] How to load a big txt file

2007-06-06 Thread Charles C. Berry
On Wed, 6 Jun 2007, Charles C. Berry wrote:


 Alex,

 See

   R Data Import/Export Version 2.5.0 (2007-04-23)

 search for 'large' or 'scan'.

 Usually, taking care with the arguments

   nlines, what, quote, comment.char

 should be enough to get scan() to cooperate.

 You will need around 1GB RAM to store the result, so if you are working on a

Oops. 23800*49*8 == 9329600 is more like 0.01GB, I guess.


 machine with less, you will need to upgrade. Consider storing the result as a 
 numeric matrix.

 If any of those columns are long strings not needed in your computation, be 
 sure to skip over them. Read the 'Details' of the help page for scan() 
 carefully.

 Chuck
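
A quick check of that arithmetic in R, using both the row count from the
original post and the corrected count given earlier in the thread:

## 8 bytes per double, 49 numeric columns
23800  * 49 * 8 / 2^20   # ~8.9 MB for the 23800 rows in the original post
238305 * 49 * 8 / 2^20   # ~89 MB for the corrected 238305 rows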



Charles C. Berry                            (858) 534-2098
                                            Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]                  UC San Diego
http://biostat.ucsd.edu/~cberry/            La Jolla, San Diego 92093-0901
