Re: [datatable-help] Fwd: fread: Handling NAs with ",," not working?

Matthew Dowle Fri, 04 Jan 2013 13:07:56 -0800


Ok this is fixed and committed (786) with your example as tests.
Thanks again.


On 04.01.2013 09:38, Matthew Dowle wrote:

Great, thanks for this. I reproduced and fixed and will commit later
tonight.  It was the part of the code that loops through the header
row to test if it contains column names (if every field is character).
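
The header test described above can be illustrated with a rough R sketch. This is not fread's actual C code; `looks_like_header` is a hypothetical helper, and skipping empty fields is an assumption made for illustration, not a description of the committed fix.

```r
# Hypothetical helper: TRUE when no non-empty field on the line parses as a number,
# i.e. the line plausibly holds column names rather than data.
looks_like_header <- function(line, sep = ",") {
  fields <- strsplit(line, sep, fixed = TRUE)[[1]]
  # Assumption: an empty field (",,") carries no type information, so skip it
  non_empty <- fields[nzchar(fields)]
  all(is.na(suppressWarnings(as.numeric(non_empty))))
}

header_line <- "date,time,expiry,strike,type"
data_line   <- "02-FEB-2009,09:55:04:962,26022009,2500,PE,36,500,44,200,11850,1100,,2865.60"

looks_like_header(header_line)  # TRUE: every field is character
looks_like_header(data_line)    # FALSE: several fields are numeric
```

The column names in `header_line` are invented for the example; the sample file itself has no header row.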


On 03.01.2013 20:50, Akhil Behl wrote:

So, here is a `head' of my dataset. Note the `,,' in the 2nd last column.



02-FEB-2009,09:55:04:962,26022009,2500,PE,36,500,44,200,11850,1100,,2865.60
02-FEB-2009,09:55:04:987,26022009,2800,PE,108.75,200,111,50,11700,1450,,2865.60
02-FEB-2009,09:55:04:939,26022009,3100,CE,31.1,3000,36.55,200,3500,5250,,2865.60
02-FEB-2009,09:55:04:989,26022009,2600,PE,52.05,500,57,400,16050,1150,,2865.60
02-FEB-2009,09:55:04:981,26022009,3000,CE,56.25,2000,67,150,21500,13750,,2865.60
02-FEB-2009,09:55:04:991,26022009,2900,CE,81,1000,100,100,18100,4550,1000,2865.60
02-FEB-2009,09:55:04:953,26022009,2800,CE,150,50,159.7,5000,13400,15500,,2865.60
02-FEB-2009,09:55:04:987,26022009,2700,PE,72.15,3000,79,50,19200,5100,,2865.60
02-FEB-2009,09:55:04:615,26022009,2450,CE,256.9,500,678,500,500,500,,2865.60
02-FEB-2009,09:55:04:894,26022009,3300,CE,6,7000,10.8,2000,7000,2550,,2865.60

The documentation says that ",," should be read as "". But instead the
function throws an error (one I can not understand). See here:

R> library(data.table)
data.table 1.8.7  For help type: help("data.table")

R> tt <- fread("sample.csv", verbose=TRUE)

Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.
Starting format detection on line 30 (the last non blank line in the
first 30)
Detected sep as ',' and 13 columns
Type codes: 3300320200002
Found first row with 13 fields occuring on line 1 (either column names
or first row of data)
Error in fread("sample.csv", verbose = TRUE) : Unexpected character (
02-F) ending field 12 of line 1

Using na.strings="" does not work either. But I guess that should not
have made a difference anyway?
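
For comparison, here is a small sketch of the result one would expect for the empty field, using base R's `read.csv` rather than `fread` so that it does not depend on any particular data.table version. The two rows are copied from the sample above; the `V1`...`V13` column names are simply what `read.csv` assigns when `header = FALSE`.

```r
# Two rows from the sample: field 12 is empty in the first row, 1000 in the second.
csv_lines <- c(
  "02-FEB-2009,09:55:04:962,26022009,2500,PE,36,500,44,200,11850,1100,,2865.60",
  "02-FEB-2009,09:55:04:991,26022009,2900,CE,81,1000,100,100,18100,4550,1000,2865.60"
)

# header = FALSE because the file has no row of column names.
df <- read.csv(text = csv_lines, header = FALSE, stringsAsFactors = FALSE)

ncol(df)         # 13 columns, matching the separator detection above
df$V12           # blank numeric field is read as NA
is.na(df$V12)
```

In base R, blank fields in numeric columns are treated as missing regardless of `na.strings`, which is the behaviour the question expects from ",," as well.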

Then I opened the file in GVim and converted all `,,' to `,NA,' and
re-read the file. This time it works.
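
The same `,,' to `,NA,' substitution can be sketched in R instead of an editor. `fill_empty_fields` is a hypothetical helper name; it only covers empty fields between two commas, not ones at the very start or end of a line.

```r
# Hypothetical helper: insert "NA" into every empty field that sits between two commas.
fill_empty_fields <- function(lines) {
  # The lookahead (?=,) leaves the second comma unconsumed,
  # so runs such as ",,," are filled completely in one pass.
  gsub(",(?=,)", ",NA", lines, perl = TRUE)
}

fill_empty_fields("11850,1100,,2865.60")   # "11850,1100,NA,2865.60"
fill_empty_fields("a,,,b")                 # "a,NA,NA,b"
```

For a whole file, the helper could be applied to the result of `readLines("sample.csv")` and written back out with `writeLines()` before calling `fread` on the new file.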

R> tt <- fread("sample-with-NA.csv", verbose=TRUE)

Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.
Starting format detection on line 30 (the last non blank line in the
first 30)
Detected sep as ',' and 13 columns
Type codes: 3300320200002
Found first row with 13 fields occuring on line 1 (either column names
or first row of data)
The first data row has some non character fields. Treating as a data
row and using default column names.
Count of eol after pos: 101
Subtracted 1 for last eol and any trailing empty lines, leaving 100 data rows
   0.000s (  6%) Memory map (quicker if you rerun)
   0.000s ( 40%) Format detection
   0.000s (  7%) Count rows (wc -l)
   0.000s (  2%) Allocation of 100x13 result (xMB) in RAM
   0.000s ( 41%) Reading data
   0.000s (  0%) Bumping column type midread and coercing data already read
   0.000s (  3%) Changing na.strings to NA
   0.001s        Total

I've attached a 100 row sample.csv and a sample-with-NA.csv here for
you to replicate the issue.

Maybe, it is just that I am missing something. Can you explain?

Thanks a lot!

--
ASB.


_______________________________________________
datatable-help mailing list
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help