Thank you. :)
On Sat, Jan 5, 2013 at 2:37 AM, Matthew Dowle <[email protected]> wrote:
>
> Ok this is fixed and committed (786) with your example as tests.
> Thanks again.
>
>
> On 04.01.2013 09:38, Matthew Dowle wrote:
>>
>> Great, thanks for this. I reproduced and fixed and will commit later
>> tonight. It was the part of the code that loops through the header
>> row to test if it contains column names (if every field is character).
>>
>> On 03.01.2013 20:50, Akhil Behl wrote:
>>>
>>> So, here is a `head' of my dataset. Note the `,,' in the 2nd last column.
>>>
>>> 02-FEB-2009,09:55:04:962,26022009,2500,PE,36,500,44,200,11850,1100,,2865.60
>>> 02-FEB-2009,09:55:04:987,26022009,2800,PE,108.75,200,111,50,11700,1450,,2865.60
>>> 02-FEB-2009,09:55:04:939,26022009,3100,CE,31.1,3000,36.55,200,3500,5250,,2865.60
>>> 02-FEB-2009,09:55:04:989,26022009,2600,PE,52.05,500,57,400,16050,1150,,2865.60
>>> 02-FEB-2009,09:55:04:981,26022009,3000,CE,56.25,2000,67,150,21500,13750,,2865.60
>>> 02-FEB-2009,09:55:04:991,26022009,2900,CE,81,1000,100,100,18100,4550,1000,2865.60
>>> 02-FEB-2009,09:55:04:953,26022009,2800,CE,150,50,159.7,5000,13400,15500,,2865.60
>>> 02-FEB-2009,09:55:04:987,26022009,2700,PE,72.15,3000,79,50,19200,5100,,2865.60
>>> 02-FEB-2009,09:55:04:615,26022009,2450,CE,256.9,500,678,500,500,500,,2865.60
>>> 02-FEB-2009,09:55:04:894,26022009,3300,CE,6,7000,10.8,2000,7000,2550,,2865.60
>>>
>>> The documentation says that ",," should be read as "". But instead the
>>> function throws an error (one I can not understand). See here:
>>>
>>> R> library(data.table)
>>> data.table 1.8.7 For help type: help("data.table")
>>>
>>> R> tt <- fread("sample.csv", verbose=TRUE)
>>>
>>> Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.
>>> Starting format detection on line 30 (the last non blank line in the
>>> first 30)
>>> Detected sep as ',' and 13 columns
>>> Type codes: 3300320200002
>>> Found first row with 13 fields occuring on line 1 (either column names
>>> or first row of data)
>>> Error in fread("sample.csv", verbose = TRUE) : Unexpected character (
>>> 02-F) ending field 12 of line 1
>>>
>>> Using na.strings="" does not work either. But I guess that should not
>>> have made a difference anyway?
>>>
>>> Then I opened the file in GVim and converted all `,,' to `,NA,' and
>>> re-read the file. This time it works.
>>>
>>> R> tt <- fread("sample-with-NA.csv", verbose=TRUE)
>>>
>>> Detected eol as \n only (no \r afterwards), the UNIX and Mac standard.
>>> Starting format detection on line 30 (the last non blank line in the
>>> first 30)
>>> Detected sep as ',' and 13 columns
>>> Type codes: 3300320200002
>>> Found first row with 13 fields occuring on line 1 (either column names
>>> or first row of data)
>>> The first data row has some non character fields. Treating as a data
>>> row and using default column names.
>>> Count of eol after pos: 101
>>> Subtracted 1 for last eol and any trailing empty lines, leaving 100 data
>>> rows
>>> 0.000s ( 6%) Memory map (quicker if you rerun)
>>> 0.000s ( 40%) Format detection
>>> 0.000s ( 7%) Count rows (wc -l)
>>> 0.000s ( 2%) Allocation of 100x13 result (xMB) in RAM
>>> 0.000s ( 41%) Reading data
>>> 0.000s ( 0%) Bumping column type midread and coercing data already
>>> read
>>> 0.000s ( 3%) Changing na.strings to NA
>>> 0.001s Total
>>>
>>> I've attached a 100 row sample.csv and a sample-with-NA.csv here for
>>> you to replicate the issue.
>>>
>>> Maybe, it is just that I am missing something. Can you explain?
>>>
>>> Thanks a lot!
>>>
>>> --
>>> ASB.
>
>
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
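
For anyone still on a data.table build from before the fix mentioned above (commit 786), the GVim workaround from the original report (turning `,,' into `,NA,') can also be done from within R. This is only a minimal sketch, assuming plain comma-separated lines with no quoted fields; fill_empty_fields is a hypothetical helper written for illustration, not a data.table function.

```r
# Hypothetical helper (not part of data.table): put NA into empty fields
# that sit between two commas, mirroring the manual `,,' -> `,NA,' edit.
# Assumes no quoted fields; leading or trailing empty fields are left alone.
fill_empty_fields <- function(lines) {
  # A comma immediately followed by another comma marks an empty field.
  # The lookahead does not consume the second comma, so a run such as
  # ",,," is handled field by field.
  gsub(",(?=,)", ",NA", lines, perl = TRUE)
}

# Two rows taken from the sample in the report: one with an empty
# 12th field, one without.
raw <- c(
  "02-FEB-2009,09:55:04:962,26022009,2500,PE,36,500,44,200,11850,1100,,2865.60",
  "02-FEB-2009,09:55:04:991,26022009,2900,CE,81,1000,100,100,18100,4550,1000,2865.60"
)
patched <- fill_empty_fields(raw)

# Write the patched lines to a temporary file and read that instead.
tmp <- tempfile(fileext = ".csv")
writeLines(patched, tmp)
if (requireNamespace("data.table", quietly = TRUE)) {
  tt <- data.table::fread(tmp)
}
```

The lookahead form is used rather than a literal replacement of ",," because a plain gsub(",,", ",NA,", ...) skips every second empty field when several occur in a row.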
