Hi,

I am wondering whether it is possible to use fread() to read a file with:

1) Multiple header lines, and
2) Multiple whitespace characters separating fields

A sample of the input file is as follows:

-------------
Garbage header information that I need to skip when reading...
Number of lines here are variable.
Serial_Number   PHIv      Lu/W
(-)             (lm)      (lm/W)
ABCDEFG         27.0264   103.58
HIJKLMNO        33.9143   91.03
Some footer information that spans multiple lines
-------------

To handle the multiple header lines, I would have to read the file using fread() first, then reprocess the file with a similar algorithm to identify the actual header -- i.e. the line one above what fread() would identify as the header -- and finally throw away the column names fread() created and replace them with the actual ones I find. However, this seems highly inefficient, since I would be replicating in R what fread() has already done, not to mention that I do not quite know how to do that.

As for handling the multiple (and variable) spaces as the separator, I do not see fread() being able to handle this either. read.table(), however, does with its default sep="" value. Of course, read.table() does not handle the garbage headers and footers that fread() so beautifully avoids with its autostart algorithm.

Any suggestions as to how I could do this easily? I have lots of these files to read, so manual editing is not desirable. If there is a hack I can do with fread(), that would be ideal.

Thanks a lot for your help.

Regards,
Harish
_______________________________________________
datatable-help mailing list
[email protected]
https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help
