You need to handle the header one way or another. Note, however, that none of the UCI data sets is large enough to be split in a map-reduce program, so the header will only ever show up as the first record of a single input. If you are producing your own data, I would recommend using something like Avro, which is self-describing (like CSV with a header row) but much more flexible.
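For what it is worth, here is a rough sketch of one way to do the header handling. It assumes records reach the mapper as (byte offset, line) pairs, which is how Hadoop's TextInputFormat keys its lines, so the header is the record at offset 0 of the file. The function name map_records and the sample data are made up for illustration; this is not Mahout or Hadoop API code.

```python
import csv


def map_records(records, has_header=True):
    """Yield parsed CSV rows from (byte_offset, line) pairs, skipping the header.

    Assumption: byte_offset is the line's position within the whole file,
    so only the very first line of the file has offset 0, no matter which
    split it lands in.
    """
    for byte_offset, line in records:
        if has_header and byte_offset == 0:
            continue  # header line: drop it instead of emitting it as data
        # Parse a single line with the csv module so quoted fields survive.
        yield next(csv.reader([line]))


# Hypothetical file contents, shown as the two splits a large file might get.
split_one = [(0, "sepal_length,species"), (21, "5.1,setosa")]
split_two = [(32, "6.3,virginica")]

rows = list(map_records(split_one)) + list(map_records(split_two))
```

With a small file there is only one split, so the check fires exactly once. The same offset test keeps working if the file ever does get split, because only the first split contains offset 0.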
On Wed, Jul 6, 2011 at 3:23 AM, Xiaobo Gu <[email protected]> wrote:
> Thanks for your reply first, so we must write specific code to
> handle the CSV header if we have it in the file, right?
