HDFS does not operate with records in mind: blocks are cut at fixed byte offsets, so a block boundary may well fall in the middle of a record, and line-oriented readers simply read past the boundary to reassemble the full line. There shouldn't be much of a problem with having a few MBs per record in text files, provided 'a few MBs' remains a (very) small fraction of the file's blocksize value.
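As a rough sketch of that behavior (plain Python, not Hadoop code; the 64-byte block size stands in for a realistic dfs.blocksize of 64/128 MB):

```python
# Sketch: HDFS splits a file purely by byte offset, ignoring record
# boundaries, yet a line-oriented reader still recovers whole records.
BLOCK_SIZE = 64  # stand-in for dfs.blocksize (64/128 MB in real clusters)

# Build a small "file" of <key>\t<data>\n records of varying width.
records = [f"k{i}\t{'x' * (10 * i)}\n" for i in range(1, 6)]
data = "".join(records).encode("utf-8")

# Cut into fixed-size blocks, exactly as HDFS would by byte offset.
blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
straddlers = sum(1 for b in blocks[:-1] if not b.endswith(b"\n"))
print(f"{len(blocks)} blocks, {straddlers} end mid-record")

# A line-based reader (this is what Hadoop's TextInputFormat does) skips
# a split's partial first line and reads past the split's end to the next
# newline, so every record is seen exactly once despite the straddling.
recovered = data.decode("utf-8").splitlines()
assert len(recovered) == len(records)
```

The takeaway: records straddling block boundaries are expected and harmless for line-delimited text; only record sizes approaching the block size become inefficient.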
On Sat, Mar 5, 2011 at 1:00 AM, Kelly Burkhart <[email protected]> wrote:
> Hello, are there restrictions to the size or "width" of text files
> placed in HDFS? I have a file structure like this:
>
> <text key><tab><text data><nl>
>
> It would be helpful if in some circumstances I could make the text data
> really large (large meaning many KB to one/few MB). I may have some
> rows that have a very small payload and some with a very large
> payload. Is this OK? When HDFS is splitting the file into chunks to
> spread across the cluster will it ever split a record? Total file size
> may be on the order of 20-30GB.
>
> Thanks,
>
> -K

--
Harsh J
www.harshj.com
