Hi,

As far as I know:

If a split ends in the middle of a record, the node processing that split reads the rest of the record from the remote node hosting it. The node processing the next split ignores the beginning of its split and starts after the first record separator (a newline in your example).

Naama

2008/7/15 caoyuzhong <[EMAIL PROTECTED]>:
> Hi,
>
> A large file will be split into several FileSplits in
> FileInputFormat.java#getSplits().
> We know FileInputFormat presents a byte-oriented view of the input file, so
> a whole record (for instance, a line) might be broken while generating
> several FileSplits for a single file. One part of the record will then
> be in one InputSplit and the other part in another InputSplit, and the two
> InputSplits might be processed on different nodes.
>
> I want to know how Hadoop handles this problem.
>
> Yu zhong
> 2008/07/15
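To make the boundary rule from the reply concrete, here is a minimal in-memory sketch. It is not Hadoop's actual record reader code; the class SplitDemo and the method readSplit are hypothetical names used only for illustration, and a plain byte array stands in for the file. The assumption is that a line belongs to the split in which its first byte lies: a reader whose split does not start at offset 0 backs up one byte and discards everything through the next newline, and every reader finishes the last line it started even if that line runs past the end of its split.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class, not part of Hadoop.
class SplitDemo {
    // Returns the lines owned by the byte range [start, end) of data.
    static List<String> readSplit(byte[] data, int start, int end) {
        int pos = start;
        if (start != 0) {
            // Back up one byte and discard through the next newline.
            // If data[start - 1] is '\n', the split begins on a line
            // boundary and nothing from this split is thrown away.
            pos = start - 1;
            while (pos < data.length && data[pos] != '\n') {
                pos++;
            }
            pos++; // step past the newline itself
        }
        List<String> lines = new ArrayList<>();
        // Start a new line only while still inside the split, but read
        // each started line to its end, even past the split boundary.
        while (pos < end && pos < data.length) {
            int eol = pos;
            while (eol < data.length && data[eol] != '\n') {
                eol++;
            }
            lines.add(new String(data, pos, eol - pos, StandardCharsets.UTF_8));
            pos = eol + 1;
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] data = "alpha\nbravo\ncharlie\n".getBytes(StandardCharsets.UTF_8);
        // The boundary at byte 8 falls in the middle of "bravo".
        System.out.println(readSplit(data, 0, 8));           // first split
        System.out.println(readSplit(data, 8, data.length)); // second split
    }
}
```

With the boundary inside "bravo", the first split returns "alpha" and the whole of "bravo", and the second split returns only "charlie", so each record is processed exactly once. In a real cluster the bytes past the split boundary may live in a block on another node, which is the remote read mentioned in the reply.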
