Hi,

A large file is split into several FileSplits in FileInputFormat.java#getSplits(). FileInputFormat presents a byte-oriented view of the input file, so a whole record (for instance, a line) might be broken when several FileSplits are generated for a single file. One part of the record then ends up in one InputSplit and the rest in another, and the two InputSplits might be processed on different nodes.
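To make the problem concrete, here is a minimal sketch of byte-oriented splitting. It is not Hadoop's actual code: the class `SplitDemo` and its methods `byteSplits` and `sliceOf` are hypothetical names made up for illustration, and the fixed 16-byte split size is an assumption chosen only so that the cut lands in the middle of a line.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SplitDemo {
    // Cut the byte range [0, length) into fixed-size pieces, ignoring
    // record boundaries. Each element is {start, lengthOfPiece}.
    // (Simplified assumption: the real getSplits() derives its split
    // size from the file system and job settings.)
    static List<long[]> byteSplits(long length, long splitSize) {
        List<long[]> splits = new ArrayList<>();
        for (long start = 0; start < length; start += splitSize) {
            splits.add(new long[] {start, Math.min(splitSize, length - start)});
        }
        return splits;
    }

    // Return the raw text covered by one byte range.
    static String sliceOf(byte[] data, long[] split) {
        return new String(data, (int) split[0], (int) split[1],
                StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        byte[] data = "first line\nsecond line\nthird line\n"
                .getBytes(StandardCharsets.US_ASCII);
        for (long[] split : byteSplits(data.length, 16)) {
            // Show newlines as "\n" so each range prints on one line.
            System.out.println("start=" + split[0] + " length=" + split[1]
                    + " text=[" + sliceOf(data, split).replace("\n", "\\n") + "]");
        }
    }
}
```

With this input, the record "second line" is divided between the first and second byte ranges, and "third line" between the second and third, which is exactly the situation described above.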
How does Hadoop handle this problem?

Yu zhong
2008/07/15
