So, what I gather from this is that isSplittable() only matters for stream-compressed files. Is that right?
--
Thanks & Regards,
Sugandha Naolekar

On Wed, Feb 26, 2014 at 2:06 PM, Jeff Zhang <[email protected]> wrote:

> Hi Sugandha,
>
> Take a gz file as an example. It is not splittable because of the
> compression algorithm it uses. There is no guarantee that one record is
> located in one block; if a record spans 2 blocks, your program will crash
> since you cannot get the whole record.
>
> On Wed, Feb 26, 2014 at 1:24 PM, Sugandha Naolekar <[email protected]> wrote:
>
>> Hello,
>>
>> If a single file of size 129 MB is split into two HDFS blocks because
>> the max block size is 128 MB, each of the blocks is read
>> depending on the InputFormat it supports. Thus, what is the significance of
>> the isSplittable() method then?
>>
>> If it is set to false, will the entire block be considered a single input
>> split? How will TextInputFormat react to it?
>>
>> --
>> Thanks & Regards,
>> Sugandha Naolekar
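
To make the 129 MB / 128 MB case from this thread concrete, here is a minimal sketch in plain Java. It does not use the Hadoop API. In Hadoop itself the hook is spelled `isSplitable` (one "t") and lives on `FileInputFormat`; `TextInputFormat` returns true for uncompressed files and for codecs that implement `SplittableCompressionCodec` (bzip2, for example), and false for gzip. Everything below is an assumption made for illustration: the class `SplitSketch`, the helper `computeSplits`, and the extension-based `isSplitable` stand in for Hadoop's codec lookup, and the split logic ignores Hadoop's min/max split size settings and its slop factor.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitSketch {
    static final long MB = 1024L * 1024L;

    // Hypothetical stand-in for Hadoop's codec check: only ".gz" is treated
    // as non-splittable here; plain text and ".bz2" count as splittable.
    static boolean isSplitable(String fileName) {
        return !fileName.endsWith(".gz");
    }

    // Simplified split computation. Each entry is {offset, length} in bytes.
    // A non-splittable file always becomes one split covering the whole file,
    // no matter how many HDFS blocks it occupies.
    static List<long[]> computeSplits(long fileLength, long blockSize, boolean splitable) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<long[]> splits = new ArrayList<>();
        if (fileLength <= 0) {
            return splits; // empty file: nothing to split in this sketch
        }
        if (!splitable) {
            splits.add(new long[] {0L, fileLength});
            return splits;
        }
        // Splittable: one split per block-sized range; the last may be shorter.
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            long length = Math.min(blockSize, fileLength - offset);
            splits.add(new long[] {offset, length});
        }
        return splits;
    }

    public static void main(String[] args) {
        long fileLength = 129 * MB; // file size from the question
        long blockSize = 128 * MB;  // HDFS block size from the question
        for (String name : new String[] {"data.txt", "data.txt.gz"}) {
            List<long[]> splits = computeSplits(fileLength, blockSize, isSplitable(name));
            System.out.println(name + ": " + splits.size() + " split(s)");
            for (long[] s : splits) {
                System.out.println("  offset=" + s[0] + " length=" + s[1]);
            }
        }
    }
}
```

In this model the uncompressed file yields two splits (128 MB and 1 MB), while the gzip file yields a single split of 129 MB, so one mapper reads the whole file across both blocks. For splittable text input, Hadoop's line reader copes with a record that straddles a split boundary by reading past the end of its split to finish the last line and by skipping the partial first line in every split that does not start at offset 0.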
