I talked with the guy who worked on this, and he said our production issue
was probably not directly caused by getLength() returning 0.
Anyway, we are interested in fixing that; estimating the length from the
files is a good idea.
Lukas
InputSplit.getLength() and RecordReader.getProgress() are important for the
MR framework to be able to show progress etc. It would be good to return
the raw data size in getLength(), computed from the total size of the
region's store files, and to calculate progress from the amount of raw data
the scanner has seen.
Enis
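
A rough sketch of the idea above, in plain Java with no Hadoop or HBase
dependencies. The class name SplitSizeSketch and both method names are
hypothetical, made up for illustration; this is not the actual patch. It
assumes the per-region store file sizes and the number of raw bytes the
scanner has read are already available from somewhere.

```java
import java.util.List;

// Hypothetical helper, not part of HBase or Hadoop.
class SplitSizeSketch {

    // Stand-in for what InputSplit.getLength() could return:
    // the sum of the sizes (in bytes) of the region's store files.
    static long estimateLength(List<Long> storeFileSizes) {
        long total = 0L;
        for (long size : storeFileSizes) {
            total += size;
        }
        return total;
    }

    // Stand-in for what RecordReader.getProgress() could return:
    // raw bytes seen by the scanner divided by the estimated length,
    // clamped to 1.0 because the estimate may be lower than what is read.
    static float progress(long bytesSeen, long totalBytes) {
        if (totalBytes <= 0) {
            return 0.0f; // unknown length, report no progress
        }
        return Math.min(1.0f, (float) bytesSeen / (float) totalBytes);
    }
}
```

With store files of 100 and 250 bytes, estimateLength gives 350, and after
the scanner has seen 175 bytes progress gives 0.5. The clamp is there because
the file-based length is only an estimate.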