My current project processes an input file of 333302161 bytes. I plan to split the file into pieces of roughly equal size, with each split falling on a blank-line boundary, to improve performance.
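To make the idea concrete, here is a minimal sketch in plain Java, with no Hadoop classes involved. It picks nominal cut points at equal byte intervals and then moves each one forward to the first byte after a blank line. The class name `BlankLineSplitter` and its methods are hypothetical, and the sketch assumes Unix line endings, so a blank line shows up as two consecutive `\n` bytes. It also assumes the data fits in memory. A real InputFormat would have to do the same alignment while reading from the file system.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Hadoop: computes split start offsets
// that are roughly evenly spaced and always follow a blank line.
public class BlankLineSplitter {

    /** Returns the start offset of each split; the first is always 0. */
    public static List<Integer> splitOffsets(byte[] data, int numSplits) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        int target = Math.max(1, data.length / numSplits);
        for (int i = 1; i < numSplits; i++) {
            int aligned = nextRecordStart(data, i * target);
            int last = starts.get(starts.size() - 1);
            // Skip cut points that collapse onto the previous split or the end.
            if (aligned > last && aligned < data.length) {
                starts.add(aligned);
            }
        }
        return starts;
    }

    /** Scans forward from pos to the first byte after a "\n\n" pair. */
    static int nextRecordStart(byte[] data, int pos) {
        for (int i = Math.max(1, pos); i < data.length; i++) {
            if (data[i] == '\n' && data[i - 1] == '\n') {
                return i + 1;
            }
        }
        return data.length; // no blank line left: no further split
    }

    public static void main(String[] args) {
        String text = "a1\na2\n\nb1\nb2\n\nc1\n\nd1\nd2\n";
        byte[] data = text.getBytes(StandardCharsets.US_ASCII);
        System.out.println(splitOffsets(data, 2)); // prints [0, 14]
    }
}
```

The resulting pieces are only approximately equal, because every cut is pushed forward to the next blank line. With few blank lines in the input, some nominal cut points merge and fewer splits come back than requested.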
I found 12 classes in the 0.20.1 source code which implement InputSplit. If someone has written code similar to what I plan to do, please share some hints.

Thanks

On Fri, Jan 8, 2010 at 2:27 AM, Amogh Vasekar <am...@yahoo-inc.com> wrote:

> Hi,
> The deprecation is due to the new evolving mapreduce ( o.a.h.mapreduce )
> APIs. Old APIs are supported for available distributions. The equivalent of
> TextInputFormat is available in new API :
>
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/lib/input/TextInputFormat.html
>
> Thanks,
> Amogh
>
>
> On 1/8/10 3:47 AM, "Ted Yu" <yuzhih...@gmail.com> wrote:
>
> According to:
>
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/TextInputFormat.html#isSplitable%28org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path%29
>
> isSplitable() is deprecated.
>
> Which method should I use to replace it ?
>
> Thanks