Please don't use attachments. They should be stripped by the Apache mailer. There are a bunch of mail archiver sites which don't save attachments.
Lance

On Sun, Dec 26, 2010 at 8:20 AM, Harsh J <[email protected]> wrote:
> Hi,
>
> On Sun, Dec 26, 2010 at 6:29 PM, Black, Michael (IS)
> <[email protected]> wrote:
>> I assume there's a way to make a specific # of splits and add each document
>> to the separate splits...but I'll be darned if I can find the docs or an
>> example to show this.
>
> Would CombineFileInputFormat and CombineFileSplit be what you're looking for?
>
> Doc links:
> http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/lib/CombineFileInputFormat.html
> &
> http://hadoop.apache.org/common/docs/r0.20.2/api/org/apache/hadoop/mapred/lib/CombineFileSplit.html
>
>> As I said I'm using hadoop-0.20.2 which I know makes a difference as so many
>> things get deprecated on each release. Old references don't seem to work.
>
> The API marked deprecated in 0.20.{0,1,2} has been un-deprecated in
> the 0.21.0 release and is also considered as the "stable" API. You
> can continue using it, as it is still supported.
>
> (Maybe 0.20.3 will have them un-deprecated too, I'm not sure what's
> the status on that, although doing so would surely help avoid beginner
> confusion.)
>
> --
> Harsh J
> www.harshj.com
>

--
Lance Norskog
[email protected]
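
For the question quoted above (a fixed number of splits, with each document assigned to one of them), here is a minimal plain-Java sketch of the grouping idea only. It does not use the Hadoop API: the class FixedSplitSketch and the method assignToSplits are hypothetical names made up for illustration, and the round-robin policy is an assumption, not how CombineFileInputFormat itself packs files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of Hadoop.
class FixedSplitSketch {

    // Assigns each document name to one of exactly numSplits groups,
    // cycling through the groups in order (round-robin).
    static List<List<String>> assignToSplits(List<String> docs, int numSplits) {
        if (numSplits <= 0) {
            throw new IllegalArgumentException("numSplits must be positive");
        }
        List<List<String>> splits = new ArrayList<List<String>>();
        for (int i = 0; i < numSplits; i++) {
            splits.add(new ArrayList<String>());
        }
        for (int i = 0; i < docs.size(); i++) {
            splits.get(i % numSplits).add(docs.get(i));
        }
        return splits;
    }

    public static void main(String[] args) {
        // Made-up document names, used only to show the grouping.
        List<String> docs = Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5");
        List<List<String>> splits = assignToSplits(docs, 2);
        for (int i = 0; i < splits.size(); i++) {
            System.out.println("split " + i + ": " + splits.get(i));
        }
    }
}
```

In a real job the same grouping would have to live inside an InputFormat's split computation, for example a subclass of the CombineFileInputFormat class linked in the quoted reply; the details of that wiring are not shown here.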
