It sounds like a JobTracker setting, so a restart looks to be required. You can verify this in pseudo-distributed mode by setting the property to a very low value, restarting the JT, and seeing whether you get the exception that prints the new value.
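For example, you could temporarily drop the limit in mapred-site.xml on the JobTracker node to something tiny (the value below is just for this test):

<property>
  <!-- deliberately tiny, only to check the setting is picked up -->
  <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
  <value>1000</value>
</property>

After restarting the JobTracker, a job with more than a handful of splits should fail with "Split metadata size exceeded 1000". If the exception still reports 10000000, the JT isn't reading your mapred-site.xml at all, which would point at the config location rather than the restart.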
Sent from my iPhone

> On 14 jul 2014, at 16:03, Jan Warchoł <[email protected]> wrote:
>
> Hello,
>
> I recently got a "Split metadata size exceeded 10000000" error when running
> Cascading jobs with very big joins. I found that I should change the
> mapreduce.jobtracker.split.metainfo.maxsize property in the Hadoop
> configuration by adding this to the mapred-site.xml file:
>
> <property>
>   <!-- allow more space for split metadata (default is 10000000) -->
>   <name>mapreduce.jobtracker.split.metainfo.maxsize</name>
>   <value>1000000000</value>
> </property>
>
> but it didn't seem to have any effect - I'm probably doing something wrong.
>
> Where should I add this change so that it has the desired effect? Do I
> understand correctly that a jobtracker restart is required after making the
> change? The cluster I'm working on has Hadoop 1.0.4.
>
> thanks for any help,
> --
> Jan Warchoł
> Software Engineer
> -----------------------------------------
> M: +48 509 078 203
> E: [email protected]
> -----------------------------------------
