Thanks Rahul. Our ops people have implemented the config change.
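For the archives: the global change is a single property, settable in either
mapred-site.xml or hive-site.xml. A minimal sketch of the kind of entry we
applied follows; the value 50 is illustrative only, so size it to your own
datanode count:

    <!-- Raise the per-split location cap from the MapReduce default of 10.
         Set it to at least the number of datanodes in the cluster: a
         CombineFileInputFormat split merges many blocks, so its location
         list is the union of hosts across those blocks and can far exceed
         the replication factor. -->
    <property>
      <name>mapreduce.job.max.split.locations</name>
      <value>50</value> <!-- illustrative; use >= your datanode count -->
    </property>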
On Thursday, September 19, 2013, Rahul Jain wrote:

> Matt,
>
> It would be better for you to do a global config update: set
> *mapreduce.job.max.split.locations* to at least the number of datanodes
> in your cluster, either in hive-site.xml or mapred-site.xml. In either
> case, this is a sensible configuration update if you are going to use
> CombineFileInputFormat to read input data in hive.
>
> -Rahul
>
> On Thu, Sep 19, 2013 at 3:31 PM, Matt Davies <[email protected]> wrote:
>
> > What are the ramifications of setting a hard-coded value in our scripts
> > and then changing parameters which influence the input data size? I.e.,
> > I want to run across 1 day's worth of data, then a different day I want
> > to run against 30 days?
> >
> > On Thu, Sep 19, 2013 at 3:11 PM, Rahul Jain <[email protected]> wrote:
> >
> > > I am assuming you have looked at this already:
> > > https://issues.apache.org/jira/browse/MAPREDUCE-5186
> > >
> > > You do have a workaround here: increase the
> > > *mapreduce.job.max.split.locations* value in the hive configuration.
> > > Or do we need more than that here?
> > >
> > > -Rahul
> > >
> > > On Thu, Sep 19, 2013 at 11:00 AM, Murtaza Doctor <[email protected]> wrote:
> > >
> > > > It used to throw a warning in 1.0.3 and now has become an
> > > > IOException. I was more trying to figure out why it is exceeding
> > > > the limit even though the replication factor is 3. Also, Hive may
> > > > use CombineInputSplit or some version of it; are we saying it will
> > > > always exceed the limit of 10?
> > > >
> > > > On Thu, Sep 19, 2013 at 10:05 AM, Edward Capriolo <[email protected]> wrote:
> > > >
> > > > > We have this job submit property buried in hive that defaults to
> > > > > 10. We should make that configurable.
> > > > >
> > > > > On Wed, Sep 18, 2013 at 9:34 PM, Harsh J <[email protected]> wrote:
> > > > >
> > > > > > Do your input files carry a replication factor of 10+? That
> > > > > > could be one cause behind this.
> > > > > >
> > > > > > On Thu, Sep 19, 2013 at 6:20 AM, Murtaza Doctor <[email protected]> wrote:
> > > > > >
> > > > > > > Folks,
> > > > > > >
> > > > > > > Anyone run into this issue before:
> > > > > > >
> > > > > > > java.io.IOException: Max block location exceeded for split: Paths:
> > > > > > > "/foo/bar...."
> > > > > > > ....
> > > > > > > InputFormatClass: org.apache.hadoop.mapred.TextInputFormat
> > > > > > > splitsize: 15 maxsize: 10
> > > > > > >   at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeOldSplits(JobSplitWriter.java:162)
> > > > > > >   at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:87)
> > > > > > >   at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:501)
> > > > > > >   at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
> > > > > > >   at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
> > > > > > >   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
> > > > > > >   at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
> > > > > > >   at java.security.AccessController.doPrivileged(Native Method)
> > > > > > >   at javax.security.auth.Subject.doAs(Subject.java:415)
> > > > > > >   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> > > > > > >   at org.apac
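P.S. Two notes that may help others who land on this thread. First, if a
cluster-wide config push isn't practical, the same property can be raised
per session from the Hive CLI before the query runs. A sketch, with an
illustrative value and a placeholder table name:

    -- per-session override; only affects jobs submitted by this session
    set mapreduce.job.max.split.locations=50;  -- illustrative value
    select count(*) from my_table;             -- my_table is a stand-in

This also answers the 1-day vs. 30-day concern: the cap bounds the number of
distinct hosts recorded for a single split, not the input size, so a value of
at least the datanode count stays safe however many days are scanned. Second,
to rule out the replication theory, something like hadoop fs -stat %r
/foo/bar should print a path's replication factor; with the trace above
showing splitsize: 15 against a replication factor of 3, the excess locations
almost certainly come from Hive's combine input format merging blocks hosted
on many different nodes, not from over-replicated files.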
