We are using the default replication factor of 3, and we never override it when new files are put on HDFS. When more data is involved, the job fails with a larger split size.
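In case it helps with checking that, here is a minimal sketch using the HDFS FileSystem API to print the replication factor of an input file (the path and class name are hypothetical; substitute one of the job's actual input paths):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class CheckReplication {
        public static void main(String[] args) throws Exception {
            // Hypothetical path; substitute one of the job's actual input files.
            Path input = new Path("/user/example/input.txt");
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);
            FileStatus status = fs.getFileStatus(input);
            // A value above mapreduce.job.max.split.locations (default 10)
            // would match the failure Harsh suggests below.
            System.out.println("replication = " + status.getReplication());
        }
    }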
On Wed, Sep 18, 2013 at 6:34 PM, Harsh J <[email protected]> wrote:
> Do your input files carry a replication factor of 10+? That could be
> one cause behind this.
>
> On Thu, Sep 19, 2013 at 6:20 AM, Murtaza Doctor <[email protected]> wrote:
> > Folks,
> >
> > Anyone run into this issue before:
> > java.io.IOException: Max block location exceeded for split: Paths:
> > "/foo/bar...."
> > ....
> > InputFormatClass: org.apache.hadoop.mapred.TextInputFormat
> > splitsize: 15 maxsize: 10
> >     at org.apache.hadoop.mapreduce.split.JobSplitWriter.writeOldSplits(JobSplitWriter.java:162)
> >     at org.apache.hadoop.mapreduce.split.JobSplitWriter.createSplitFiles(JobSplitWriter.java:87)
> >     at org.apache.hadoop.mapreduce.JobSubmitter.writeOldSplits(JobSubmitter.java:501)
> >     at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:471)
> >     at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:366)
> >     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1269)
> >     at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1266)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:415)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> >     at org.apache.hadoop.mapreduce.Job.submit(Job.java:1266)
> >     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:606)
> >     at org.apache.hadoop.mapred.JobClient$1.run(JobClient.java:601)
> >     at java.security.AccessController.doPrivileged(Native Method)
> >     at javax.security.auth.Subject.doAs(Subject.java:415)
> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> >     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:601)
> >     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:586)
> >     at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:447)
> >
> > When we set the property to something higher, as suggested, like:
> > mapreduce.job.max.split.locations = more than what it failed on
> > then the job runs successfully.
> >
> > I am trying to dig up additional documentation on this, since the
> > default seems to be 10; not sure how that limit was set.
> > Additionally, what is the recommended value and what factors does it
> > depend on?
> >
> > We are running YARN; the actual query is Hive on CDH 4.3, with Hive
> > version 0.10.
> >
> > Any pointers in this direction will be helpful.
> >
> > Regards,
> > md
>
> --
> Harsh J
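For reference, the workaround quoted above can also be applied programmatically at job submission time. A minimal sketch, assuming the standard MapReduce Job API; the value 15 simply matches the "splitsize: 15" reported in the error, and the right number depends on how many block locations your splits actually carry:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class RaiseSplitLocationLimit {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Raise the limit above the split's reported block location count.
            // 15 matches the "splitsize: 15" in the error; adjust as needed.
            conf.setInt("mapreduce.job.max.split.locations", 15);
            Job job = Job.getInstance(conf, "example-job");
            // ... configure input/output formats and paths, then submit as usual.
        }
    }

In a Hive session the equivalent should be to run
SET mapreduce.job.max.split.locations=15;
before the query, so the property is carried into the submitted MapReduce job.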
