> I have a pretty big Hive Query. I'm joining over 3 Hive-Tables which
>have thousands of lines each. I'm grouping this join by several columns.

Hive-on-Tez shouldn't have any issue even with billions of lines on a JOIN.

> 0 failed, info=[Container container_1434357133795_0008_01_000039 finished
>while trying to launch. Diagnostics: [Container failed. Container expired
>since it was unused]], TaskAttempt 1 failed,

Looks like your NodeManager is not actually spinning up a container that
was allocated (i.e., the allocation succeeded, but the task spin-up failed).

Which YARN scheduler are you running (fair or capacity), and do you have any
idea what the NodeManager logs say about trying to spin up this container?
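
If you are not sure which scheduler is configured, one way to check is to
look up yarn.resourcemanager.scheduler.class in yarn-site.xml. A minimal
Python sketch follows; the helper name scheduler_class and the sample XML
are illustrative assumptions, not taken from your cluster. If the property
is absent, the cluster is using its distribution default.

```python
import xml.etree.ElementTree as ET

# Real YARN property name; the value is the scheduler's Java class.
SCHEDULER_KEY = "yarn.resourcemanager.scheduler.class"


def scheduler_class(yarn_site_xml):
    """Return the configured scheduler class, or None if the property is unset.

    Hypothetical helper: takes the *contents* of yarn-site.xml as a string.
    """
    root = ET.fromstring(yarn_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == SCHEDULER_KEY:
            return prop.findtext("value", default="").strip()
    return None


# Illustrative sample only; read your own yarn-site.xml instead.
SAMPLE = """
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
</configuration>
"""

if __name__ == "__main__":
    print(scheduler_class(SAMPLE))
```

On a real ResourceManager host you would pass in the text of your own
yarn-site.xml; its location depends on the distribution.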

If I'm not wrong, you also need to check whether the YARN user has a ulimit
set for the total number of processes on the NM nodes.
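
As a rough way to check that limit, the sketch below reads the
max-user-processes limit (RLIMIT_NPROC, what "ulimit -u" reports) through
Python's standard resource module. It only reflects the limits of the
process that runs it, so it is assumed to be run as the YARN user on a
NodeManager host; the function name process_limit is hypothetical.

```python
import resource


def process_limit():
    """Return (soft, hard) max-user-processes limits as printable strings.

    Reflects the limits of the *current* process, so run it as the same
    user that launches containers on the NodeManager host.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)

    def show(value):
        # RLIM_INFINITY means no limit is enforced.
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return show(soft), show(hard)


if __name__ == "__main__":
    soft, hard = process_limit()
    print("max user processes: soft=%s hard=%s" % (soft, hard))
```

A low soft limit here is one possible reason for container launches
failing on a busy node; the NodeManager logs should confirm or rule it out.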


 
Cheers,
Gopal

