> I have a pretty big Hive Query. I'm joining over 3 Hive-Tables which
> have thousands of lines each. I'm grouping this join by several columns.

Hive-on-Tez shouldn't have any issue even with billions of lines on a JOIN.

> 0 failed, info=[Container container_1434357133795_0008_01_000039 finished
> while trying to launch. Diagnostics: [Container failed. Container expired
> since it was unused]], TaskAttempt 1 failed,

Looks like your node manager is not actually spinning up a container that was allocated (i.e. the allocation succeeded, but the task spin-up failed).

Which YARN scheduler are you running (fair/capacity?), and do you have any idea what the NodeManager logs say about trying to spin up this container?

If I'm not wrong, you also need to check whether the YARN user has a ulimit set for the total number of processes on the NM nodes.

Cheers,
Gopal
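
For the ulimit check mentioned above, here is a minimal sketch using Python's standard `resource` module. It only reports the per-user process limit (`RLIMIT_NPROC`) of whichever user runs it, so it assumes you run it as the account that launches containers on each NM node. The helper name `describe_nproc_limit` is made up for illustration.

```python
import resource


def describe_nproc_limit(soft, hard):
    """Format an RLIMIT_NPROC (soft, hard) pair for display."""
    def fmt(value):
        # RLIM_INFINITY is how the OS reports "no limit".
        return "unlimited" if value == resource.RLIM_INFINITY else str(value)

    return f"max user processes: soft={fmt(soft)} hard={fmt(hard)}"


if __name__ == "__main__":
    # Reads the limit of the *current* user and process, so run this
    # as the YARN user (assumed here to be a dedicated service account).
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    print(describe_nproc_limit(soft, hard))
```

Note that a NodeManager that is already running keeps the limits it was started with, which may differ from what a fresh shell reports. On Linux, `/proc/<pid>/limits` for the NodeManager process shows the values actually in effect.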