Hi Gopal - actually no, the table is not partitioned/bucketed. Every day the whole table gets cleaned up and repopulated with the last 120 days' data...
What other properties can I try to improve the performance of the reduce steps?

Suresh V
http://www.justbirds.in

On Sat, Jan 9, 2016 at 8:52 AM, Gopal Vijayaraghavan <gop...@apache.org> wrote:
> Hi,
>
>> The job completes fine if we reduce the # of rows processed by reducing
>> the # of days' data being processed.
>
>> It just gets stuck after all maps are completed. We checked the logs and
>> it says the containers are released.
>
> Looks like you're inserting into a bucketed & partitioned table and facing
> connection timeouts due to GC pauses?
>
> By default, the optimization slows down the one-partition-at-a-time ETL, so
> it is disabled.
>
> If your data load falls into the category of >1 partition & has bucketing,
> you need to set
>
> set hive.optimize.sort.dynamic.partition=true;
>
> The largest data load done using a single SQL statement was the 100 TB ETL
> load for TPC-DS.
>
> In hive-11, people had workarounds using explicit "DISTRIBUTE BY" or "SORT
> BY", which didn't scale as well.
>
> If you have those in your query, remove them.
>
>> 2016-01-08 19:33:33,119 INFO [Socket Reader #1 for port 43451]
>> org.apache.hadoop.ipc.Server: Socket Reader #1 for port 43451:
>> readAndProcess from client 39.0.8.17 threw exception
>> [java.io.IOException: Connection reset by peer]
>
> Whether or not that fixes it, there are other low-level issues which
> trigger similar errors as you scale your cluster to 300+ nodes [1].
>
> https://github.com/t3rmin4t0r/notes/wiki/Hadoop-Tuning-notes
>
> Cheers,
> Gopal
>
> [1] <http://www.slideshare.net/Hadoop_Summit/w-1205p230-aradhakrishnan-v3/10>
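[For reference, the setting Gopal suggests applies to dynamic-partition inserts into bucketed tables. A minimal sketch of such a load, with hypothetical table and column names, might look like:]

```sql
-- Enable the sorted dynamic-partition optimization; it is off by
-- default because it slows down single-partition loads.
SET hive.optimize.sort.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Hypothetical bucketed, partitioned target table and a 120-day load.
-- Note there is no explicit DISTRIBUTE BY / SORT BY: with the setting
-- above, Hive plans the reducer-side sort itself.
INSERT OVERWRITE TABLE sales_history PARTITION (load_date)
SELECT txn_id, amount, load_date
FROM   sales_staging
WHERE  load_date >= date_sub(current_date, 120);
```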