From: Sathi Chowdhury
Date: Thursday, 5 September 2019 at 8:10 PM
To: Himali Patel, "user@spark.apache.org"
Subject: Re: Tune Hive query launched through Spark YARN job.
What I can immediately think of is: since you are using IN in the WHERE clause
for a series of timestamps, you could
consider breaking them up. For each epoch timestamp, you can load your results
into an intermediate staging table and then do a final aggregate from that
table, keeping the GROUP BY. A sketch of this follows.
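A minimal sketch of that staged approach in Spark SQL (Scala), assuming Hive
support is enabled and a decomposable aggregate like SUM. The names here are
hypothetical placeholders (an events source table with epoch_ts, key, and
value columns, and a pre-created events_stage staging table with key and
partial_total columns); adapt them to your actual schema:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("staged-aggregate")
      .enableHiveSupport()
      .getOrCreate()

    // Placeholder values; in practice this is the list of epoch
    // timestamps you currently pass to the IN (...) clause.
    val epochTimestamps = Seq(1567641600L, 1567645200L, 1567648800L)

    // One pass per timestamp: each pass scans and shuffles a much
    // smaller slice of the data and writes a partial aggregate.
    epochTimestamps.foreach { ts =>
      spark.sql(
        s"""INSERT INTO events_stage
           |SELECT key, SUM(value) AS partial_total
           |FROM events
           |WHERE epoch_ts = $ts
           |GROUP BY key""".stripMargin)
    }

    // Final aggregate combines the per-timestamp partials from the
    // staging table, keeping the same GROUP BY.
    spark.sql(
      """SELECT key, SUM(partial_total) AS total
        |FROM events_stage
        |GROUP BY key""".stripMargin)
      .show()

The idea is to trade one very large shuffle for several small per-timestamp
shuffles plus a cheap final aggregate over already-reduced data; whether it
helps in your case depends on how much each GROUP BY pass shrinks the slice.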
Hello all,
We have one use case where we are aggregating billions of rows, and it does a
huge shuffle.
Example:
As per the ‘Job’ tab on the YARN UI, when the input size is around 350 GB, the
shuffle size is more than 3 TB. This pushes non-DFS usage beyond the warning
limit and thus affects the entire cluster.
It seems we need