Re: optimize hive query to move a subset of data from one partition table to another table

devjyoti patra Mon, 12 Feb 2018 09:36:16 -0800

Can you try running your query with a static literal for the date filter
(join_date >= SOME 2 MONTH OLD DATE)? I cannot think of any reason why this
query should create more than 60 tasks.
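
A minimal sketch of the static-literal idea, not tested against the real tables. One thing worth noting: date_sub(join_date, 2) subtracts two days from each row's own join_date, so the filter as quoted below holds for every non-null row and cannot restrict the scan. The Python here computes the cutoff up front and builds the query text with literals only. It assumes the partition columns are named year and month and hold integers (the thread only says the table is partitioned on year month day), and it reads "last 2 months" as the two previous calendar months plus the current one. month_window and build_query are hypothetical helper names.

```python
from datetime import date


def month_window(today: date, months_back: int = 2):
    """Return (year, month) pairs from `months_back` months ago through today's month."""
    last = today.year * 12 + (today.month - 1)  # month index counted from year 0
    return [(m // 12, m % 12 + 1) for m in range(last - months_back, last + 1)]


def build_query(today: date, months_back: int = 2) -> str:
    """Build a CTAS statement whose filter contains only static literals."""
    window = month_window(today, months_back)
    first_year, first_month = window[0]
    # Static cutoff: first day of the earliest month in the window.
    cutoff = date(first_year, first_month, 1).isoformat()
    # Equality predicates on the (assumed) partition columns year and month.
    partitions = " OR ".join(f"(year = {y} AND month = {m})" for y, m in window)
    return (
        "CREATE TABLE emp AS SELECT * FROM emp_full "
        f"WHERE ({partitions}) AND join_date >= '{cutoff}'"
    )


if __name__ == "__main__":
    print(build_query(date.today()))
```

The resulting string could then be passed to spark.sql(...). Filtering on the partition columns directly is what should let the engine skip the older partitions, although the task count also depends on how many files sit under the remaining partitions.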





On 12 Feb 2018 6:26 am, "amit kumar singh" <amitiem...@gmail.com> wrote:

Hi

create table emp as select * from emp_full where join_date
>=date_sub(join_date,2)

I am trying to select from one table and insert into another table.

I need a way to select the last 2 months of data every time.

The table is partitioned on year, month, day.

On Sun, Feb 11, 2018 at 4:30 PM, Richard Qiao <richardqiao2...@gmail.com>
wrote:

> Would you mind sharing your code with us so we can analyze it?
>
> > On Feb 10, 2018, at 10:18 AM, amit kumar singh <amitiem...@gmail.com>
> > wrote:
> >
> > Hi Team,
> >
> > We have a Hive external table that holds 50 TB of data, partitioned on
> > year, month, day.
> >
> > I want to move the last 2 months of data into another table.
> >
> > When I try to do this through Spark, more than 120k tasks are created.
> >
> > What is the best way to do this?
> >
> > Thanks,
> > Rohit
>
>
