Hi,

I have a requirement in Hive to remove duplicate records (they differ only in
one column, i.e. a date column) and keep the record with the latest date.

Sample:
Hive table:
 d2 is a higher 
cno,sqno,date

100 1 1-oct-2013
101 2 1-oct-2013
100 1 2-oct-2013
102 2 2-oct-2013


Output needed:

100 1 2-oct-2013
101 2 1-oct-2013
102 2 2-oct-2013

I am using Hive 0.11

Any suggestions, please?

Regards,
Raj
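
One possible approach, as an untested sketch rather than a definitive answer:
Hive 0.11 introduced windowing functions, so ROW_NUMBER() can rank the rows
within each (cno, sqno) group by date and keep only the newest one. The table
name my_table is hypothetical, and the query assumes the date column is stored
as a STRING in the form shown in the sample (e.g. 1-oct-2013), which is why it
is parsed with unix_timestamp() and an assumed pattern of 'd-MMM-yyyy' instead
of being compared as text.

```sql
-- Assumptions: the table is called my_table (hypothetical name) and
-- `date` is a STRING such as '1-oct-2013'.
SELECT cno, sqno, `date`
FROM (
  SELECT cno,
         sqno,
         `date`,
         -- Number the rows of each (cno, sqno) group, newest date first.
         ROW_NUMBER() OVER (
           PARTITION BY cno, sqno
           ORDER BY unix_timestamp(`date`, 'd-MMM-yyyy') DESC
         ) AS rn
  FROM my_table
) ranked
-- Keep only the newest row of each group.
WHERE rn = 1;
```

If the column already holds a sortable value (for example yyyy-MM-dd strings),
a simpler alternative would be SELECT cno, sqno, MAX(`date`) FROM my_table
GROUP BY cno, sqno. That works here only because cno and sqno are the sole
other columns.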
