Hi, I have a requirement in Hive to remove duplicate records (they differ only in one column, i.e. a date column) and keep the record with the latest date.
Sample:

Hive Table : d2 is a higher

cno,sqno,date
100 1 1-oct-2013
101 2 1-oct-2013
100 1 2-oct-2013
102 2 2-oct-2013

Output needed:

100 1 2-oct-2013
101 2 1-oct-2013
102 2 2-oct-2013

I am using Hive 0.11.

Any suggestions please?

Regards,
Raj
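
One possible approach, as a rough sketch only: rank the rows within each (cno, sqno) pair by the parsed date and keep the top one. This assumes the windowing functions that, as far as I know, were introduced in Hive 0.11 are available, and that the date is stored as a string like 1-oct-2013. The table name my_table and the column name dt (standing in for the date column) are hypothetical, and the pattern 'd-MMM-yyyy' is an assumption about the stored format.

```sql
-- Hypothetical names: my_table is the source table, dt is the date column.
-- Assumption: dt is a string such as '1-oct-2013', so it is parsed to a
-- Unix timestamp first; comparing the raw strings would not sort by date.
SELECT cno, sqno, dt
FROM (
  SELECT cno, sqno, dt,
         -- rn = 1 marks the latest row within each (cno, sqno) pair
         row_number() OVER (PARTITION BY cno, sqno ORDER BY ts DESC) AS rn
  FROM (
    SELECT cno, sqno, dt,
           unix_timestamp(dt, 'd-MMM-yyyy') AS ts
    FROM my_table
  ) parsed
) ranked
WHERE rn = 1;
```

If two rows for the same pair carry the same date, row_number() still keeps only one of them, which seems to fit the requirement here since such rows would be exact duplicates.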