like YYYY-MM-DD.
>>For example, for "2-oct-2013" it will be 2013-10-02.
>>
>>
>>Best Regards,
>>Nishant Kelkar
>>
>>
>>
>>
>>
>>On Wed, Sep 10, 2014 at 11:48 AM, Raj Hadoop wrote:
>>
>> The
>>
>> SORT_ARRAY(COLLECT_SET(date))[0] AS latest_date
>>
>> is returning the lowest date. I need the largest date.
>>
>>
>>
>>
>> On Wed, 9/10/14, Raj Hadoop wrote:
>>
>> Subjec
ed, Sep 10, 2014 at 11:48 AM, Raj Hadoop wrote:
>
> The
>
> SORT_ARRAY(COLLECT_SET(date))[0] AS latest_date
>
> is returning the lowest date. I need the largest date.
>
>
>
> --------
> On Wed, 9/10/14, Raj Hadoop wrote:
>
---
>On Wed, 9/10/14, Raj Hadoop wrote:
>
> Subject: Re: Remove duplicate records in Hive
> To: user@hive.apache.org
> Date: Wednesday, September 10, 2014, 2:41 PM
>
>
> Thanks. I will try it.
> ----
0] AS latest_date
>
> is returning the lowest date. I need the largest date.
>
>
>
>
> On Wed, 9/10/14, Raj Hadoop wrote:
>
> Subject: Re: Remove duplicate records in Hive
> To: user@hive.apache.org
> Date: Wednesda
The
SORT_ARRAY(COLLECT_SET(date))[0] AS latest_date
is returning the lowest date. I need the largest date.
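A possible fix, sketched under some assumptions: SORT_ARRAY sorts in ascending order, so index 0 is the smallest value. Either take the last element of the sorted array or use MAX directly. The table name my_table is hypothetical (the thread only says "table"), and the date column is assumed to be a string that already sorts chronologically.

```sql
-- Option 1: MAX picks the largest value per (cno, sqno) group.
-- my_table is a hypothetical name; `date` is quoted because it is a keyword.
SELECT cno, sqno, MAX(`date`) AS latest_date
FROM my_table
GROUP BY cno, sqno;

-- Option 2: keep the sorted-array approach, but read the last element
-- (index SIZE - 1) instead of the first.
SELECT cno, sqno, dates[SIZE(dates) - 1] AS latest_date
FROM (
  SELECT cno, sqno, SORT_ARRAY(COLLECT_SET(`date`)) AS dates
  FROM my_table
  GROUP BY cno, sqno
) t;
```

Both options compare the values as plain strings, so they only return the most recent date if the stored format sorts chronologically.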
On Wed, 9/10/14, Raj Hadoop wrote:
Subject: Re: Remove duplicate records in Hive
To: user@hive.apache.org
Date: Wednesday, September 10
¬¬'
-Original Message-
From: Raj Hadoop [mailto:hadoop...@yahoo.com]
Sent: Wednesday, September 10, 2014 15:42
To: user@hive.apache.org
Subject: Re: Remove duplicate records in Hive
Thanks. I will try it.
On Wed, 9/10/14, Ni
Thanks. I will try it.
On Wed, 9/10/14, Nishant Kelkar wrote:
Subject: Re: Remove duplicate records in Hive
To: user@hive.apache.org, hadoop...@yahoo.com
Date: Wednesday, September 10, 2014, 1:59 PM
Hi Raj,
You can do something along these lines:
SELECT cno, sqno, SORT_ARRAY(COLLECT_SET(date))[0] AS latest_date FROM
table GROUP BY cno, sqno;
However, you have to make sure your date format is such that sorting it
gives you the most recent date. The best way to do that is to have it in
format
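The date-format caveat above can be sketched as follows: convert each value to a sortable string before aggregating. This is only an illustration. The table name my_table is hypothetical, and the pattern 'd-MMM-yyyy' is an assumption based on sample values such as "2-oct-2013". How lowercase month names are parsed may depend on the Hive version, so check it against real data first.

```sql
-- Parse values like '2-oct-2013' and re-emit them as '2013-10-02',
-- then take the largest normalized date per (cno, sqno) group.
-- my_table and the 'd-MMM-yyyy' pattern are assumptions, not from the thread.
SELECT cno,
       sqno,
       MAX(FROM_UNIXTIME(UNIX_TIMESTAMP(`date`, 'd-MMM-yyyy'), 'yyyy-MM-dd')) AS latest_date
FROM my_table
GROUP BY cno, sqno;
```

Once normalized this way, string order matches chronological order, so MAX returns the latest date.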
Whoops, thought this was someone in my office, so obviously you can’t come see
me :)
--
Kevin Weiler
IT
IMC Financial Markets | 233 S. Wacker Drive, Suite 4300 | Chicago, IL 60606 |
http://imc-chicago.com/
Phone: +1 312-204-7439 | Fax: +1 312-244-3301 | E-Mail:
kevin.wei...@imc-chicago.com
If you can just query the table for your results, you can do a SELECT DISTINCT
instead of just a SELECT. If you give me a bit more information about where the
duplicate data is coming from, I can provide a bit more detail. You can come
see me on the end of desk.
--
Kevin Weiler
IT
IMC Financial
Hi,
I have a requirement in Hive to remove duplicate records (they differ only by
one column, i.e. a date column) and keep the record with the latest date.
Sample :
Hive Table :
d2 is a higher
cno,sqno,date
100 1 1-oct-2013
101 2 1-oct-2013
100 1 2-oct-2013
102 2 2-oct-2013
Output needed:
100 1 2-o