Hi Ashish, thanks for the help!

1) Regarding the group-by-week, it was entirely my fault... I wrote
the function incorrectly. To get grouping by week one needs to have
FIELD: date_add('2007-12-31', (datediff(date_field, '2007-12-31') / 7) *
7)
GROUP BY: datediff(date_field, '2007-12-31') / 7

This then works, and you get both the grouping and the first day of the
week as the field value.
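
The week bucketing above could be written out as a full query roughly
like this. It is only a sketch: the table name `events` and the count
column are hypothetical, and the floor()/CAST is added on the
assumption that `/` yields a fractional result in your Hive build, so
the week index is made an explicit integer. 2007-12-31 is a Monday, so
each bucket starts on a Monday.

```sql
-- Sketch only: `events` is a hypothetical table with a date string
-- column `date_field`.
SELECT
  -- First day (Monday) of the week that date_field falls in.
  date_add('2007-12-31',
           CAST(floor(datediff(date_field, '2007-12-31') / 7) AS INT) * 7)
    AS week_start,
  count(*) AS n
FROM events
-- Group by the same expression so that each week is one row.
GROUP BY
  date_add('2007-12-31',
           CAST(floor(datediff(date_field, '2007-12-31') / 7) AS INT) * 7);
```

Because floor() rounds toward negative infinity, dates before
2007-12-31 should also land in the correct week.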

2) However this is not so simple for the month:

> For getting numeric sorting you can do an order by CAST(month(pdate)
> AS INT). That will probably work for you?

This won't work, since pdate by itself is a date string, not a number.
Even if it were, the month-based dates would look like 2008-4-20,
which sorts after 2008-10-20 lexically.
So the question is: how does one get zero padding in Hive?
I have this expression:
concat(concat(concat(CAST(year(date_field) as STRING), '-'),
CAST(month(date_field) as STRING)), '-1')

which produces dates like:
2008-4-1
but I want zero-padding, which means
2008-04-1

How does one do that with hive?

Is there a function that would take separate year, month and day values
and return a properly formatted date?
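
One possible approach, assuming the lpad() string function is available
in the Hive build in use (it may not be in older releases), is to
left-pad the month to two characters with '0'. The table name `events`
is hypothetical, and this sketch also pads the day to '01'.

```sql
-- Sketch only: assumes lpad(str, len, pad) exists in this Hive version
-- and that concat() accepts more than two arguments.
SELECT
  concat(CAST(year(date_field) AS STRING),
         '-',
         lpad(CAST(month(date_field) AS STRING), 2, '0'),  -- 4 becomes 04
         '-01') AS month_start
FROM events;
```

With the month zero-padded, the resulting strings sort the same way
lexically and chronologically.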

[btw, in my experience support for handling dates in Hive is extremely
weak, and code becomes heavily convoluted to do it properly]


-- 
Andraz Tori, CTO
Zemanta Ltd, New York, London, Ljubljana
www.zemanta.com
mail: [email protected]
tel: +386 41 515 767
twitter: andraz, skype: minmax_test

