I’m not aware of an option to do what you request in the external table 
definition but you might want to that using a view.

P.s.
I seems to me that defining the partition column as a string would be more user 
friendly than integer, e.g. –

select * from threads_test where yyyymmdd like ‘2016%’ – year 2016;
select * from threads_test where yyyymmdd like ‘201603%’ –- March 2016;
select * from threads_test where yyyymmdd like ‘______01’ -- first of every 
month;





$ hdfs dfs -ls -R /tmp/threads_test
drwxr-xr-x   - cloudera supergroup          0 2016-06-04 10:45 
/tmp/threads_test/20160604
-rw-r--r--   1 cloudera supergroup        136 2016-06-04 10:45 
/tmp/threads_test/20160604/data.txt

$ hdfs dfs -cat /tmp/threads_test/20160604/data.txt
{"url":"www.blablabla.com","pageType":"pg1","addDate":"2016-05-17T02:10:44.527","postDate":"2016-05-16T02:08:55","postText":"YadaYada"}

----------------------------------------------------------------------------------------------------


hive> add jar /usr/lib/hive-hcatalog/share/hcatalog/hive-hcatalog-core.jar;

hive>
create external table threads_test
(
    url         string
   ,pagetype    string
   ,adddate     string
   ,postdate    string
   ,posttext    string
)
    partitioned by (yyyymmdd string)
    row format serde 'org.apache.hive.hcatalog.data.JsonSerDe'
    location '/tmp/threads_test'
;

hive> alter table threads_test add partition (yyyymmdd=20160604) location 
'/tmp/threads_test/20160604';

hive> select * from threads_test;

www.blablabla.com    pg1    2016-05-17T02:10:44.527    2016-05-16T02:08:55  
YadaYada      20160604

hive>
create view threads_test_v
as
select      url
           ,pagetype
           ,cast (concat_ws(' ',substr (adddate ,1,10),substr (adddate ,12)) as 
timestamp)  as adddate
           ,cast (concat_ws(' ',substr (postdate,1,10),substr (postdate,12)) as 
timestamp)  as postdate
           ,posttext

from        threads_test
;

hive> select * from threads_test_v;

www.blablabla.com    pg1    2016-05-17 02:10:44.527    2016-05-16 02:08:55  
YadaYada


From: Igor Kravzov [mailto:igork.ine...@gmail.com]
Sent: Saturday, June 04, 2016 8:13 PM
To: user@hive.apache.org
Subject: Convert date in string format to timestamp in table definition

Hi,

I have 2 dates in Json file defined like this
"addDate": "2016-05-17T02:10:44.527",
  "postDate": "2016-05-16T02:08:55",

Right now I define external table based on this file like this:
CREATE external TABLE threads_test
(url string,
 pagetype string,
 adddate string,
 postdate string,
 posttext string)
partitioned by (yyyymmdd int)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
location 'my location';

is it possible to define these 2 dates as timestamp?
Do I need to change date format in the file? is it possible to specify date 
format in table definition?
Or I better off with string?

Thanks in advance.

Reply via email to