[jira] [Commented] (HIVE-10488) cast DATE as TIMESTAMP returns incorrect values

2015-04-30 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14521089#comment-14521089
 ] 

Alexander Pivovarov commented on HIVE-10488:


looks like Orc table contains int values instead of date values
e.g. 1996-01-01=1994

I got very similar results as in the description when I removed quotation marks 
wrapping date values:
{code}
--  1996-01-01=1994
hive select cast(1996-01-01 as timestamp);
OK
1969-12-31 16:00:01.994

--  2000-01-01=1998
hive select cast(2000-01-01 as timestamp);
OK
1969-12-31 16:00:01.998

--  2000-12-31=1957
hive select cast(2000-12-31 as timestamp);
OK
1969-12-31 16:00:01.957
{code}
my TimeZone is US/Pacific - this is why time is -8hr from 1970   (1969 4pm)

The description shows 1969 7pm (-5 hours offset from 1970) - So, their timezone 
is US/Eastern

 cast DATE as TIMESTAMP returns incorrect values
 ---

 Key: HIVE-10488
 URL: https://issues.apache.org/jira/browse/HIVE-10488
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1
Reporter: N Campbell
Assignee: Chaoyu Tang

 same data in textfile works
 same data loaded into an ORC table does not
 connection property of tez/mr makes no difference.
 select rnum, cdt, cast (cdt as timestamp) from tdt
 0 null  null
 1 1996-01-01  1969-12-31 19:00:09.496
 2 2000-01-01  1969-12-31 19:00:10.957
 3 2000-12-31  1969-12-31 19:00:11.322
 vs
 0 null  null
 1 1996-01-01  1996-01-01 00:00:00.0
 2 2000-01-01  2000-01-01 00:00:00.0
 3 2000-12-31  2000-12-31 00:00:00.0
 create table  if not exists TDT ( RNUM int , CDT date   )
  STORED AS orc  ;
 insert overwrite table TDT select * from  text.TDT;
 0|\N
 1|1996-01-01
 2|2000-01-01
 3|2000-12-31



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10488) cast DATE as TIMESTAMP returns incorrect values

2015-04-29 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14519997#comment-14519997
 ] 

Alexander Pivovarov commented on HIVE-10488:


Got it! Thank you.

BTW, I remember it was one issue with date to timestamp conversion for negative 
unix time HIVE-10178. But it is fixed already and has nothing to do with Orc.
{code}
select cast(cast('1966-01-01 00:00:01' as timestamp) as date);
1966-02-02
{code}

 cast DATE as TIMESTAMP returns incorrect values
 ---

 Key: HIVE-10488
 URL: https://issues.apache.org/jira/browse/HIVE-10488
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1
Reporter: N Campbell
Assignee: Chaoyu Tang

 same data in textfile works
 same data loaded into an ORC table does not
 connection property of tez/mr makes no difference.
 select rnum, cdt, cast (cdt as timestamp) from tdt
 0 null  null
 1 1996-01-01  1969-12-31 19:00:09.496
 2 2000-01-01  1969-12-31 19:00:10.957
 3 2000-12-31  1969-12-31 19:00:11.322
 vs
 0 null  null
 1 1996-01-01  1996-01-01 00:00:00.0
 2 2000-01-01  2000-01-01 00:00:00.0
 3 2000-12-31  2000-12-31 00:00:00.0
 create table  if not exists TDT ( RNUM int , CDT date   )
  STORED AS orc  ;
 insert overwrite table TDT select * from  text.TDT;
 0|\N
 1|1996-01-01
 2|2000-01-01
 3|2000-12-31



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10488) cast DATE as TIMESTAMP returns incorrect values

2015-04-29 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14519846#comment-14519846
 ] 

Chaoyu Tang commented on HIVE-10488:


[~the6campbells] I was also not able to reproduce the issue in Hive 1.2. Here 
are my steps:
{code}
create table testcastts (key int, datevalue date);
insert into testcastts select 0, null from src limit 1;
insert into testcastts select 1, date '1996-01-01' from src limit 1;
insert into testcastts select 2, date '2000-01-01' from src limit 1;
insert into testcastts select 3, date '2000-12-31' from src limit 1;
---
select key, datevalue, cast(datevalue as timestamp) from testcastts;
0   NULLNULL
1   1996-01-01  1996-01-01 00:00:00
2   2000-01-01  2000-01-01 00:00:00
3   2000-12-31  2000-12-31 00:00:00

---
create table if not exists testcastorcts (key int, datevalue date) stored as 
orc;
insert overwrite table testcastorcts select * from testcastts;
select key, datevalue, cast(datevalue as timestamp) from testcastorcts;
0   NULLNULL
1   1996-01-01  1996-01-01 00:00:00
2   2000-01-01  2000-01-01 00:00:00
3   2000-12-31  2000-12-31 00:00:00
{code}
Do you see any difference between above my test case and yours? Otherwise, I 
will resolve this JIRA as Not Reproducible. Thanks


 cast DATE as TIMESTAMP returns incorrect values
 ---

 Key: HIVE-10488
 URL: https://issues.apache.org/jira/browse/HIVE-10488
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1
Reporter: N Campbell
Assignee: Chaoyu Tang

 same data in textfile works
 same data loaded into an ORC table does not
 connection property of tez/mr makes no difference.
 select rnum, cdt, cast (cdt as timestamp) from tdt
 0 null  null
 1 1996-01-01  1969-12-31 19:00:09.496
 2 2000-01-01  1969-12-31 19:00:10.957
 3 2000-12-31  1969-12-31 19:00:11.322
 vs
 0 null  null
 1 1996-01-01  1996-01-01 00:00:00.0
 2 2000-01-01  2000-01-01 00:00:00.0
 3 2000-12-31  2000-12-31 00:00:00.0
 create table  if not exists TDT ( RNUM int , CDT date   )
  STORED AS orc  ;
 insert overwrite table TDT select * from  text.TDT;
 0|\N
 1|1996-01-01
 2|2000-01-01
 3|2000-12-31



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10488) cast DATE as TIMESTAMP returns incorrect values

2015-04-29 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14519865#comment-14519865
 ] 

Alexander Pivovarov commented on HIVE-10488:


you do not have to put from src limit 1 in select statement in hive-1.2

 cast DATE as TIMESTAMP returns incorrect values
 ---

 Key: HIVE-10488
 URL: https://issues.apache.org/jira/browse/HIVE-10488
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1
Reporter: N Campbell
Assignee: Chaoyu Tang

 same data in textfile works
 same data loaded into an ORC table does not
 connection property of tez/mr makes no difference.
 select rnum, cdt, cast (cdt as timestamp) from tdt
 0 null  null
 1 1996-01-01  1969-12-31 19:00:09.496
 2 2000-01-01  1969-12-31 19:00:10.957
 3 2000-12-31  1969-12-31 19:00:11.322
 vs
 0 null  null
 1 1996-01-01  1996-01-01 00:00:00.0
 2 2000-01-01  2000-01-01 00:00:00.0
 3 2000-12-31  2000-12-31 00:00:00.0
 create table  if not exists TDT ( RNUM int , CDT date   )
  STORED AS orc  ;
 insert overwrite table TDT select * from  text.TDT;
 0|\N
 1|1996-01-01
 2|2000-01-01
 3|2000-12-31



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10488) cast DATE as TIMESTAMP returns incorrect values

2015-04-29 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14519863#comment-14519863
 ] 

Alexander Pivovarov commented on HIVE-10488:


Chaoyu, testcastts should be Orc table. Is Orc default table format in your 
hive config?
Cast is working fine for textfile table for N.

 cast DATE as TIMESTAMP returns incorrect values
 ---

 Key: HIVE-10488
 URL: https://issues.apache.org/jira/browse/HIVE-10488
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1
Reporter: N Campbell
Assignee: Chaoyu Tang

 same data in textfile works
 same data loaded into an ORC table does not
 connection property of tez/mr makes no difference.
 select rnum, cdt, cast (cdt as timestamp) from tdt
 0 null  null
 1 1996-01-01  1969-12-31 19:00:09.496
 2 2000-01-01  1969-12-31 19:00:10.957
 3 2000-12-31  1969-12-31 19:00:11.322
 vs
 0 null  null
 1 1996-01-01  1996-01-01 00:00:00.0
 2 2000-01-01  2000-01-01 00:00:00.0
 3 2000-12-31  2000-12-31 00:00:00.0
 create table  if not exists TDT ( RNUM int , CDT date   )
  STORED AS orc  ;
 insert overwrite table TDT select * from  text.TDT;
 0|\N
 1|1996-01-01
 2|2000-01-01
 3|2000-12-31



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10488) cast DATE as TIMESTAMP returns incorrect values

2015-04-29 Thread Chaoyu Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14519921#comment-14519921
 ] 

Chaoyu Tang commented on HIVE-10488:


Here are the desc formatted from two tables (testcastts and testcastorcts) I 
tested:
{code}
# Detailed Table Information 
Database:   jira 
Owner:  ctang
CreateTime: Wed Apr 29 12:03:08 EDT 2015 
LastAccessTime: UNKNOWN  
Protect Mode:   None 
Retention:  0
Location:   file:/user/hive/warehouse/apache/jira.db/testcastts 
 
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE   true
numFiles4   
numRows 4   
rawDataSize 40  
totalSize   44  
transient_lastDdlTime   1430323769  
 
# Storage Information
SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
 
InputFormat:org.apache.hadoop.mapred.TextInputFormat 
OutputFormat:   
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
Compressed: No   
Num Buckets:-1   
Bucket Columns: []   
Sort Columns:   []   
Storage Desc Params: 
serialization.format1  

===
key int 
datevalue   date
 
# Detailed Table Information 
Database:   jira 
Owner:  ctang
CreateTime: Wed Apr 29 12:12:42 EDT 2015 
LastAccessTime: UNKNOWN  
Protect Mode:   None 
Retention:  0
Location:   file:/user/hive/warehouse/apache/jira.db/testcastorcts  
 
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE   true
numFiles1   
numRows 4   
rawDataSize 184 
totalSize   304 
transient_lastDdlTime   1430324019  
 
# Storage Information
SerDe Library:  org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat  
OutputFormat:   org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
 
Compressed: No   
Num Buckets:-1   
Bucket Columns: []   
Sort Columns:   []   
Storage Desc Params: 
serialization.format1  
{code}
BTW, I also queried with vectorized execution (set 
hive.vectorized.execution.enabled=true) for the ORC table testcastorcts, it 
also worked fine.

 cast DATE as TIMESTAMP returns incorrect values
 ---

 Key: HIVE-10488
 URL: https://issues.apache.org/jira/browse/HIVE-10488
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1
Reporter: N Campbell
Assignee: Chaoyu Tang

 same data in textfile works
 same data loaded into an ORC table does not
 connection property of tez/mr makes no difference.
 select rnum, cdt, cast (cdt as timestamp) from tdt
 0 null  null
 1 1996-01-01  1969-12-31 19:00:09.496
 2 2000-01-01  1969-12-31 19:00:10.957
 3 2000-12-31  1969-12-31 19:00:11.322
 vs
 0 null  null
 1 1996-01-01  1996-01-01 00:00:00.0
 2 2000-01-01  2000-01-01 00:00:00.0
 3 2000-12-31  2000-12-31 00:00:00.0
 create table  if not exists TDT ( RNUM int , CDT date   )
  STORED AS orc  ;
 insert overwrite table TDT select * from  text.TDT;
 0|\N
 1|1996-01-01
 2|2000-01-01
 3|2000-12-31



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10488) cast DATE as TIMESTAMP returns incorrect values

2015-04-27 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14513641#comment-14513641
 ] 

Alexander Pivovarov commented on HIVE-10488:


I can not reproduce this issue in hive 1.2.0
I created 2 tables 
t3 - textfile
t3o - Orc

{code}
hive desc formatted t3;
OK
# col_name  data_type   comment 
 
rnumint 
cdt date
 
# Detailed Table Information 
Database:   default  
Owner:  apivovarov   
CreateTime: Sun Apr 26 23:58:29 PDT 2015 
LastAccessTime: UNKNOWN  
Protect Mode:   None 
Retention:  0
Location:   hdfs://localhost/apps/apivovarov/warehouse/t3
Table Type: MANAGED_TABLE
Table Parameters:
transient_lastDdlTime   1430117909  
 
# Storage Information
SerDe Library:  org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe  
 
InputFormat:org.apache.hadoop.mapred.TextInputFormat 
OutputFormat:   
org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat   
Compressed: No   
Num Buckets:-1   
Bucket Columns: []   
Sort Columns:   []   
Storage Desc Params: 
serialization.format1   
Time taken: 0.098 seconds, Fetched: 27 row(s)
{code}
{code}
hive desc formatted t3o;
OK
# col_name  data_type   comment 
 
rnumint 
cdt date
 
# Detailed Table Information 
Database:   default  
Owner:  apivovarov   
CreateTime: Mon Apr 27 00:00:11 PDT 2015 
LastAccessTime: UNKNOWN  
Protect Mode:   None 
Retention:  0
Location:   hdfs://localhost/apps/apivovarov/warehouse/t3o   
Table Type: MANAGED_TABLE
Table Parameters:
COLUMN_STATS_ACCURATE   true
numFiles1   
numRows 4   
rawDataSize 184 
totalSize   302 
transient_lastDdlTime   1430118011  
 
# Storage Information
SerDe Library:  org.apache.hadoop.hive.ql.io.orc.OrcSerde
InputFormat:org.apache.hadoop.hive.ql.io.orc.OrcInputFormat  
OutputFormat:   org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
 
Compressed: No   
Num Buckets:-1   
Bucket Columns: []   
Sort Columns:   []   
Storage Desc Params: 
serialization.format1   
Time taken: 0.096 seconds, Fetched: 32 row(s)
{code}
{code}
hive select * from t3;
OK
0   NULL
1   1996-01-01
2   2000-01-01
3   2000-12-31
Time taken: 0.086 seconds, Fetched: 4 row(s)
{code}
{code}
hive select * from t3o;
OK
0   NULL
1   1996-01-01
2   2000-01-01
3   2000-12-31
Time taken: 0.086 seconds, Fetched: 4 row(s)
{code}
{code}
hive select rnum, cdt, cast (cdt as timestamp) from t3;
OK
0   NULLNULL
1   1996-01-01  1996-01-01 00:00:00
2   2000-01-01  2000-01-01 00:00:00
3   2000-12-31  2000-12-31 00:00:00
Time taken: 0.091 seconds, Fetched: 4 row(s)
{code}
{code}
hive select rnum, cdt, cast (cdt as timestamp) from t3o;
OK
0   NULLNULL
1   1996-01-01  1996-01-01 00:00:00
2   2000-01-01  2000-01-01 00:00:00
3   2000-12-31  2000-12-31 00:00:00
Time taken: 0.108 seconds, Fetched: 4 row(s)
{code}

MR
{code}
hive select t3.rnum, t3.cdt, cast (t3.cdt as timestamp) cts, t3o.cdt cdt2, 
cast(t3o.cdt as timestamp) cts2 from t3 join t3o on (t3.rnum = t3o.rnum);
Query ID = apivovarov_20150427000533_2734a9a1-63eb-45d4-83a4-4129ae3e7afc
Total jobs = 1
15/04/27 00:05:36 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
Execution log at: 
/tmp/apivovarov/apivovarov_20150427000533_2734a9a1-63eb-45d4-83a4-4129ae3e7afc.log
2015-04-27 00:05:37 Starting to launch local task to process map join;  

[jira] [Commented] (HIVE-10488) cast DATE as TIMESTAMP returns incorrect values

2015-04-27 Thread N Campbell (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514263#comment-14514263
 ] 

N Campbell commented on HIVE-10488:
---

Cast was using a column of type date and not literals or value expression.
Variance was the format the data was in.



 cast DATE as TIMESTAMP returns incorrect values
 ---

 Key: HIVE-10488
 URL: https://issues.apache.org/jira/browse/HIVE-10488
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1
Reporter: N Campbell
Assignee: Chaoyu Tang

 same data in textfile works
 same data loaded into an ORC table does not
 connection property of tez/mr makes no difference.
 select rnum, cdt, cast (cdt as timestamp) from tdt
 0 null  null
 1 1996-01-01  1969-12-31 19:00:09.496
 2 2000-01-01  1969-12-31 19:00:10.957
 3 2000-12-31  1969-12-31 19:00:11.322
 vs
 0 null  null
 1 1996-01-01  1996-01-01 00:00:00.0
 2 2000-01-01  2000-01-01 00:00:00.0
 3 2000-12-31  2000-12-31 00:00:00.0
 create table  if not exists TDT ( RNUM int , CDT date   )
  STORED AS orc  ;
 insert overwrite table TDT select * from  text.TDT;
 0|\N
 1|1996-01-01
 2|2000-01-01
 3|2000-12-31



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (HIVE-10488) cast DATE as TIMESTAMP returns incorrect values

2015-04-27 Thread Alexander Pivovarov (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14514489#comment-14514489
 ] 

Alexander Pivovarov commented on HIVE-10488:


Put 2015-01-02 to quotes in the example 3
Otherwise you cast integer 2012 to timestamp

 cast DATE as TIMESTAMP returns incorrect values
 ---

 Key: HIVE-10488
 URL: https://issues.apache.org/jira/browse/HIVE-10488
 Project: Hive
  Issue Type: Bug
  Components: SQL
Affects Versions: 0.13.1
Reporter: N Campbell
Assignee: Chaoyu Tang

 same data in textfile works
 same data loaded into an ORC table does not
 connection property of tez/mr makes no difference.
 select rnum, cdt, cast (cdt as timestamp) from tdt
 0 null  null
 1 1996-01-01  1969-12-31 19:00:09.496
 2 2000-01-01  1969-12-31 19:00:10.957
 3 2000-12-31  1969-12-31 19:00:11.322
 vs
 0 null  null
 1 1996-01-01  1996-01-01 00:00:00.0
 2 2000-01-01  2000-01-01 00:00:00.0
 3 2000-12-31  2000-12-31 00:00:00.0
 create table  if not exists TDT ( RNUM int , CDT date   )
  STORED AS orc  ;
 insert overwrite table TDT select * from  text.TDT;
 0|\N
 1|1996-01-01
 2|2000-01-01
 3|2000-12-31



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)