[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values
[ https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176800#comment-17176800 ]

Renukaprasad C commented on HIVE-23927:
---------------------------------------

Thanks [~jcamachorodriguez] & [~pgaref]. We will follow the approach already used for the other integer datatype conversions and, as [~pgaref] suggested, make the unit configurable (as is done in the longToTimestamp method) in *PrimitiveObjectInspectorUtils.getTimestamp(Object, PrimitiveObjectInspector, boolean)*.

> Cast to Timestamp generates different output for Integer & Float values
> -----------------------------------------------------------------------
>
>                 Key: HIVE-23927
>                 URL: https://issues.apache.org/jira/browse/HIVE-23927
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Renukaprasad C
>            Priority: Major
>
> A Double input value is treated as seconds and converted to milliseconds
> internally, whereas an Integer input value is treated as milliseconds, so
> the two casts produce different output.
> org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorUtils.getTimestamp(Object,
> PrimitiveObjectInspector, boolean) handles integral and decimal values
> differently, which causes the issue.
>
> 0: jdbc:hive2://localhost:1> select cast(1.204135216E9 as timestamp)
> Double2TimeStamp, cast(1204135216 as timestamp) Int2TimeStamp from abc
> tablesample(1 rows);
> OK
> INFO : Compiling
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14):
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO : Concurrency mode is disabled, not creating a lock manager
> INFO : Semantic Analysis Completed (retrial = false)
> INFO : Returning Hive schema:
> Schema(fieldSchemas:[FieldSchema(name:double2timestamp, type:timestamp,
> comment:null), FieldSchema(name:int2timestamp, type:timestamp,
> comment:null)], properties:null)
> INFO : Completed compiling
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14);
> Time taken: 0.175 seconds
> INFO : Concurrency mode is disabled, not creating a lock manager
> INFO : Executing
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14):
> select cast(1.204135216E9 as timestamp) Double2TimeStamp, cast(1204135216 as
> timestamp) Int2TimeStamp from abc tablesample(1 rows)
> INFO : Completed executing
> command(queryId=renu_20200724140642_70132390-ee12-4214-a2ca-a7e10556fc14);
> Time taken: 0.001 seconds
> INFO : OK
> INFO : Concurrency mode is disabled, not creating a lock manager
> +------------------------+--------------------------+
> |    double2timestamp    |      int2timestamp       |
> +------------------------+--------------------------+
> | 2008-02-27 18:00:16.0  | 1970-01-14 22:28:55.216  |
> +------------------------+--------------------------+

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
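The two results in the quoted output are consistent with the same number being read once as epoch seconds and once as epoch milliseconds. The following is only a minimal sketch of that arithmetic using java.time, not Hive's actual conversion code; the class and method names (EpochUnitDemo, asSeconds, asMillis) are made up for illustration, and the values are shown in UTC.

```java
import java.time.Instant;

// Illustration only: interprets one epoch value under two different units.
// EpochUnitDemo, asSeconds and asMillis are hypothetical names, not Hive APIs.
final class EpochUnitDemo {

    // Treat the value as seconds since 1970-01-01T00:00:00Z.
    static Instant asSeconds(long value) {
        return Instant.ofEpochSecond(value);
    }

    // Treat the value as milliseconds since 1970-01-01T00:00:00Z.
    static Instant asMillis(long value) {
        return Instant.ofEpochMilli(value);
    }

    public static void main(String[] args) {
        long value = 1204135216L; // the literal used in the reported query

        // Seconds interpretation: matches the double2timestamp column.
        System.out.println(asSeconds(value)); // 2008-02-27T18:00:16Z

        // Milliseconds interpretation: matches the int2timestamp column.
        System.out.println(asMillis(value));  // 1970-01-14T22:28:55.216Z
    }
}
```

The factor of 1000 between the two units is what moves the result from 2008 back to mid-January 1970.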
[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values
[ https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176543#comment-17176543 ]

Jesus Camacho Rodriguez commented on HIVE-23927:
------------------------------------------------

[~abstractdog], maybe not in the context of ORC-554 then. My point was that ORC faced the same issue, since its conversion logic came from Hive, and sensible defaults were chosen there to make the conversion uniform. We could use the same defaults.
[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values
[ https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176438#comment-17176438 ]

Panagiotis Garefalakis commented on HIVE-23927:
-----------------------------------------------

I guess the main issue here is *PrimitiveObjectInspectorUtils.getTimestamp(Object, PrimitiveObjectInspector, boolean)*.

For int:
https://github.com/apache/hive/blob/6ceeea87a34f53add62fa6d0a332b06b8863c440/serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritableV2.java#L531
*intToTimestampInSeconds = false*
https://github.com/apache/hive/blob/1758c8c857f8a6dc4c9dc9c522de449f53e5e5cc/serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/primitive/PrimitiveObjectInspectorUtils.java#L1181

While for double:
https://github.com/apache/hive/blob/e6900fea9108b2dd00f0e4bf2a598f6fc9ba01cf/common/src/java/org/apache/hadoop/hive/common/type/TimestampUtils.java#L43

I am not sure where the assumption that a Double is in seconds comes from. Maybe we should make this configurable as well, as we do in the *longToTimestamp* method.
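One way to picture the "make this configurable as well" suggestion is a single boolean that selects the unit for both the integral and the double path, so that both casts agree whichever way it is set. This is a rough sketch under that assumption only: ConfigurableCastSketch, longToInstant and doubleToInstant are hypothetical names, the flag merely mirrors the intToTimestampInSeconds idea mentioned above, and none of this is the code in PrimitiveObjectInspectorUtils or TimestampUtils.

```java
import java.time.Instant;

// Hypothetical sketch: one flag drives the unit for both input types.
// These helpers are NOT Hive APIs; they only illustrate the proposal.
final class ConfigurableCastSketch {

    // Integral input: seconds or milliseconds, depending on the flag.
    static Instant longToInstant(long value, boolean inSeconds) {
        return inSeconds ? Instant.ofEpochSecond(value) : Instant.ofEpochMilli(value);
    }

    // Double input: the same flag decides the unit. For simplicity the
    // result is rounded to millisecond precision, which a real
    // implementation might not want.
    static Instant doubleToInstant(double value, boolean inSeconds) {
        double millis = inSeconds ? value * 1000.0 : value;
        return Instant.ofEpochMilli(Math.round(millis));
    }

    public static void main(String[] args) {
        for (boolean inSeconds : new boolean[] {true, false}) {
            Instant fromLong = longToInstant(1204135216L, inSeconds);
            Instant fromDouble = doubleToInstant(1.204135216E9, inSeconds);
            // With a shared flag the two casts give the same instant.
            System.out.println(inSeconds + " -> " + fromLong + " / " + fromDouble);
        }
    }
}
```

Which default the flag should take is the backwards-compatibility question raised elsewhere in this thread; the sketch takes no position on it.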
[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values
[ https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17176135#comment-17176135 ]

László Bodor commented on HIVE-23927:
-------------------------------------

Unfortunately, I cannot recall anything from ORC-554 that is related to this. In ORC-554 we handled an overflow case, where a float is not precise enough to represent a timestamp and messes up the values in TimestampColumnVector ([fix is here|https://github.com/apache/orc/commit/7de945b080c5ca83b84397db105f70082a2107f4#diff-9090b54d59f8163ec2be71169d4813c8R1412-R1426]).

This one is indeed not related to ORC/schema evolution, but the reported problem is present on master, [as my repro shows|https://github.com/abstractdog/hive/commit/54ec318203#diff-219ede90fa98943fb8e1518350ff074dR36].
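As background to the remark that a float is not precise enough to represent a timestamp: a 32-bit float carries a 24-bit significand, so at the magnitude of present-day epoch seconds adjacent float values are 128 apart. The snippet below only illustrates that general precision limit with the literal from this issue; it is not the ORC-554 fix, and FloatPrecisionDemo and roundTrip are names invented for the example.

```java
// Illustration of float precision at epoch-second magnitudes.
// FloatPrecisionDemo and roundTrip are hypothetical names.
final class FloatPrecisionDemo {

    // Convert a long to float and back, exposing any rounding.
    static long roundTrip(long epochSeconds) {
        float asFloat = (float) epochSeconds;
        return (long) asFloat;
    }

    public static void main(String[] args) {
        long original = 1204135216L;
        long viaFloat = roundTrip(original);

        // Values between 2^30 and 2^31 are spaced 128 apart in float,
        // so the value is rounded to the nearest multiple of 128.
        System.out.println(viaFloat);            // 1204135168
        System.out.println(original - viaFloat); // 48 (seconds lost)

        // A double has a 53-bit significand and keeps the value exactly.
        System.out.println((long) (double) original == original); // true
    }
}
```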
[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values
[ https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175829#comment-17175829 ]

Jesus Camacho Rodriguez commented on HIVE-23927:
------------------------------------------------

[~prasad-acit], I remember discussing a similar type conversion issue in ORC too, since ORC schema evolution copied the automatic conversions that Hive was doing (probably in the context of ORC-539 / ORC-554). I think the conversion is now uniform across the different types in ORC. Making the same change in Hive would cause some backwards compatibility issues; however, I am not sure how common schema evolution from those types is.

[~abstractdog], do you recall what was done for ORC?

Cc [~omalley]
[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values
[ https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175788#comment-17175788 ]

Renukaprasad C commented on HIVE-23927:
---------------------------------------

Thanks [~klcopp].

[~jcamachorodriguez], could you please suggest how to proceed with this compatibility issue? Thank you.
[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values
[ https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175576#comment-17175576 ]

Karen Coppage commented on HIVE-23927:
--------------------------------------

I think [~jcamachorodriguez] is your person.
[jira] [Commented] (HIVE-23927) Cast to Timestamp generates different output for Integer & Float values
[ https://issues.apache.org/jira/browse/HIVE-23927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17175458#comment-17175458 ]

Renukaprasad C commented on HIVE-23927:
---------------------------------------

[~gopalv] Shall we change the input unit to milliseconds for the double datatype as well? Changing this may break compatibility for existing users. Please advise. Thank you.