[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628544#comment-13628544 ] Hudson commented on HIVE-4271: -- Integrated in Hive-trunk-h0.21 #2055 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2055/]) HIVE-4271 : Limit precision of decimal type (Gunther Hagleitner via Ashutosh Chauhan) (Revision 1466305) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466305 Files : * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/type * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java * /hive/trunk/data/files/kv8.txt * /hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveBaseResultSet.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAbs.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericOp.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericUnaryOp.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCeil.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFFloor.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPDivide.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMinus.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMod.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMultiply.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNegative.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPPlus.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPPositive.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPosMod.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPower.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRound.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToByte.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDouble.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToFloat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFReflect2.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToDecimal.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java * /hive/trunk/ql/src/test/queries/clientpositive/decimal_precision.q * /hive/trunk/ql/src/test/results/clientpositive/decimal_3.q.out * /hive/trunk/ql/src/test/results/clientpositive/decimal_precision.q.out * /hive/trunk/ql/src/test/results/clientpositive/decimal_serde.q.out * /hive/trunk/ql/src/test/results/clientpositive/decimal_udf.q.out * /hive/trunk/ql/src/test/results/clientpositive/literal_decimal.q.out * /hive/trunk/ql/src/test/results/clientpositive/serde_regex.q.out * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/BigDecimalWritable.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBigDecimal.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyHiveDecimal.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBigDecimalObjectInspector.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyHiveDecimalObjectInspector.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryBigDecimal.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryHiveDecimal.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627638#comment-13627638 ] Hudson commented on HIVE-4271: -- Integrated in Hive-trunk-hadoop2 #149 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/149/]) HIVE-4271 : Limit precision of decimal type (Gunther Hagleitner via Ashutosh Chauhan) (Revision 1466305) Result = FAILURE hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1466305 Files : * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/type * /hive/trunk/common/src/java/org/apache/hadoop/hive/common/type/HiveDecimal.java * /hive/trunk/data/files/kv8.txt * /hive/trunk/jdbc/src/java/org/apache/hadoop/hive/jdbc/HiveBaseResultSet.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFAbs.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericOp.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFBaseNumericUnaryOp.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFCeil.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFFloor.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPDivide.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMinus.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMod.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPMultiply.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPNegative.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPPlus.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFOPPositive.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPosMod.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFPower.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFRound.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToBoolean.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToByte.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToDouble.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToFloat.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToInteger.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToLong.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToShort.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDFToString.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFAverage.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFReflect2.java * /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFToDecimal.java * /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/TestFunctionRegistry.java * /hive/trunk/ql/src/test/queries/clientpositive/decimal_precision.q * /hive/trunk/ql/src/test/results/clientpositive/decimal_3.q.out * /hive/trunk/ql/src/test/results/clientpositive/decimal_precision.q.out * /hive/trunk/ql/src/test/results/clientpositive/decimal_serde.q.out * /hive/trunk/ql/src/test/results/clientpositive/decimal_udf.q.out * /hive/trunk/ql/src/test/results/clientpositive/literal_decimal.q.out * /hive/trunk/ql/src/test/results/clientpositive/serde_regex.q.out * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/RegexSerDe.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/binarysortable/BinarySortableSerDe.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/BigDecimalWritable.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/HiveDecimalWritable.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/io/TimestampWritable.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyBigDecimal.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyFactory.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyHiveDecimal.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/LazyUtils.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyBigDecimalObjectInspector.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyHiveDecimalObjectInspector.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazy/objectinspector/primitive/LazyPrimitiveObjectInspectorFactory.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryBigDecimal.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryFactory.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinary/LazyBinaryHiveDecimal.java * /hive/trunk/serde/src/java/org/apache/hadoop/hive/serde2/lazybinar
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627112#comment-13627112 ] Ashutosh Chauhan commented on HIVE-4271: The exact precision and how to implement in performant way we can discuss on separate jira. Lets use this jira to limit the max precision. +1 to the latest patch. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, > HIVE-4271.4.patch, HIVE-4271.5.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627100#comment-13627100 ] Gunther Hagleitner commented on HIVE-4271: -- Thanks Carter and Eric for the feedback. I've opened HIVE-4320 in response to your comments. I'd like to think through how we're going to do math on these numbers, but you make a great point. I'd still like to move forward with this jira and do the rest in HIVE-4320. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, > HIVE-4271.4.patch, HIVE-4271.5.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627004#comment-13627004 ] Eric Hanson commented on HIVE-4271: --- SQL Server's max size is 38 digits. SQL Server columnstore query execution gets a performance benefit if precision is 18 or less since 18 digits fit in one 64-bit int. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, > HIVE-4271.4.patch, HIVE-4271.5.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13626636#comment-13626636 ] Carter Shanklin commented on HIVE-4271: --- Gunther, one thing to consider is Teradata interoperability and they support up to 38 rather than 36. They claim to do this with 16 bytes. See http://developer.teradata.com/tools/articles/how-many-digits-in-a-decimal I believe SQL Server is also 38 but I'm not sure. If we can get 38 that would be ideal from a compatibility point of view. If there is a big performance hit due to encoding or whatever other reason that's a good reason to go with 36 rather than 38 since there's probably not too many apps using 37 or 38. There are bound to be some out there somewhere though. Last thought, starting with 18 is fine since it futureproofed from a DDL point of view but there is good upside to being able to make stronger compatibility statements. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, > HIVE-4271.4.patch, HIVE-4271.5.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624259#comment-13624259 ] Eric Hanson commented on HIVE-4271: --- Okay, that sounds fine. If we document from the beginning to use 18 digits or less in your schema design unless you need more, that'll help. It should be possible to make the case for 19..36 digits quite fast too, though there will be a penalty. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, > HIVE-4271.4.patch, HIVE-4271.5.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624237#comment-13624237 ] Gunther Hagleitner commented on HIVE-4271: -- [~ehans]: Thanks for the feedback. I like your idea of the single LONG default, however I'm not sure we'll be able to easily do that. Right now there are no defaults. Just one "decimal" type with a max precision (of 36) - reducing that to 18 would be a really small range. The plan is to eventually extend to decimal(p)/decimal(p,s), but at that point I'm thinking the default "decimal" type will still have to be the max precision (e.g.: that'll be the fallback for decimal operations where we don't know the exact precision returned). We'll see, hopefully we can at least coach folks to use as little as they need for improved performance. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, > HIVE-4271.4.patch, HIVE-4271.5.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624094#comment-13624094 ] Eric Hanson commented on HIVE-4271: --- I like this proposal. It will make life easier when the time comes have to implement support for vectorized comparisons and arithmetic (https://issues.apache.org/jira/browse/HIVE-4160) for decimal, because the data can be stored in an array of LONG or a pair of LONG values. This will enable faster query execution. If there are defaults, please make the default be such that the value will fit in 18 digits or less (a single LONG). Then the standard integer arithmetic code path can be used for vectorized QE for the common case for decimal. Users should be coached to use 18 digits or less for decimal unless their app really needs more. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch, HIVE-4271.3.patch, > HIVE-4271.4.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621662#comment-13621662 ] Ashutosh Chauhan commented on HIVE-4271: Following two tests failed in CliDriver * literal_decimal.q [junit] Running: diff -a /home/ashutosh/hive/build/ql/test/logs/clientpositive/literal_decimal.q.out /home/ashutosh/hive/ql/src/test/results/clientpositive/literal_decimal.q.out [junit] 61c61 [junit] < -10 1 3.14-3.14 9 9.9 NULLNULL [junit] --- [junit] > -10 1 3.14-3.14 9 9.9 1E-99 1E+99 Looks like need to update the golden file, now that we have lost ability to read in values like 1E-99 * serde_regex.q Hadoop job failed. We recently added support for decimals in regex_serde in HIVE-3951 , possibly related to that. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13621089#comment-13621089 ] Ashutosh Chauhan commented on HIVE-4271: +1 will commit if tests pass. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch, HIVE-4271.2.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620556#comment-13620556 ] Ashutosh Chauhan commented on HIVE-4271: Left some comments on phabricator. > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4271) Limit precision of decimal type
[ https://issues.apache.org/jira/browse/HIVE-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13618687#comment-13618687 ] Gunther Hagleitner commented on HIVE-4271: -- Review: https://reviews.facebook.net/D9855 > Limit precision of decimal type > --- > > Key: HIVE-4271 > URL: https://issues.apache.org/jira/browse/HIVE-4271 > Project: Hive > Issue Type: Bug >Reporter: Gunther Hagleitner >Assignee: Gunther Hagleitner > Attachments: HIVE-4271.1.patch > > > The current decimal implementation does not limit the precision of the > numbers. This has a number of drawbacks. A maximum precision would allow us > to: > - Have SerDes/filformats store decimals more efficiently > - Speed up processing by implementing operations w/o generating java > BigDecimals > - Simplify extending the datatype to allow for decimal(p) and decimal(p,s) > - Write a more efficient BinarySortable SerDe for sorting/grouping/joining > Exact numeric datatype are typically used to represent money, so if the limit > is high enough it doesn't really become an issue. > A typical representation would pack 9 decimal digits in 4 bytes. So, with 2 > longs we can represent 36 digits - which is what I propose as the limit. > Final thought: It's easier to restrict this now and have the option to do the > things above than to try to do so once people start using the datatype. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira