[ 
https://issues.apache.org/jira/browse/IMPALA-8145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16756465#comment-16756465
 ] 

Jim Apple commented on IMPALA-8145:
-----------------------------------

Can this turn into a correctness issue when some rows of the result set are not 
scanned because they are in a partition that is spuriously pruned?

> Partition metadata key muddle
> -----------------------------
>
>                 Key: IMPALA-8145
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8145
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 3.1.0
>            Reporter: Paul Rogers
>            Priority: Minor
>
> Impala stores metadata, including about HDFS partitions. Partitions are 
> defined as a collection of keys which are in terms of {{(column, value)}} 
> pairs. For example, "year=2018/month=1". The columns are defined in HMS with 
> a name and type. Values are defined as part of the partition definition.
> Impala performs partition pruning. This means that a query that says {{WHERE 
> month=2}} will omit the above partition, but will scan one for 
> "year=2018/month=2". To perform the pruning, the value of the partition key 
> must be converted from text (as used to define the directory) to the same 
> type as a column, say TINYINT here.
> Conversion is done in the catalog server when loading a partition. Given the 
> type of the column, the catalog parses the string value of the key, in this 
> case into a NumericLiteral of type TINYINT. The resulting object is then 
> converted into a Thrift TExpr node, sent over the network to the Coordinator, 
> where it is deserialized back into a NumericLiteral.
> All of this works fine for String and integer keys. It fails, however, for 
> float and double keys. (Let's set aside the fact that partitioning on 
> floating point numbers is a very bad idea for a number of reasons. Impala 
> supports this bad choice. Our job here is just to deal with that decision.)
> NumericLiteral stores its value as a Java BigDecimal. BigDecimal stores 
> values in decimal and so can easily represent, say 0.1, if that is the 
> partition key. Unfortunately, floating point numbers are binary, and cannot 
> accurately represent anything other than a sum of binary fractions. The value 
> 0.1 is a repeating fraction in binary.
> Because of magic I don't fully understand, storing 0.1 as double will render 
> 0.1 when printed. Presumably the floating point standard handles this in some 
> way.
> But, when the process above occurs, upon deserialization from Thrift, the 
> double value is converted to a BigDecimal. The result is the value 
> 1.100000000000000088817841970012523233890533447265625. That is, BigDecimal is 
> more precise than double, and can represent (in decimal) the sum of binary 
> fractions used to approximate 0.1
> This issue is fully described in the [BigDecimal 
> javadoc|https://docs.oracle.com/javase/8/docs/api/java/math/BigDecimal.html#BigDecimal-double-],
>  using, as it happens, the very value of 0.1 discussed above.
> The result is that, if a query is planned in local catalog mode, partition 
> pruning for "WHERE float_col=0.1" works, because the parser parses "0.1" 
> directly from string to BigDecimal, then onto decimal.
> But, if the same query is planned in traditional model, the extra Thrift 
> conversion cause the bogus value shown above to be used in comparisons, 
> resulting in a failed partition match.
> The temporary solution is to convert from double in Thrift back to string, 
> and from String to BigDecimal. This is, obviously, quite silly.
> The bigger issue is that there is no good reason for the catalog server to 
> parse partition keys into literal expressions only to be converted to Thrift. 
> Better to leave the partition keys as strings and allow the coordinator to do 
> any required parsing to literal expressions.
> Note that, in the current design, with code before a recent revision, the 
> catalog server must analyze each literal expression, but there is no 
> coordinator to provide the analyzer, so special code was needed to allow 
> analysis with a null analyzer, needlessly complicating the logic.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to