[ 
https://issues.apache.org/jira/browse/PARQUET-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699686#comment-17699686
 ] 

Gang Wu commented on PARQUET-2255:
----------------------------------

These are good questions. Let me try to answer them from the perspective of 
Java:
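 # +0.0 and -0.0 have different bit patterns, but they compare equal under 
Java's primitive == operator. https://stackoverflow.com/a/24238344
 # Java does not distinguish signaling NaN from quiet NaN, and 
Float.floatToIntBits / Double.doubleToLongBits collapse every NaN to a single 
canonical representation. [https://stackoverflow.com/a/25051746]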
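Both points can be checked with a small Java sketch. The class name 
FloatEdgeCases and its helper methods are made up for illustration and are not 
part of Parquet; only the java.lang.Double calls are real API.

```java
final class FloatEdgeCases {
    // Primitive == follows IEEE 754 comparison: the two zeros are equal.
    static boolean zerosEqualPrimitive() {
        return 0.0 == -0.0;
    }

    // Double.equals compares bit patterns, so the boxed zeros are NOT equal.
    static boolean zerosEqualBoxed() {
        return Double.valueOf(0.0).equals(Double.valueOf(-0.0));
    }

    // doubleToLongBits maps every NaN to one canonical bit pattern,
    // while keeping +0.0 and -0.0 distinct (they differ in the sign bit).
    static long canonicalBits(double v) {
        return Double.doubleToLongBits(v);
    }

    public static void main(String[] args) {
        // A NaN built from a non-default payload (illustrative bit pattern).
        double otherNaN = Double.longBitsToDouble(0x7ff8000000000001L);

        System.out.println("0.0 == -0.0 (primitive): " + zerosEqualPrimitive());
        System.out.println("0.0 equals -0.0 (boxed): " + zerosEqualBoxed());
        System.out.println("zero bits differ: "
                + (canonicalBits(0.0) != canonicalBits(-0.0)));
        System.out.println("NaN bits collapse: "
                + (canonicalBits(otherNaN) == canonicalBits(Double.NaN)));
    }
}
```

Since the two zeros differ in their sign bit, any hash computed over the raw 
bytes of the value sees two different inputs, even though == treats them as 
the same number.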

To support better interoperability, I think we should do two things:
 * If +0.0 is inserted into the bloom filter, -0.0 should be inserted as 
well, and vice versa.
 * No NaN should be inserted into the bloom filter. I doubt any user really 
wants to test for the existence of NaN.
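
A minimal sketch of what these two rules could look like on the write path. 
The class BloomFilterFloatPolicy and the method valuesToInsert are hypothetical 
names used only for illustration; this is not the parquet-mr API, and it 
assumes Java 9+ for List.of.

```java
import java.util.List;

final class BloomFilterFloatPolicy {
    // Returns the values that would be hashed into the filter for one input.
    static List<Double> valuesToInsert(double v) {
        if (Double.isNaN(v)) {
            return List.of();           // rule 2: never insert NaN
        }
        if (v == 0.0) {                 // true for both +0.0 and -0.0
            return List.of(0.0, -0.0);  // rule 1: insert both signed zeros
        }
        return List.of(v);              // every other value is inserted as-is
    }

    public static void main(String[] args) {
        System.out.println(valuesToInsert(-0.0).size());
        System.out.println(valuesToInsert(Double.NaN).isEmpty());
        System.out.println(valuesToInsert(1.5));
    }
}
```

One consequence of the second rule, as far as I can tell, is that a reader 
should not use the filter to rule out NaN: since NaN is never inserted, the 
filter would always answer "not present" for it.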

> BloomFilter and float point is ambiguous
> ----------------------------------------
>
>                 Key: PARQUET-2255
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2255
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-format
>            Reporter: Xuwei Fu
>            Priority: Major
>             Fix For: format-2.9.0
>
>
> Currently, Parquet can use a BloomFilter for any physical type. However, 
> when a BloomFilter is applied to floating-point values:
>  # What do +0 and -0 mean? Are they equal?
>  # Should qNaN and sNaN be written to the BloomFilter? Are they equal?
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
