[
https://issues.apache.org/jira/browse/PARQUET-2255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17699686#comment-17699686
]
Gang Wu commented on PARQUET-2255:
----------------------------------
These are good questions. Let me try to answer them from the perspective of
Java:
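# +0.0 and -0.0 are distinct values, but Java's primitive comparison treats
them as equal (0.0 == -0.0 is true). Double.equals and Double.compare, which
work on the bit patterns, do distinguish them.
https://stackoverflow.com/a/24238344
# Java does not distinguish signaling NaN from quiet NaN, and
Double.doubleToLongBits collapses every NaN to a single canonical
representation. [https://stackoverflow.com/a/25051746]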
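The two points above can be checked with a short Java sketch. The class and method names here (FloatSemantics, primitiveZerosEqual, and so on) are made up for illustration; only the java.lang.Double calls are real API.

```java
// Illustrative only: FloatSemantics and its method names are hypothetical.
class FloatSemantics {
    // Primitive == follows IEEE 754, so the two zeros compare equal.
    static boolean primitiveZerosEqual() {
        return 0.0d == -0.0d;
    }

    // Double.equals compares doubleToLongBits values, so the zeros differ.
    static boolean boxedZerosEqual() {
        return Double.valueOf(0.0d).equals(Double.valueOf(-0.0d));
    }

    // A quiet NaN with a non-zero payload, built from its raw bit pattern.
    static double nanWithPayload() {
        return Double.longBitsToDouble(0x7ff8000000000001L);
    }

    // doubleToLongBits maps every NaN to the same canonical bit pattern.
    static boolean nanBitsCanonicalized() {
        return Double.doubleToLongBits(nanWithPayload())
                == Double.doubleToLongBits(Double.NaN);
    }

    // doubleToRawLongBits preserves the payload, so the raw bits differ.
    static boolean rawNaNBitsDiffer() {
        return Double.doubleToRawLongBits(nanWithPayload())
                != Double.doubleToRawLongBits(Double.NaN);
    }

    public static void main(String[] args) {
        System.out.println("0.0 == -0.0: " + primitiveZerosEqual());
        System.out.println("boxed zeros equal: " + boxedZerosEqual());
        System.out.println("NaN bits canonicalized: " + nanBitsCanonicalized());
        System.out.println("raw NaN bits differ: " + rawNaNBitsDiffer());
    }
}
```

So whether Java "sees" one NaN or many depends on which conversion is used to obtain the bytes that get hashed.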
To support better interoperability, I think we should do two things:
* If either +0.0 or -0.0 is inserted into the bloom filter, the other zero
should be inserted as well.
* No NaN should be inserted into the bloom filter. I doubt any user really
wants to test for the existence of NaN.
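A minimal sketch of the two rules above, assuming a writer-side hook that sees each double before it is hashed. FloatBloomSketch is a hypothetical class, and a HashSet of bit patterns stands in for the real bloom filter; this is not the parquet-mr API.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in: a set of bit patterns plays the role of the filter.
class FloatBloomSketch {
    private final Set<Long> inserted = new HashSet<>();

    // Applies the two proposed rules before recording a value.
    public void insert(double value) {
        if (Double.isNaN(value)) {
            return; // Rule 2: NaN is never inserted.
        }
        if (value == 0.0d) { // true for both +0.0 and -0.0
            // Rule 1: inserting either zero inserts both bit patterns.
            inserted.add(Double.doubleToLongBits(0.0d));
            inserted.add(Double.doubleToLongBits(-0.0d));
            return;
        }
        inserted.add(Double.doubleToLongBits(value));
    }

    // Plain lookup by bit pattern. How a reader should treat a NaN probe
    // is not covered by the comment above, so this sketch does nothing
    // special for it.
    public boolean mightContain(double value) {
        return inserted.contains(Double.doubleToLongBits(value));
    }

    public static void main(String[] args) {
        FloatBloomSketch filter = new FloatBloomSketch();
        filter.insert(-0.0d);
        filter.insert(Double.NaN);
        System.out.println("+0.0 found: " + filter.mightContain(0.0d));
        System.out.println("-0.0 found: " + filter.mightContain(-0.0d));
        System.out.println("NaN found: " + filter.mightContain(Double.NaN));
    }
}
```

With this approach a reader probing for either zero gets a hit regardless of which zero the writer saw, at the cost of one extra entry in the filter.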
> BloomFilter and float point is ambiguous
> ----------------------------------------
>
> Key: PARQUET-2255
> URL: https://issues.apache.org/jira/browse/PARQUET-2255
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-format
> Reporter: Xuwei Fu
> Priority: Major
> Fix For: format-2.9.0
>
>
> Currently, Parquet can use a BloomFilter for any physical type. However,
> when a BloomFilter is applied to floating-point values:
> # What do +0 and -0 mean? Are they equal?
> # Should qNaN and sNaN be written to the BloomFilter? Are they equal?
>