[ https://issues.apache.org/jira/browse/BEAM-10462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kenneth Knowles updated BEAM-10462:
-----------------------------------
Status: Open (was: Triage Needed)
> org.apache.beam.sdk.transforms corrupt data when a value is Double.NaN
> ----------------------------------------------------------------------
>
> Key: BEAM-10462
> URL: https://issues.apache.org/jira/browse/BEAM-10462
> Project: Beam
> Issue Type: Bug
> Components: sdk-java-core
> Affects Versions: 0.2.0-incubating, 2.22.0
> Reporter: Andrew Pilloud
> Assignee: Andrew Pilloud
> Priority: P1
>
> When there is a NaN value in the PCollection passed into Min or Max, we get an
> arbitrary value back due to the way the CombineFn works. Per the SQL standard,
> we should always get NaN back. I'm going to add a special case to get the right
> answer.
> It looks like we switched from using Double.compare to the `>=` operator in
> https://github.com/apache/beam/commit/21a5b44c3b541ba6c89df5649afe00412df73d10,
> which introduced a data corruption bug.
> A test case demonstrating this issue:
> {code}
> @Test
> public void testDouble() {
>   // Every `>=` comparison involving NaN evaluates to false.
>   Assert.assertFalse(Double.NaN >= 0.9);
>   Assert.assertFalse(0.9 >= Double.NaN);
>   Assert.assertFalse(Double.NaN >= Double.POSITIVE_INFINITY);
>   Assert.assertFalse(Double.POSITIVE_INFINITY >= Double.NaN);
>   // Double.compare orders NaN above every other value, including +Infinity.
>   Assert.assertTrue(Double.compare(Double.NaN, 0.9) >= 0);
>   Assert.assertFalse(Double.compare(0.9, Double.NaN) >= 0);
>   Assert.assertTrue(Double.compare(Double.NaN, Double.POSITIVE_INFINITY) >= 0);
>   Assert.assertFalse(Double.compare(Double.POSITIVE_INFINITY, Double.NaN) >= 0);
> }
> {code}
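
To illustrate how a `>=`-based fold can lose data, and what a NaN special case might look like, here is a minimal sketch. The class `NanAwareExtremes` and both methods are hypothetical names for illustration only; this is not Beam's actual Max/Min CombineFn code, and the exact form of the comparison in the linked commit may differ.

```java
// Hypothetical helper for illustration; not Beam's actual Max/Min code.
final class NanAwareExtremes {
  private NanAwareExtremes() {}

  // Naive fold using `>=`. Any comparison against NaN is false, so a NaN
  // can replace the running maximum and then be replaced itself by a
  // smaller value, making the result depend on input order.
  static double naiveMax(double... values) {
    double acc = Double.NEGATIVE_INFINITY;
    for (double v : values) {
      acc = (acc >= v) ? acc : v;
    }
    return acc;
  }

  // NaN-propagating fold: once a NaN is seen, the result stays NaN.
  static double nanAwareMax(double... values) {
    double acc = Double.NEGATIVE_INFINITY;
    for (double v : values) {
      if (Double.isNaN(acc) || Double.isNaN(v)) {
        acc = Double.NaN;
      } else {
        acc = Math.max(acc, v);
      }
    }
    return acc;
  }
}
```

With the input `1.0, NaN, 0.5`, the naive fold first replaces 1.0 with NaN and then replaces NaN with 0.5, so the larger value 1.0 is silently lost. The NaN-aware fold returns NaN for the same input regardless of element order.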