[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Michael Ho (Code Review)
Michael Ho has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Reviewed-on: http://gerrit.cloudera.org:8080/11535
Tested-by: Impala Public Jenkins 
Reviewed-by: Michael Ho 
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
11 files changed, 202 insertions(+), 35 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Michael Ho: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 24
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 23: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 23
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Nov 2018 02:58:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 23: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 23
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 02 Nov 2018 01:40:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 23:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1252/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 23
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Nov 2018 22:09:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 23:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3406/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 23
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Nov 2018 21:35:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Paul Rogers, Tim Armstrong, Bikramjeet Vig, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#23).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
11 files changed, 202 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/23
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 23
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 22: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3401/


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 22
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Nov 2018 20:55:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1246/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 22
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Nov 2018 17:22:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 22:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3401/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 22
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 01 Nov 2018 16:38:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-11-01 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Paul Rogers, Tim Armstrong, Bikramjeet Vig, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#22).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
11 files changed, 202 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/22
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 22
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 20: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3387/


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 20
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 31 Oct 2018 18:59:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 20:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3387/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 20
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 31 Oct 2018 15:12:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-26 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 19:

There seems to be other test failures in addition to the one you fixed: 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3421/

Please run a private build to confirm all newly added tests actually passed.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 19
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 26 Oct 2018 22:24:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 19:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1171/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 19
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 26 Oct 2018 13:33:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-26 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Paul Rogers, Tim Armstrong, Bikramjeet Vig, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#19).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
11 files changed, 202 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/19
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 19
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 18:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test
File testdata/workloads/functional-query/queries/QueryTest/aggregation.test:

http://gerrit.cloudera.org:8080/#/c/11535/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@1402
PS18, Line 1402: bigint, float
You forgot to update result type for the extra column in the select list.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 18
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 25 Oct 2018 02:17:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 18: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3348/


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 18
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 25 Oct 2018 01:39:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 18:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3348/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 18
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 21:45:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 18: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 18
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 21:45:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 17: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 17
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 21:44:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 17:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1144/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 17
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 19:42:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-24 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 17:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@801
PS16, Line 801: (VALUES((1.6 x, 0 y), (3.2, 1), (5.4,2), (0.5, 3), (0.5, 4), 
(-0.5, 5))) XX),
> It looks like we should be running test_basic_join_queries with disable_cod
Done


http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@818
PS16, Line 818: select t1.float_col as v
  : from functional.alltypessmall t1, functional.alltypessmall 
t2
  : where sqrt(0.5-t
> Should this be addressed ?
Done



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 17
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 19:06:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-24 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Paul Rogers, Tim Armstrong, Bikramjeet Vig, 
Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#17).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
11 files changed, 202 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/17
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 17
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-24 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 16: Code-Review+2

(2 comments)

LGTM. Two minor comments which need to be addressed.

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.h@40
PS16, Line 40:
nit: blank space


http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@818
PS16, Line 818: with q as (VALUES((cast(1.0 as FLOAT) x), (2.0))),
  :  r as (select t1.x from q t1, q t2 where sqrt(1.0-t1.x) <=> 
sqrt(1.0-t2.x))
  :  select * from r
> If you plan to keep this test case, this can be simplified as:
Should this be addressed ?



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 24 Oct 2018 06:41:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-23 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 16: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@751
PS16, Line 751:   // Mirror logic in HashTableCtx::Equals - IMPALA-6661
> This is the codegen path.  HashTableCtx::Equals is the corresponding non-co
Yeah this is kind of a cross-cutting feature of codegen. Long-term we'd like to 
minimise the duplication of logic, e.g. by cross-compiling C++ code into LLVM 
IR and substituting constants (you can go a long way with this if the the 
constants are complex things like descriptor tables that are variable in the 
interpreted code but constant in the codegen'd version).


http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074
PS16, Line 3074:  QUERY
> The issue in this bug is that we want "GROUP BY" to consider NaN==NaN, but
expr-test.cc is probably the right place for exhaustive operator tests. Those 
get run with and without codegen so there's no trickier required to ensure that 
both code paths are tested.

Seems worth documenting the test gap in a JIRA and fixing it soonish just to 
get the extra coverage - this patch is close enough to being done that I don't 
want to hold it up to tack on something tangential.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 23 Oct 2018 20:15:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-22 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 16:

(3 comments)

> Patch Set 16:
>
> (3 comments)
>
> A few more comments as I learn this code.

http://gerrit.cloudera.org:8080/#/c/11535/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11535/16//COMMIT_MSG@17
PS16, Line 17:   additional logic to consider NaN values as equal.
> The comment mentions equality. Is the rule that NaN = NaN is always false?
NaN = NaN is always false.  NaN <=> NaN is always true.
The "DISTINCT" implementation does not need to change.  Testing before/after 
shows that after this change the behavior is correct.


http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.cc
File be/src/runtime/raw-value.cc:

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.cc@33
PS16, Line 33: const float RawValue::CANONICAL_FLOAT_NAN = nanf("");
> NaN is not a single bit pattern (which is why you've defined a canonical fo
The only requirement is that within a particular process we are consistent.  
But, as I understand it these functions
should be consistent anyways.


http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074
PS16, Line 3074:  QUERY
> The thought is that we might want to ensure all paths work to handle the Na
The issue in this bug is that we want "GROUP BY" to consider NaN==NaN, but not 
"WHERE NaN op floatCal".  The JOIN behavior is implicitly doing a "GROUP BY" 
with respect ot the hash table.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 23 Oct 2018 02:15:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-22 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 16:

(3 comments)

A few more comments as I learn this code.

http://gerrit.cloudera.org:8080/#/c/11535/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11535/16//COMMIT_MSG@17
PS16, Line 17:   additional logic to consider NaN values as equal.
The comment mentions equality. Is the rule that NaN = NaN is always false? But 
that Nan <=> NaN is true?

Looks like the IS DISTINCT (<=>) code is implemented in operators-ir.cc, but 
that file is not in this patch set. Does the implementation of IS DISTINCT need 
to change?


http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.cc
File be/src/runtime/raw-value.cc:

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.cc@33
PS16, Line 33: const float RawValue::CANONICAL_FLOAT_NAN = nanf("");
NaN is not a single bit pattern (which is why you've defined a canonical form), 
rather it is a family of bit patterns. Do the above functions guarantee to 
return the exact same bit pattern on all OSs, all versions, and all runs?

Or, should we use the LLVM trick to specify the exact bit pattern we want to 
use for NAN so that it is guaranteed to be the same on all hosts in the 
cluster, even if they use slightly different CPUs or library versions?


http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074
PS16, Line 3074:  QUERY
> I've assumed that NaN tests have existed in other places.  Exhaustive valid
The thought is that we might want to ensure all paths work to handle the NaN 
case, not just the grouping case.

For example, we we have tests for all three (I believe) places that this can 
occur:

* SELECT clause: SELECT sqrt(-1) op floatCol ...
* WHERE clause: WHERE sqrt(-1) op floatCol ...
* JOIN: what is tested here.

Along with two operators

* = (NaN's are not equal)
* <=> (NaN's are equal)

We also have both the code gen and interpreted cases:

* SELECT: never code generated (when in the root fragment)
* WHERE, JOIN: code generated if enough rows, interpreted otherwise

The thought is that we might want to cover all these cases just to save the 
hassle of fixing any gaps later.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Oct 2018 18:04:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 16:

(3 comments)

Hey sorry for not getting back earlier. Bogged down again with some users' 
issues which I have to help with.

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/aggregation.test
File testdata/workloads/functional-query/queries/QueryTest/aggregation.test:

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@1384
PS16, Line 1384: (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6))) T
Please see comments in joins.test


http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@801
PS16, Line 801: (VALUES((1.6 x, 0 y), (3.2, 1), (5.4,2), (0.5, 3), (0.5, 4), 
(-0.5, 5))) XX),
The problem with this kind of query with few number VALUES() is that codegen 
will be disabled as the planner knows the number of rows will be small. I think 
it may make sense to have another test cases to scan some sizable table. This 
is also a good test case to keep as this exercises the interpretation path.

Of course, one can also set the query option DISABLE_CODEGEN_ROWS_THRESHOLD to 
a small value but it seems better to have a more realistic test query with 
scans in there instead of joining two union nodes of constants.

You can check the query profile to see codegen is enabled in the HASH JOIN node.


http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@818
PS16, Line 818: with q as (VALUES((cast(1.0 as FLOAT) x), (2.0))),
  :  r as (select t1.x from q t1, q t2 where sqrt(1.0-t1.x) <=> 
sqrt(1.0-t2.x))
  :  select * from r
If you plan to keep this test case, this can be simplified as:
  with q as (VALUES((cast(1.0 as FLOAT) x), (2.0)))
  select t1.x from q t1, q t2 where sqrt(1.0-t1.x) <=> sqrt(1.0-t2.x)



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 19 Oct 2018 18:18:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-18 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 16:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@469
PS16, Line 469: void CodegenAnyVal::ConvertToCanonicalForm() {
> This code handles the codegen case. Is there an interpreted code path chang
Yes.  See changes in hash-table.cc


http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@728
PS16, Line 728: llvm::Value* CodegenAnyVal::EqToNativePtr(llvm::Value* 
native_ptr,
> Maybe add a comment to define inclusive equality? The description in the co
It's defined  in the header comments


http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@751
PS16, Line 751:   // Mirror logic in HashTableCtx::Equals - IMPALA-6661
> Maybe explain why we're mirroring the logic? So that two NaN compare as equ
This is the codegen path.  HashTableCtx::Equals is the corresponding 
non-codegen path.


http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/exec/hash-table.cc@238
PS16, Line 238: bool HashTableCtx::Equals(const TupleRow* build_row, const 
uint8_t* expr_values,
> Per earlier comment, maybe a comment here about what "INCLUSIVE" means? (Pa
Defined in the header comments.


http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074
PS16, Line 3074:  QUERY
> A quick check of this file found no cases where we test base NaN functional
I've assumed that NaN tests have existed in other places.  Exhaustive 
validation of testing of NaN is out-of-scope of this change.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 19 Oct 2018 01:50:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-18 Thread Paul Rogers (Code Review)
Paul Rogers has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 16:

(5 comments)

Good stuff. A few random comments.

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@469
PS16, Line 469: void CodegenAnyVal::ConvertToCanonicalForm() {
This code handles the codegen case. Is there an interpreted code path change 
that is needed for when codegen is disabled?


http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@728
PS16, Line 728: llvm::Value* CodegenAnyVal::EqToNativePtr(llvm::Value* 
native_ptr,
Maybe add a comment to define inclusive equality? The description in the commit 
message is good, maybe just add it here for future reference.


http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@751
PS16, Line 751:   // Mirror logic in HashTableCtx::Equals - IMPALA-6661
Maybe explain why we're mirroring the logic? So that two NaN compare as equal? 
Or, so that they have the same hash code? Both?


http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/exec/hash-table.cc@238
PS16, Line 238: bool HashTableCtx::Equals(const TupleRow* build_row, const 
uint8_t* expr_values,
Per earlier comment, maybe a comment here about what "INCLUSIVE" means? 
(Parallel handling of NULL and NaN.)


http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074
PS16, Line 3074:  QUERY
A quick check of this file found no cases where we test base NaN functionality. 
According to Wikipedia(https://en.wikipedia.org/wiki/NaN), NaN has specific 
rules. Here we test IS DISTINCT, but should we add base tests for other 
operators? Having such tests will catch any regressions in the future if 
someone accidentally turns on the new mode at the wrong time.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 18 Oct 2018 22:11:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-18 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075
PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any 
other NaN via <=>
   : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT)  P, CAST(SQRT(Y) 
AS FLOAT) Q
   : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) 
T )
   : SELECT * FROM W WHERE W.Q<=>W.P
   :  RESULTS
   : 0,0,0,0
   :  TYPES
   : FLOAT, FLOAT, FLOAT, FLOAT
> Sorry, I wasn't clear in my comment. I meant to say a test case which exerc
Well, I'll keep this now that it's here anyway.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 18 Oct 2018 17:47:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-17 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1076/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 17 Oct 2018 17:02:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-17 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#16).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
11 files changed, 203 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/16
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 16
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075
PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any 
other NaN via <=>
   : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT)  P, CAST(SQRT(Y) 
AS FLOAT) Q
   : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) 
T )
   : SELECT * FROM W WHERE W.Q<=>W.P
   :  RESULTS
   : 0,0,0,0
   :  TYPES
   : FLOAT, FLOAT, FLOAT, FLOAT
> This test case is here in response to your comment: "It may warrant a test
Sorry, I wasn't clear in my comment. I meant to say a test case which exercise 
the join condition on Nan <=> Nan out of caution. hash-table.cc logic is used 
both by join and aggregation.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 15
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 17 Oct 2018 01:51:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-16 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075
PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any 
other NaN via <=>
   : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT)  P, CAST(SQRT(Y) 
AS FLOAT) Q
   : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) 
T )
   : SELECT * FROM W WHERE W.Q<=>W.P
   :  RESULTS
   : 0,0,0,0
   :  TYPES
   : FLOAT, FLOAT, FLOAT, FLOAT
> I wasn't trying to test join code... I was trying to test the <=> operator.
This test case is here in response to your comment: "It may warrant a test case 
to verify the behavior of Nan <=> Nan "
I don't see how joins relate to this.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 15
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 17 Oct 2018 01:38:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-16 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075
PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any 
other NaN via <=>
   : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT)  P, CAST(SQRT(Y) 
AS FLOAT) Q
   : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) 
T )
   : SELECT * FROM W WHERE W.Q<=>W.P
   :  RESULTS
   : 0,0,0,0
   :  TYPES
   : FLOAT, FLOAT, FLOAT, FLOAT
> This is not the right test to exercise join code. You can check the query p
I wasn't trying to test join code... I was trying to test the <=> operator.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 15
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 17 Oct 2018 01:36:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-16 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075
PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any 
other NaN via <=>
   : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT)  P, CAST(SQRT(Y) 
AS FLOAT) Q
   : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) 
T )
   : SELECT * FROM W WHERE W.Q<=>W.P
   :  RESULTS
   : 0,0,0,0
   :  TYPES
   : FLOAT, FLOAT, FLOAT, FLOAT
This is not the right test to exercise join code. You can check the query plan 
but it doesn't have a join in it. You can check the query plan by doing explain 
of the query in Impala shell and see if it matches your expectation. You can do 
a self-join like the following. Also, this test belongs to joins.test.

  with y as (select t1.float_col as v
   from functional.alltypessmall t1,
functional.alltypessmall t2
   where sqrt(3.5-t1.float_col) <=> sqrt(3.5-t2.float_col))
  select count(*) from y where is_nan(v);



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 15
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 17 Oct 2018 00:21:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-16 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 15:

(1 comment)

> Patch Set 14:
>
> (1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798
PS14, Line 798:  QUERY
> Yes I think t1.c1 = t2.c1 is not the same as t1.c1 <=> t2.c1; Since your pa
Test in exprs.test



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 15
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 16 Oct 2018 18:57:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 15:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1053/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 15
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 16 Oct 2018 02:14:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-15 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#15).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
11 files changed, 193 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/15
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 15
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798
PS14, Line 798:  QUERY
> This test should cover that...  it compares NaN values and considers them t
Yes I think t1.c1 = t2.c1 is not the same as t1.c1 <=> t2.c1; Since your patch 
touches some logic in the hash table which indirectly affects <=> comparison 
case, it may be worth double checking nothing unexpected happens.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 14
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 15 Oct 2018 21:13:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-15 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798
PS14, Line 798:  QUERY
> Still missing a test case for nan <=> nan being false. Please see earlier c
This test should cover that...  it compares NaN values and considers them to be 
distinct from one another.  Or are you referring to something else?



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 14
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 15 Oct 2018 20:26:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-15 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test
File testdata/workloads/functional-query/queries/QueryTest/joins.test:

http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798
PS14, Line 798:  QUERY
Still missing a test case for nan <=> nan being false. Please see earlier 
comments.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 14
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 15 Oct 2018 18:35:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-15 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#14).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/aggregation.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
10 files changed, 183 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/14
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 14
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-15 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 14:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@25
PS9, Line 25: ('millisec
> This is not yet addressed
Done


http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12
PS11, Line 12:  check that add_months alias is working
 : select ADD_MONTHS(cast('2013-02-18 16:46:00.01' as timestamp), 1)
 :  RESULTS
 : 2013-03-18 16:46:00.01000
 :  TYPES
 : timestamp
 : 
 :  QUERY
 : # test extract with non-constant field name
 : select b.unit, extract(a.ts, b.unit) from
 : (values(cast('2013-02-18 16:46:00.01' as timestamp) ts)) a
 : cross join
 : (values('year' unit), ('month'), ('day'), ('hour'), ('minute'), 
('second'),
 : ('millisecond'), ('epoch' )) b
 :  RESULTS
 : 'year',2013
 : 'month',2
 : 'day',18
 : 'hour',16
 : 'minute',46
 : 'second',0
 : 'milliseco
> This comment is not yet addressed
Done



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 14
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 15 Oct 2018 16:32:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-12 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@25
PS9, Line 25: group by Z
> The new test case still seems to be group-by a single column. May be you ca
This is not yet addressed


http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12
PS11, Line 12:  GROUP BY of NaN values aggregates NaN's as one grouping
 : select count(*), sqrt(0.5-x) as Z
 : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6))) T
 : group by Z
 :  RESULTS
 : 3, NaN
 :  TYPES
 : bigint, double
 : 
 :  QUERY
 : # GROUP BY of NaN values aggregates NaN's as one grouping
 : select count(*), cast(sqrt(0.5-x) as FLOAT) as Z
 : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 
3), (-0.5, 1))) T
 : group by Z order by Z
 :  RESULTS
 : 3, NaN
 : 2, 0
 : 1, 1
 :  TYPES
 : bigint, float
 : 
 :  QUERY
> These tests are testing the behavior of group-by so they should belong to a
This comment is not yet addressed



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 11
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 13 Oct 2018 04:42:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1042/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 13
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 12 Oct 2018 23:13:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-12 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#13).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
M testdata/workloads/functional-query/queries/QueryTest/joins.test
10 files changed, 182 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/13
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 13
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-12 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 13:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801
PS9, Line 801: builder.SetIn
> The probe-only behavior here doesn't match the behavior of the non-codegen'
We should eliminate this test and convert unconditionally.  This block is about 
generating consistent hash values, which we always want.  The 
"inclusive_equality" flag applies only to "eq" comparisons.


http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@237
PS11, Line 237: INCLUSIVE_EQUALITY
> This change in the logic of hash table may have implication to Join too.
Added a more compact test case.


http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@256
PS11, Line 256: val_is_nan = Raw
> nit: val_is_nan
Done


http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@257
PS11, Line 257: local_is_nan = Raw
> nit: local_is_nan
Done



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 13
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 12 Oct 2018 22:31:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-11 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 11:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@748
PS9, Line 748:   llvm::Value* cmp_raw = builder_->CreateFCmpOEQ(local_val, 
val, "cmp_raw");
 :   if (!inclusive_equality) return cmp_raw;
> I need to generate that comparison in either case, so if I followed you sug
Good point. Missed that it's used twice.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801
PS9, Line 801:
> There is no "inclusive_equality" flag here.  The restriction to probe-only
The probe-only behavior here doesn't match the behavior of the non-codegen'd 
version of the code (i.e. EvalRow which is called by EvalBuildRow() and 
EvalProbeRow()) which unconditionally converts Nan to its canonical form.


http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@237
PS11, Line 237: INCLUSIVE_EQUALITY
This change in the logic of hash table may have implication to Join too.

It may warrant a test case to verify the behavior of Nan <=> Nan (which should 
evaluate to false isn't changed after this patch in this file: 
https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/joins.test

Please see the section "Handling NULLs in Join Columns" here 
(https://www.cloudera.com/documentation/enterprise/latest/topics/impala_joins.html)
 for background.

Can you please also verify the behavior of join on columns with Nan value 
doesn't change after this patch ?

  create table nan_test (c1 float, c2 float);

  insert into nan_test select cast(sqrt(0.5-x) as FLOAT), cast(sqrt(3.5-y) as 
FLOAT) from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 
1))) T;

  select t1.c1,t2.c1 from nan_test t1 right outer join nan_test t2 on t1.c1 = 
t2.c1;


http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@256
PS11, Line 256: val_is_ambiguous
nit: val_is_nan


http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@257
PS11, Line 257: local_is_ambiguous
nit: local_is_nan


http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/runtime/raw-value.h@40
PS11, Line 40:
nit: extra space


http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@25
PS9, Line 25: group by Z
> Done
The new test case still seems to be group-by a single column. May be you can 
try something like:

  select count(*), cast(sqrt(0.5-x) as FLOAT) as Z, cast(sqrt(0.5+x) as FLOAT) 
as Y
   from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 
1))) T
   group by Y,Z;


http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12
PS11, Line 12:  GROUP BY of NaN values aggregates NaN's as one grouping
 : select count(*), sqrt(0.5-x) as Z
 : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6))) T
 : group by Z
 :  RESULTS
 : 3, NaN
 :  TYPES
 : bigint, double
 : 
 :  QUERY
 : # GROUP BY of NaN values aggregates NaN's as one grouping
 : select count(*), cast(sqrt(0.5-x) as FLOAT) as Z
 : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 
3), (-0.5, 1))) T
 : group by Z order by Z
 :  RESULTS
 : 3, NaN
 : 2, 0
 : 1, 1
 :  TYPES
 : bigint, float
 : 
 :  QUERY
These tests are testing the behavior of group-by so they should belong to 
aggregation.test



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 11
Gerrit-Owner: Michal Ostrowski 

[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1021/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 11
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 11 Oct 2018 03:14:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-10 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 11:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@40
PS9, Line 40:   /// Single NaN values  to ensure all NaN values can be assigned 
one bit pattern
:   /// that will always compare and hash t
> Done
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107
PS9, Line 107:  val/type c
> I don't see the point of keeping this ambiguity unless you can come up with
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112
PS9, Line 112: CanonicalNaNVa
> Ack
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112
PS9, Line 112: ue(const ColumnTyp
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 11
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 11 Oct 2018 02:40:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-10 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#11).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 165 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/11
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 11
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-10 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107
PS9, Line 107: IsAmbiguous
> The ambiguity is intentional so as to not define float/double-specific mech
I don't see the point of keeping this ambiguity unless you can come up with 
reasonable examples which are applicable to other types. The general rule of 
operation is that we try not to generalize things too much until there are 
needs for it.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 9
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 10 Oct 2018 17:52:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/1008/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 10
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 10 Oct 2018 15:45:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-10 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 10:

(24 comments)

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@236
PS9, Line 236: ensure
> ensures
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@239
PS9, Line 239:   /// value of the value. (Currently this only has an impact for 
NaN
> May want to clarify that this is a no-op for types other than FLOAT and DOU
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@240
PS9, Line 240: loat and double values
> ConvertNanToCanonicalForm();
Intentionally not named with "NaN" because the interface as defined now does 
not depend on it.
It just so happens that for non-NaN, non-double/float values it is a no-op.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@249
PS9, Line 249: clusive_equality"
> Warrants a quick comment.
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@476
PS9, Line 476: llvm::Value*
> llvm::Value* . Same below.
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@477
PS9, Line 477:
> Seems unnecessary to initialize this value here as it's set in all paths wh
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@483
PS9, Line 483: is_n
> "is_nan" is a more fitting name.
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@487
PS9, Line 487: default:
 :   ;
> May as well remove these lines.
Needed to avoid compiler errors/warnings.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@747
PS9, Line 747: llvm::Value*
> llvm::Value* . Same below.
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@748
PS9, Line 748:   llvm::Value* cmp_raw = builder_->CreateFCmpOEQ(local_val, 
val, "cmp_raw");
 :   if (!inclusive_equality) return cmp_raw;
> nit: seems slightly easier to follow:
I need to generate that comparison in either case, so if I followed you 
suggestion I'd need to follow that  with:

llvm::Value *cmp_raw = builder_->CreateFCmpOEQ(local_val, val, "cmp_raw");

Note that the current version uses "cmp_raw" in 2 places.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h@464
PS9, Line 464: means "NULL==NULL" and "NaN==NaN". This
> means "NULL == NULL" and "Nan == Nan"
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@198
PS9, Line 198: const ColumnType&
> nit: const ColumnType& expr_type
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@257
PS9, Line 257: = R
> huh ? Did you mean val ?
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801
PS9, Line 801:
> Don't we need to check for inclusive_equality flag here ? Also, why is the
There is no "inclusive_equality" flag here.  The restriction to probe-only is a 
continuation of the original logic for NULL's, which was based on that 
distinction.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1149
PS9, Line 1149: }
  : if (inclusive_equality) result.ConvertToCanonicalForm();
  :
> nit: one line
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1179
PS9, Line 1179:   }
> nit: extra blank line
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@40
PS9, Line 40:   /// Single NaN values  to ensure all NaN values can be assigned 
one bit pattern
:   /// that will always compare and hash t
> These deserve some comments on their purposes.
Done


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107
PS9, Line 107:  there are
> IsNan() seems to be a more appropriate name. The name and the description o
The ambiguity is intentional so as to not define float/double-specific 
mechanisms into a generic interface.
By keeping the ambiguity  it allows for the interfaces to be used 
unconditionally, irrespective of the type of value being
represented by a RawValue object.  If these mechanisms were specifically 
defined for float/double, then the caller would have to consider whether or not 
it should use 

[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-10 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#10).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 168 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/10
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 10
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-08 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 9:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@236
PS9, Line 236: ensure
ensures


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@239
PS9, Line 239:   /// value of the value.
May want to clarify that this is a no-op for types other than FLOAT and DOUBLE 
or if the value itself is not a Nan to begin with.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@240
PS9, Line 240: ConvertToCanonicalForm
ConvertNanToCanonicalForm();


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@249
PS9, Line 249: inclusive_equality
Warrants a quick comment.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@476
PS9, Line 476: llvm::Value *
llvm::Value* . Same below.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@477
PS9, Line 477: NULL
Seems unnecessary to initialize this value here as it's set in all paths which 
can be taken below,


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@483
PS9, Line 483: cond
"is_nan" is a more fitting name.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@487
PS9, Line 487: default:
 :   ;
May as well remove these lines.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@747
PS9, Line 747: llvm::Value *
llvm::Value* . Same below.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@748
PS9, Line 748:   llvm::Value *cmp_raw = builder_->CreateFCmpOEQ(local_val, 
val, "cmp_raw");
 :   if (!inclusive_equality) return cmp_raw;
nit: seems slightly easier to follow:
   if (!inclusive_equality) {
   return builder_->CreateFCmpOEQ(local_val, val, "cmp_raw");
   }


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h@464
PS9, Line 464: broadens the meaning of equality to null
means "NULL == NULL" and "Nan == Nan"


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@198
PS9, Line 198: const ColumnType &
nit: const ColumnType& expr_type


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@253
PS9, Line 253: const ColumnType &
nit: const Columntype& expr_type


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@257
PS9, Line 257: loc
huh ? Did you mean val ?

Also, the entire thing seems easier to parse if it's broken down:

  if (!RawValue::Eq(loc, val, expr_type)) {
  if (INCLUSIVE_EQUALITY) {
 // Check for the case of Nan == Nan
 bool loc_is_nan = RawValue::IsNan(loc, expr_type) ;
 bool val_is_nan = RawValue::IsNan(val, expr_type);
 if (loc_is_nan && val_is_nan) continue;
  }
  return false;
 }


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801
PS9, Line 801: if (!build) {
Don't we need to check for inclusive_equality flag here ? Also, why is the 
conversion to canonical form limited to probe only ? (i.e. !build) ?


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1149
PS9, Line 1149: if (inclusive_equality) {
  :   result.ConvertToCanonicalForm();
  : }
nit: one line


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1179
PS9, Line 1179:
nit: extra blank line


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@40
PS9, Line 40:   static const double CANONICAL_DOUBLE_NAN;
:   static const float CANONICAL_FLOAT_NAN;
These deserve some comments on their purposes.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107
PS9, Line 107: IsAmbiguous
IsNan() seems to be a more appropriate name. The name and the description of 
this function is a bit ambiguous (punt unintended).

The comment seems clearer to just say:

   Returns true if 'val' is of type float or double and it has a Nan value.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112
PS9, Line 112: const ColumnType &
nit: const ColumnType& type


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112
PS9, Line 112: 

[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-04 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 6: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 6
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 04 Oct 2018 23:18:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 6:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/915/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 6
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 03 Oct 2018 11:34:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@104
PS3, Line 104: can
> typo
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@110
PS3, Line 110:   /// that is considered representative of the same ambiguous 
value.
> IsAmbiguous() returns true if a *particular value of a particular type* is
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h
File be/src/runtime/raw-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@55
PS3, Line 55: ColumnType&
> nit: ColumnType&
Done



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 6
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 03 Oct 2018 01:34:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#6).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 163 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/6
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 6
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 5:

(2 comments)

I plan to take a look.

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@104
PS3, Line 104: van
typo


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h
File be/src/runtime/raw-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@55
PS3, Line 55: ColumnType &
nit: ColumnType&



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 5
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 03 Oct 2018 00:05:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 5: Code-Review+1

(1 comment)

Did anyone else plan to review?

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@255
PS3, Line 255: (!INCLUSIVE_EQUALITY ||
> This was intentional for readability, since compression here really does im
WFM



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 5
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 02 Oct 2018 23:51:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/902/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 5
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 02 Oct 2018 19:05:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/901/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 4
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 02 Oct 2018 18:53:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#5).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 163 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/5
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 5
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/4/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/11535/4/be/src/exec/hash-table.h@463
PS4, Line 463:   /// values in 'expr_values' with nullness 'expr_values_null'. 
INCLUSIVE_EQUALITY broadens
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 4
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 02 Oct 2018 18:21:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#4).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 163 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/4
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 4
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 3:

> Patch Set 3:
>
> (1 comment)
>
> Did you mean to push out PS4?

It is coming shortly.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 3
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 02 Oct 2018 17:57:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 3:

(1 comment)

Did you mean to push out PS4?

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@490
PS3, Line 490: ;
> Yes.  Otherwise it doesn't compile.
Ah it looked weird, I think usually I'm used to seeing a "break;" for empty 
case. But yeah, seems ok.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 3
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 02 Oct 2018 17:55:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-01 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@490
PS3, Line 490: ;
> Semicolon needed?
Yes.  Otherwise it doesn't compile.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@731
PS3, Line 731:   bool 
inclusive_equality) {
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 3
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 01 Oct 2018 22:04:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-01 Thread Michal Ostrowski (Code Review)
Michal Ostrowski has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 3:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h@236
PS3, Line 236: ensusre
> enusre
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h@240
PS3, Line 240:
> Extra blank line.
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@473
PS3, Line 473:   static llvm::APFloat 
canonical_float_nan(RawValue::CANONICAL_FLOAT_NAN);
> This is a constant, right? Mark it const and make the name upper case.
Done - actually redundant - cruft from an earlier rev.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@476
PS3, Line 476:   case TYPE_FLOAT:
> nit: we usually indent the cases (see elsewhere in this file).
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@731
PS3, Line 731:   bool 
inclusive_equality) {
> Indentation of continuation is a bit inconsistent. We usually either have 4
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@751
PS3, Line 751:   if (!inclusive_equality) {
> The house style is to put conditionals in a single line if they fit, i.e.
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@756
PS3, Line 756:   
local_val, "local_val_is_nan");
> Inconsistent indentation of continuation line.
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h@463
PS3, Line 463:   /// values in 'expr_values' with nullness 'expr_values_null'. 
INCLUSIVE_EQUALITY broadens
> Not a big deal, but you can prefetch these comments before uploading a patc
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@198
PS3, Line 198: auto
> optional: We often prefer spelling out the type name if it's not too verbos
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@255
PS3, Line 255: (!INCLUSIVE_EQUALITY ||
> Indentation of the conditional here is a little inconsistent with other Imp
This was intentional for readability, since compression here really does impede 
readability IMO.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@110
PS3, Line 110:   /// that is considered representative of the same ambiguous 
value.
> This should only be called is IsAmbiguous() returns true for the type, righ
IsAmbiguous() returns true if a *particular value of a particular type* is 
ambiguous.  "CanonicalValue" is the single canonical value for a given type.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h
File be/src/runtime/raw-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@52
PS3, Line 52:   return false;
> Not reachable
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@62
PS3, Line 62: return NULL;
> Should this have a DCHECK(false)? Since it shouldn't be called for other ty
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@64
PS3, Line 64:   return NULL;
> Not reachable
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@172
PS3, Line 172:   if (std::isnan(*v)) {
> Could fit on one line.
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@183
PS3, Line 183:   if (std::isnan(*v)) {
> Could fit on one line.
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12
PS3, Line 12: # GROUP BY of NaN values aggregates NaN's as one grouping
> Can you also add an end-to-end test for float?
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@13
PS3, Line 13: ;
> Trailing semicolon not needed in .test
Done


http://gerrit.cloudera.org:8080/#/c/11535/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@13
PS3, Line 13: 

[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/884/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 3
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 01 Oct 2018 20:53:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 3:

(20 comments)

Mostly comments about style and testing.

Some of the style comments probably seem pedantic since your code is perfectly 
readable but we've generally favoured keeping the style around indentation, 
etc, as consistent as possible. Usually I'd look for the style to be consistent 
with the existing code. It's also ok if you ran it through clang-format and it 
did something slightly inconsistent so long as it's readable (having formatting 
automated by a tool avoids a lot of these issues ).

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h@236
PS3, Line 236: ensusre
enusre


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h@240
PS3, Line 240:
Extra blank line.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@473
PS3, Line 473:   static llvm::APFloat 
canonical_float_nan(RawValue::CANONICAL_FLOAT_NAN);
This is a constant, right? Mark it const and make the name upper case.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@476
PS3, Line 476:   case TYPE_FLOAT:
nit: we usually indent the cases (see elsewhere in this file).


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@490
PS3, Line 490: ;
Semicolon needed?


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@731
PS3, Line 731:   bool 
inclusive_equality) {
Indentation of continuation is a bit inconsistent. We usually either have 4 
space indents for continuations or go with whatever clang-format chooses.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@751
PS3, Line 751:   if (!inclusive_equality) {
The house style is to put conditionals in a single line if they fit, i.e.

  if (!inclusive_equality) return cmp_raw;


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@756
PS3, Line 756:   
local_val, "local_val_is_nan");
Inconsistent indentation of continuation line.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h@463
PS3, Line 463:   /// values in 'expr_values' with nullness 'expr_values_null'. 
INCLUSIVE_EQUALITY broadens
> line too long (91 > 90)
Not a big deal, but you can prefetch these comments before uploading a patch:

 ./bin/jenkins/critique-gerrit-review.py --dryrun


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@198
PS3, Line 198: auto
optional: We often prefer spelling out the type name if it's not too verbose so 
it's more obvious what the type is.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@255
PS3, Line 255: (!INCLUSIVE_EQUALITY ||
Indentation of the conditional here is a little inconsistent with other Impala 
code, since we'd usually try to fit the clauses onto as few lines as possible. 
I don't feel strongly since this is perfectly readable but others might feel 
more strongly about consistency.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@110
PS3, Line 110:   /// that is considered representative of the same ambiguous 
value.
This should only be called is IsAmbiguous() returns true for the type, right?


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h
File be/src/runtime/raw-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@52
PS3, Line 52:   return false;
Not reachable


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@62
PS3, Line 62: return NULL;
Should this have a DCHECK(false)? Since it shouldn't be called for other types.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@64
PS3, Line 64:   return NULL;
Not reachable


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@172
PS3, Line 172:   if (std::isnan(*v)) {
Could fit on one line.


http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@183
PS3, Line 183:   if (std::isnan(*v)) {
Could fit on one line.



[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-01 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h@463
PS3, Line 463:   /// values in 'expr_values' with nullness 'expr_values_null'. 
INCLUSIVE_EQUALITY broadens
line too long (91 > 90)



-- 
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 3
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 01 Oct 2018 19:59:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-10-01 Thread Michal Ostrowski (Code Review)
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#3).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 154 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/3
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 3
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Michal Ostrowski 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-09-27 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/841/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 1
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 27 Sep 2018 18:08:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.

2018-09-27 Thread Michal Ostrowski (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/11535

to look at the new patch set (#2).

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
..

IMPALA-6661 Make NaN values equal for grouping purposes.

Similar to the treatment of NULLs, we want to consider NaN values
as equal when grouping.

- When detecting a NaN in a set of row values, the NaN value must
  be converted to a canonical value - so that all NaN values have
  the same bit-pattern for hashing purposes.

- When doing equality evaluation, floating point types must have
  additional logic to consider NaN values as equal.

- Existing logic for handling NULLs in this way is appropriate for
  triggering this behavior for NaN values.

- Relabel "force null equality" as "inclusive equality" to expand
  the scope of the concept to a more generic form that includes NaN.

Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/codegen-anyval.h
M be/src/exec/hash-table.cc
M be/src/exec/hash-table.h
M be/src/exec/hash-table.inline.h
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.h
M be/src/runtime/raw-value.inline.h
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
9 files changed, 150 insertions(+), 29 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/2
--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 2
Gerrit-Owner: Michal Ostrowski 
Gerrit-Reviewer: Impala Public Jenkins