[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Reviewed-on: http://gerrit.cloudera.org:8080/11535 Tested-by: Impala Public Jenkins Reviewed-by: Michael Ho --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/joins.test 11 files changed, 202 insertions(+), 35 deletions(-) Approvals: Impala Public Jenkins: Verified Michael Ho: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: merged Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 24 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 23: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 23 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Nov 2018 02:58:53 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 23: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 23 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 02 Nov 2018 01:40:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 23: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1252/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 23 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Nov 2018 22:09:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 23: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3406/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 23 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Nov 2018 21:35:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Paul Rogers, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#23). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/joins.test 11 files changed, 202 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/23 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 23 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 22: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3401/ -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 22 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Nov 2018 20:55:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 22: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1246/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 22 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Nov 2018 17:22:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 22: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3401/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 22 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 01 Nov 2018 16:38:35 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Paul Rogers, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#22). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/joins.test 11 files changed, 202 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/22 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 22 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 20: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3387/ -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 20 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 31 Oct 2018 18:59:56 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 20: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3387/ DRY_RUN=true -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 20 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 31 Oct 2018 15:12:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 19: There seems to be other test failures in addition to the one you fixed: https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/3421/ Please run a private build to confirm all newly added tests actually passed. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 19 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 26 Oct 2018 22:24:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 19: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1171/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 19 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 26 Oct 2018 13:33:42 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Paul Rogers, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#19). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/joins.test 11 files changed, 202 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/19 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 19 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 18: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test File testdata/workloads/functional-query/queries/QueryTest/aggregation.test: http://gerrit.cloudera.org:8080/#/c/11535/18/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@1402 PS18, Line 1402: bigint, float You forgot to update result type for the extra column in the select list. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 18 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 25 Oct 2018 02:17:43 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 18: Verified-1 Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/3348/ -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 18 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 25 Oct 2018 01:39:16 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 18: Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/3348/ DRY_RUN=false -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 18 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 24 Oct 2018 21:45:12 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 18: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 18 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 24 Oct 2018 21:45:11 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 17: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 17 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 24 Oct 2018 21:44:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 17: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1144/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 17 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 24 Oct 2018 19:42:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 17: (2 comments) http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@801 PS16, Line 801: (VALUES((1.6 x, 0 y), (3.2, 1), (5.4,2), (0.5, 3), (0.5, 4), (-0.5, 5))) XX), > It looks like we should be running test_basic_join_queries with disable_cod Done http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@818 PS16, Line 818: select t1.float_col as v : from functional.alltypessmall t1, functional.alltypessmall t2 : where sqrt(0.5-t > Should this be addressed ? Done -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 17 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 24 Oct 2018 19:06:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Paul Rogers, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#17). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/joins.test 11 files changed, 202 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/17 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 17 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: Code-Review+2 (2 comments) LGTM. Two minor comments which need to be addressed. http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.h@40 PS16, Line 40: nit: blank space http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@818 PS16, Line 818: with q as (VALUES((cast(1.0 as FLOAT) x), (2.0))), : r as (select t1.x from q t1, q t2 where sqrt(1.0-t1.x) <=> sqrt(1.0-t2.x)) : select * from r > If you plan to keep this test case, this can be simplified as: Should this be addressed ? -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 24 Oct 2018 06:41:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@751 PS16, Line 751: // Mirror logic in HashTableCtx::Equals - IMPALA-6661 > This is the codegen path. HashTableCtx::Equals is the corresponding non-co Yeah this is kind of a cross-cutting feature of codegen. Long-term we'd like to minimise the duplication of logic, e.g. by cross-compiling C++ code into LLVM IR and substituting constants (you can go a long way with this if the the constants are complex things like descriptor tables that are variable in the interpreted code but constant in the codegen'd version). http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074 PS16, Line 3074: QUERY > The issue in this bug is that we want "GROUP BY" to consider NaN==NaN, but expr-test.cc is probably the right place for exhaustive operator tests. Those get run with and without codegen so there's no trickier required to ensure that both code paths are tested. Seems worth documenting the test gap in a JIRA and fixing it soonish just to get the extra coverage - this patch is close enough to being done that I don't want to hold it up to tack on something tangential. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 23 Oct 2018 20:15:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: (3 comments) > Patch Set 16: > > (3 comments) > > A few more comments as I learn this code. http://gerrit.cloudera.org:8080/#/c/11535/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11535/16//COMMIT_MSG@17 PS16, Line 17: additional logic to consider NaN values as equal. > The comment mentions equality. Is the rule that NaN = NaN is always false? NaN = NaN is always false. NaN <=> NaN is always true. The "DISTINCT" implementation does not need to change. Testing before/after shows that after this change the behavior is correct. http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.cc File be/src/runtime/raw-value.cc: http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.cc@33 PS16, Line 33: const float RawValue::CANONICAL_FLOAT_NAN = nanf(""); > NaN is not a single bit pattern (which is why you've defined a canonical fo The only requirement is that within a particular process we are consistent. But, as I understand it these functions should be consistent anyways. http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074 PS16, Line 3074: QUERY > The thought is that we might want to ensure all paths work to handle the Na The issue in this bug is that we want "GROUP BY" to consider NaN==NaN, but not "WHERE NaN op floatCal". The JOIN behavior is implicitly doing a "GROUP BY" with respect ot the hash table. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 23 Oct 2018 02:15:23 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: (3 comments) A few more comments as I learn this code. http://gerrit.cloudera.org:8080/#/c/11535/16//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11535/16//COMMIT_MSG@17 PS16, Line 17: additional logic to consider NaN values as equal. The comment mentions equality. Is the rule that NaN = NaN is always false? But that Nan <=> NaN is true? Looks like the IS DISTINCT (<=>) code is implemented in operators-ir.cc, but that file is not in this patch set. Does the implementation of IS DISTINCT need to change? http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.cc File be/src/runtime/raw-value.cc: http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/runtime/raw-value.cc@33 PS16, Line 33: const float RawValue::CANONICAL_FLOAT_NAN = nanf(""); NaN is not a single bit pattern (which is why you've defined a canonical form), rather it is a family of bit patterns. Do the above functions guarantee to return the exact same bit pattern on all OSs, all versions, and all runs? Or, should we use the LLVM trick to specify the exact bit pattern we want to use for NAN so that it is guaranteed to be the same on all hosts in the cluster, even if they use slightly different CPUs or library versions? http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074 PS16, Line 3074: QUERY > I've assumed that NaN tests have existed in other places. Exhaustive valid The thought is that we might want to ensure all paths work to handle the NaN case, not just the grouping case. For example, we we have tests for all three (I believe) places that this can occur: * SELECT clause: SELECT sqrt(-1) op floatCol ... * WHERE clause: WHERE sqrt(-1) op floatCol ... * JOIN: what is tested here. Along with two operators * = (NaN's are not equal) * <=> (NaN's are equal) We also have both the code gen and interpreted cases: * SELECT: never code generated (when in the root fragment) * WHERE, JOIN: code generated if enough rows, interpreted otherwise The thought is that we might want to cover all these cases just to save the hassle of fixing any gaps later. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 22 Oct 2018 18:04:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: (3 comments) Hey sorry for not getting back earlier. Bogged down again with some users' issues which I have to help with. http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/aggregation.test File testdata/workloads/functional-query/queries/QueryTest/aggregation.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/aggregation.test@1384 PS16, Line 1384: (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6))) T Please see comments in joins.test http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@801 PS16, Line 801: (VALUES((1.6 x, 0 y), (3.2, 1), (5.4,2), (0.5, 3), (0.5, 4), (-0.5, 5))) XX), The problem with this kind of query with few number VALUES() is that codegen will be disabled as the planner knows the number of rows will be small. I think it may make sense to have another test cases to scan some sizable table. This is also a good test case to keep as this exercises the interpretation path. Of course, one can also set the query option DISABLE_CODEGEN_ROWS_THRESHOLD to a small value but it seems better to have a more realistic test query with scans in there instead of joining two union nodes of constants. You can check the query profile to see codegen is enabled in the HASH JOIN node. http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/joins.test@818 PS16, Line 818: with q as (VALUES((cast(1.0 as FLOAT) x), (2.0))), : r as (select t1.x from q t1, q t2 where sqrt(1.0-t1.x) <=> sqrt(1.0-t2.x)) : select * from r If you plan to keep this test case, this can be simplified as: with q as (VALUES((cast(1.0 as FLOAT) x), (2.0))) select t1.x from q t1, q t2 where sqrt(1.0-t1.x) <=> sqrt(1.0-t2.x) -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 19 Oct 2018 18:18:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: (5 comments) http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@469 PS16, Line 469: void CodegenAnyVal::ConvertToCanonicalForm() { > This code handles the codegen case. Is there an interpreted code path chang Yes. See changes in hash-table.cc http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@728 PS16, Line 728: llvm::Value* CodegenAnyVal::EqToNativePtr(llvm::Value* native_ptr, > Maybe add a comment to define inclusive equality? The description in the co It's defined in the header comments http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@751 PS16, Line 751: // Mirror logic in HashTableCtx::Equals - IMPALA-6661 > Maybe explain why we're mirroring the logic? So that two NaN compare as equ This is the codegen path. HashTableCtx::Equals is the corresponding non-codegen path. http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/exec/hash-table.cc@238 PS16, Line 238: bool HashTableCtx::Equals(const TupleRow* build_row, const uint8_t* expr_values, > Per earlier comment, maybe a comment here about what "INCLUSIVE" means? (Pa Defined in the header comments. http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074 PS16, Line 3074: QUERY > A quick check of this file found no cases where we test base NaN functional I've assumed that NaN tests have existed in other places. Exhaustive validation of testing of NaN is out-of-scope of this change. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 19 Oct 2018 01:50:05 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Paul Rogers has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: (5 comments) Good stuff. A few random comments. http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@469 PS16, Line 469: void CodegenAnyVal::ConvertToCanonicalForm() { This code handles the codegen case. Is there an interpreted code path change that is needed for when codegen is disabled? http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@728 PS16, Line 728: llvm::Value* CodegenAnyVal::EqToNativePtr(llvm::Value* native_ptr, Maybe add a comment to define inclusive equality? The description in the commit message is good, maybe just add it here for future reference. http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/codegen/codegen-anyval.cc@751 PS16, Line 751: // Mirror logic in HashTableCtx::Equals - IMPALA-6661 Maybe explain why we're mirroring the logic? So that two NaN compare as equal? Or, so that they have the same hash code? Both? http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/16/be/src/exec/hash-table.cc@238 PS16, Line 238: bool HashTableCtx::Equals(const TupleRow* build_row, const uint8_t* expr_values, Per earlier comment, maybe a comment here about what "INCLUSIVE" means? (Parallel handling of NULL and NaN.) http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/16/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3074 PS16, Line 3074: QUERY A quick check of this file found no cases where we test base NaN functionality. According to Wikipedia(https://en.wikipedia.org/wiki/NaN), NaN has specific rules. Here we test IS DISTINCT, but should we add base tests for other operators? Having such tests will catch any regressions in the future if someone accidentally turns on the new mode at the wrong time. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 18 Oct 2018 22:11:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075 PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any other NaN via <=> : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT) P, CAST(SQRT(Y) AS FLOAT) Q : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) T ) : SELECT * FROM W WHERE W.Q<=>W.P : RESULTS : 0,0,0,0 : TYPES : FLOAT, FLOAT, FLOAT, FLOAT > Sorry, I wasn't clear in my comment. I meant to say a test case which exerc Well, I'll keep this now that it's here anyway. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 18 Oct 2018 17:47:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 16: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1076/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 17 Oct 2018 17:02:43 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#16). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/joins.test 11 files changed, 203 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/16 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 16 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075 PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any other NaN via <=> : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT) P, CAST(SQRT(Y) AS FLOAT) Q : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) T ) : SELECT * FROM W WHERE W.Q<=>W.P : RESULTS : 0,0,0,0 : TYPES : FLOAT, FLOAT, FLOAT, FLOAT > This test case is here in response to your comment: "It may warrant a test Sorry, I wasn't clear in my comment. I meant to say a test case which exercise the join condition on Nan <=> Nan out of caution. hash-table.cc logic is used both by join and aggregation. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 15 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 17 Oct 2018 01:51:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075 PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any other NaN via <=> : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT) P, CAST(SQRT(Y) AS FLOAT) Q : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) T ) : SELECT * FROM W WHERE W.Q<=>W.P : RESULTS : 0,0,0,0 : TYPES : FLOAT, FLOAT, FLOAT, FLOAT > I wasn't trying to test join code... I was trying to test the <=> operator. This test case is here in response to your comment: "It may warrant a test case to verify the behavior of Nan <=> Nan " I don't see how joins relate to this. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 15 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 17 Oct 2018 01:38:54 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075 PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any other NaN via <=> : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT) P, CAST(SQRT(Y) AS FLOAT) Q : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) T ) : SELECT * FROM W WHERE W.Q<=>W.P : RESULTS : 0,0,0,0 : TYPES : FLOAT, FLOAT, FLOAT, FLOAT > This is not the right test to exercise join code. You can check the query p I wasn't trying to test join code... I was trying to test the <=> operator. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 15 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 17 Oct 2018 01:36:42 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 15: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/15/testdata/workloads/functional-query/queries/QueryTest/exprs.test@3075 PS15, Line 3075: # IMPALA-6661 - NaN should not evaluate as the same as any other NaN via <=> : WITH W AS (SELECT T.*, CAST(SQRT(X) AS FLOAT) P, CAST(SQRT(Y) AS FLOAT) Q : FROM (VALUES((-1.0 X, -1.0 Y), (-1.0, 0), (0, -1.0), (0, 0))) T ) : SELECT * FROM W WHERE W.Q<=>W.P : RESULTS : 0,0,0,0 : TYPES : FLOAT, FLOAT, FLOAT, FLOAT This is not the right test to exercise join code. You can check the query plan but it doesn't have a join in it. You can check the query plan by doing explain of the query in Impala shell and see if it matches your expectation. You can do a self-join like the following. Also, this test belongs to joins.test. with y as (select t1.float_col as v from functional.alltypessmall t1, functional.alltypessmall t2 where sqrt(3.5-t1.float_col) <=> sqrt(3.5-t2.float_col)) select count(*) from y where is_nan(v); -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 15 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 17 Oct 2018 00:21:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 15: (1 comment) > Patch Set 14: > > (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798 PS14, Line 798: QUERY > Yes I think t1.c1 = t2.c1 is not the same as t1.c1 <=> t2.c1; Since your pa Test in exprs.test -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 15 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 16 Oct 2018 18:57:09 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 15: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1053/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 15 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 16 Oct 2018 02:14:27 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#15). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/joins.test 11 files changed, 193 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/15 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 15 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798 PS14, Line 798: QUERY > This test should cover that... it compares NaN values and considers them t Yes I think t1.c1 = t2.c1 is not the same as t1.c1 <=> t2.c1; Since your patch touches some logic in the hash table which indirectly affects <=> comparison case, it may be worth double checking nothing unexpected happens. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 14 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 15 Oct 2018 21:13:18 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798 PS14, Line 798: QUERY > Still missing a test case for nan <=> nan being false. Please see earlier c This test should cover that... it compares NaN values and considers them to be distinct from one another. Or are you referring to something else? -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 14 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 15 Oct 2018 20:26:15 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 14: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test File testdata/workloads/functional-query/queries/QueryTest/joins.test: http://gerrit.cloudera.org:8080/#/c/11535/14/testdata/workloads/functional-query/queries/QueryTest/joins.test@798 PS14, Line 798: QUERY Still missing a test case for nan <=> nan being false. Please see earlier comments. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 14 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 15 Oct 2018 18:35:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#14). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/aggregation.test M testdata/workloads/functional-query/queries/QueryTest/joins.test 10 files changed, 183 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/14 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 14 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 14: (2 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@25 PS9, Line 25: ('millisec > This is not yet addressed Done http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12 PS11, Line 12: check that add_months alias is working : select ADD_MONTHS(cast('2013-02-18 16:46:00.01' as timestamp), 1) : RESULTS : 2013-03-18 16:46:00.01000 : TYPES : timestamp : : QUERY : # test extract with non-constant field name : select b.unit, extract(a.ts, b.unit) from : (values(cast('2013-02-18 16:46:00.01' as timestamp) ts)) a : cross join : (values('year' unit), ('month'), ('day'), ('hour'), ('minute'), ('second'), : ('millisecond'), ('epoch' )) b : RESULTS : 'year',2013 : 'month',2 : 'day',18 : 'hour',16 : 'minute',46 : 'second',0 : 'milliseco > This comment is not yet addressed Done -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 14 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 15 Oct 2018 16:32:04 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 11: (2 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@25 PS9, Line 25: group by Z > The new test case still seems to be group-by a single column. May be you ca This is not yet addressed http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12 PS11, Line 12: GROUP BY of NaN values aggregates NaN's as one grouping : select count(*), sqrt(0.5-x) as Z : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6))) T : group by Z : RESULTS : 3, NaN : TYPES : bigint, double : : QUERY : # GROUP BY of NaN values aggregates NaN's as one grouping : select count(*), cast(sqrt(0.5-x) as FLOAT) as Z : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 1))) T : group by Z order by Z : RESULTS : 3, NaN : 2, 0 : 1, 1 : TYPES : bigint, float : : QUERY > These tests are testing the behavior of group-by so they should belong to a This comment is not yet addressed -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 11 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Sat, 13 Oct 2018 04:42:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 13: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1042/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 13 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 12 Oct 2018 23:13:06 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#13). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/exprs.test M testdata/workloads/functional-query/queries/QueryTest/joins.test 10 files changed, 182 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/13 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 13 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 13: (4 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801 PS9, Line 801: builder.SetIn > The probe-only behavior here doesn't match the behavior of the non-codegen' We should eliminate this test and convert unconditionally. This block is about generating consistent hash values, which we always want. The "inclusive_equality" flag applies only to "eq" comparisons. http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@237 PS11, Line 237: INCLUSIVE_EQUALITY > This change in the logic of hash table may have implication to Join too. Added a more compact test case. http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@256 PS11, Line 256: val_is_nan = Raw > nit: val_is_nan Done http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@257 PS11, Line 257: local_is_nan = Raw > nit: local_is_nan Done -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 13 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 12 Oct 2018 22:31:26 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 11: (8 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@748 PS9, Line 748: llvm::Value* cmp_raw = builder_->CreateFCmpOEQ(local_val, val, "cmp_raw"); : if (!inclusive_equality) return cmp_raw; > I need to generate that comparison in either case, so if I followed you sug Good point. Missed that it's used twice. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801 PS9, Line 801: > There is no "inclusive_equality" flag here. The restriction to probe-only The probe-only behavior here doesn't match the behavior of the non-codegen'd version of the code (i.e. EvalRow which is called by EvalBuildRow() and EvalProbeRow()) which unconditionally converts Nan to its canonical form. http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@237 PS11, Line 237: INCLUSIVE_EQUALITY This change in the logic of hash table may have implication to Join too. It may warrant a test case to verify the behavior of Nan <=> Nan (which should evaluate to false isn't changed after this patch in this file: https://github.com/apache/impala/blob/master/testdata/workloads/functional-query/queries/QueryTest/joins.test Please see the section "Handling NULLs in Join Columns" here (https://www.cloudera.com/documentation/enterprise/latest/topics/impala_joins.html) for background. Can you please also verify the behavior of join on columns with Nan value doesn't change after this patch ? create table nan_test (c1 float, c2 float); insert into nan_test select cast(sqrt(0.5-x) as FLOAT), cast(sqrt(3.5-y) as FLOAT) from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 1))) T; select t1.c1,t2.c1 from nan_test t1 right outer join nan_test t2 on t1.c1 = t2.c1; http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@256 PS11, Line 256: val_is_ambiguous nit: val_is_nan http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/exec/hash-table.cc@257 PS11, Line 257: local_is_ambiguous nit: local_is_nan http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/11/be/src/runtime/raw-value.h@40 PS11, Line 40: nit: extra space http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@25 PS9, Line 25: group by Z > Done The new test case still seems to be group-by a single column. May be you can try something like: select count(*), cast(sqrt(0.5-x) as FLOAT) as Z, cast(sqrt(0.5+x) as FLOAT) as Y from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 1))) T group by Y,Z; http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/11/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12 PS11, Line 12: GROUP BY of NaN values aggregates NaN's as one grouping : select count(*), sqrt(0.5-x) as Z : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6))) T : group by Z : RESULTS : 3, NaN : TYPES : bigint, double : : QUERY : # GROUP BY of NaN values aggregates NaN's as one grouping : select count(*), cast(sqrt(0.5-x) as FLOAT) as Z : from (VALUES((1.6 x, 2 y), (3.2, 4), (5.4,6), (0.5, 2), (0.5, 3), (-0.5, 1))) T : group by Z order by Z : RESULTS : 3, NaN : 2, 0 : 1, 1 : TYPES : bigint, float : : QUERY These tests are testing the behavior of group-by so they should belong to aggregation.test -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 11 Gerrit-Owner: Michal Ostrowski
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 11: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1021/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 11 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 11 Oct 2018 03:14:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 11: (4 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@40 PS9, Line 40: /// Single NaN values to ensure all NaN values can be assigned one bit pattern : /// that will always compare and hash t > Done Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107 PS9, Line 107: val/type c > I don't see the point of keeping this ambiguity unless you can come up with Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112 PS9, Line 112: CanonicalNaNVa > Ack Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112 PS9, Line 112: ue(const ColumnTyp > Done Done -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 11 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 11 Oct 2018 02:40:24 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#11). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/exprs.test 9 files changed, 165 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/11 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 11 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 9: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107 PS9, Line 107: IsAmbiguous > The ambiguity is intentional so as to not define float/double-specific mech I don't see the point of keeping this ambiguity unless you can come up with reasonable examples which are applicable to other types. The general rule of operation is that we try not to generalize things too much until there are needs for it. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 9 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 10 Oct 2018 17:52:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 10: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/1008/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 10 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 10 Oct 2018 15:45:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 10: (24 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h File be/src/codegen/codegen-anyval.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@236 PS9, Line 236: ensure > ensures Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@239 PS9, Line 239: /// value of the value. (Currently this only has an impact for NaN > May want to clarify that this is a no-op for types other than FLOAT and DOU Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@240 PS9, Line 240: loat and double values > ConvertNanToCanonicalForm(); Intentionally not named with "NaN" because the interface as defined now does not depend on it. It just so happens that for non-NaN, non-double/float values it is a no-op. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@249 PS9, Line 249: clusive_equality" > Warrants a quick comment. Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@476 PS9, Line 476: llvm::Value* > llvm::Value* . Same below. Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@477 PS9, Line 477: > Seems unnecessary to initialize this value here as it's set in all paths wh Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@483 PS9, Line 483: is_n > "is_nan" is a more fitting name. Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@487 PS9, Line 487: default: : ; > May as well remove these lines. Needed to avoid compiler errors/warnings. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@747 PS9, Line 747: llvm::Value* > llvm::Value* . Same below. Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@748 PS9, Line 748: llvm::Value* cmp_raw = builder_->CreateFCmpOEQ(local_val, val, "cmp_raw"); : if (!inclusive_equality) return cmp_raw; > nit: seems slightly easier to follow: I need to generate that comparison in either case, so if I followed you suggestion I'd need to follow that with: llvm::Value *cmp_raw = builder_->CreateFCmpOEQ(local_val, val, "cmp_raw"); Note that the current version uses "cmp_raw" in 2 places. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h@464 PS9, Line 464: means "NULL==NULL" and "NaN==NaN". This > means "NULL == NULL" and "Nan == Nan" Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@198 PS9, Line 198: const ColumnType& > nit: const ColumnType& expr_type Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@257 PS9, Line 257: = R > huh ? Did you mean val ? Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801 PS9, Line 801: > Don't we need to check for inclusive_equality flag here ? Also, why is the There is no "inclusive_equality" flag here. The restriction to probe-only is a continuation of the original logic for NULL's, which was based on that distinction. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1149 PS9, Line 1149: } : if (inclusive_equality) result.ConvertToCanonicalForm(); : > nit: one line Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1179 PS9, Line 1179: } > nit: extra blank line Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@40 PS9, Line 40: /// Single NaN values to ensure all NaN values can be assigned one bit pattern : /// that will always compare and hash t > These deserve some comments on their purposes. Done http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107 PS9, Line 107: there are > IsNan() seems to be a more appropriate name. The name and the description o The ambiguity is intentional so as to not define float/double-specific mechanisms into a generic interface. By keeping the ambiguity it allows for the interfaces to be used unconditionally, irrespective of the type of value being represented by a RawValue object. If these mechanisms were specifically defined for float/double, then the caller would have to consider whether or not it should use
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#10). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/exprs.test 9 files changed, 168 insertions(+), 35 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/10 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 10 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 9: (25 comments) http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h File be/src/codegen/codegen-anyval.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@236 PS9, Line 236: ensure ensures http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@239 PS9, Line 239: /// value of the value. May want to clarify that this is a no-op for types other than FLOAT and DOUBLE or if the value itself is not a Nan to begin with. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@240 PS9, Line 240: ConvertToCanonicalForm ConvertNanToCanonicalForm(); http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@249 PS9, Line 249: inclusive_equality Warrants a quick comment. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@476 PS9, Line 476: llvm::Value * llvm::Value* . Same below. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@477 PS9, Line 477: NULL Seems unnecessary to initialize this value here as it's set in all paths which can be taken below, http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@483 PS9, Line 483: cond "is_nan" is a more fitting name. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@487 PS9, Line 487: default: : ; May as well remove these lines. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@747 PS9, Line 747: llvm::Value * llvm::Value* . Same below. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@748 PS9, Line 748: llvm::Value *cmp_raw = builder_->CreateFCmpOEQ(local_val, val, "cmp_raw"); : if (!inclusive_equality) return cmp_raw; nit: seems slightly easier to follow: if (!inclusive_equality) { return builder_->CreateFCmpOEQ(local_val, val, "cmp_raw"); } http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h@464 PS9, Line 464: broadens the meaning of equality to null means "NULL == NULL" and "Nan == Nan" http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@198 PS9, Line 198: const ColumnType & nit: const ColumnType& expr_type http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@253 PS9, Line 253: const ColumnType & nit: const Columntype& expr_type http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@257 PS9, Line 257: loc huh ? Did you mean val ? Also, the entire thing seems easier to parse if it's broken down: if (!RawValue::Eq(loc, val, expr_type)) { if (INCLUSIVE_EQUALITY) { // Check for the case of Nan == Nan bool loc_is_nan = RawValue::IsNan(loc, expr_type) ; bool val_is_nan = RawValue::IsNan(val, expr_type); if (loc_is_nan && val_is_nan) continue; } return false; } http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801 PS9, Line 801: if (!build) { Don't we need to check for inclusive_equality flag here ? Also, why is the conversion to canonical form limited to probe only ? (i.e. !build) ? http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1149 PS9, Line 1149: if (inclusive_equality) { : result.ConvertToCanonicalForm(); : } nit: one line http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1179 PS9, Line 1179: nit: extra blank line http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@40 PS9, Line 40: static const double CANONICAL_DOUBLE_NAN; : static const float CANONICAL_FLOAT_NAN; These deserve some comments on their purposes. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107 PS9, Line 107: IsAmbiguous IsNan() seems to be a more appropriate name. The name and the description of this function is a bit ambiguous (punt unintended). The comment seems clearer to just say: Returns true if 'val' is of type float or double and it has a Nan value. http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112 PS9, Line 112: const ColumnType & nit: const ColumnType& type http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112 PS9, Line 112:
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 6: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 6 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 04 Oct 2018 23:18:08 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 6: Build Failed https://jenkins.impala.io/job/gerrit-code-review-checks/915/ : Initial code review checks failed. See linked job for details on the failure. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 6 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 03 Oct 2018 11:34:23 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 6: (3 comments) http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@104 PS3, Line 104: can > typo Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@110 PS3, Line 110: /// that is considered representative of the same ambiguous value. > IsAmbiguous() returns true if a *particular value of a particular type* is Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h File be/src/runtime/raw-value.inline.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@55 PS3, Line 55: ColumnType& > nit: ColumnType& Done -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 6 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 03 Oct 2018 01:34:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#6). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/exprs.test 9 files changed, 163 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/6 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 6 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 5: (2 comments) I plan to take a look. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@104 PS3, Line 104: van typo http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h File be/src/runtime/raw-value.inline.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@55 PS3, Line 55: ColumnType & nit: ColumnType& -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 5 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 03 Oct 2018 00:05:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 5: Code-Review+1 (1 comment) Did anyone else plan to review? http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@255 PS3, Line 255: (!INCLUSIVE_EQUALITY || > This was intentional for readability, since compression here really does im WFM -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 5 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 02 Oct 2018 23:51:22 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 5: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/902/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 5 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 02 Oct 2018 19:05:45 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 4: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/901/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 4 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 02 Oct 2018 18:53:13 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#5). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/exprs.test 9 files changed, 163 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/5 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 5 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/4/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/11535/4/be/src/exec/hash-table.h@463 PS4, Line 463: /// values in 'expr_values' with nullness 'expr_values_null'. INCLUSIVE_EQUALITY broadens line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 4 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 02 Oct 2018 18:21:45 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#4). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/exprs.test 9 files changed, 163 insertions(+), 34 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/4 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 4 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 3: > Patch Set 3: > > (1 comment) > > Did you mean to push out PS4? It is coming shortly. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 3 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 02 Oct 2018 17:57:37 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 3: (1 comment) Did you mean to push out PS4? http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@490 PS3, Line 490: ; > Yes. Otherwise it doesn't compile. Ah it looked weird, I think usually I'm used to seeing a "break;" for empty case. But yeah, seems ok. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 3 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 02 Oct 2018 17:55:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@490 PS3, Line 490: ; > Semicolon needed? Yes. Otherwise it doesn't compile. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@731 PS3, Line 731: bool inclusive_equality) { > Done Done -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 3 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 01 Oct 2018 22:04:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Michal Ostrowski has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 3: (19 comments) http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h File be/src/codegen/codegen-anyval.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h@236 PS3, Line 236: ensusre > enusre Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h@240 PS3, Line 240: > Extra blank line. Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@473 PS3, Line 473: static llvm::APFloat canonical_float_nan(RawValue::CANONICAL_FLOAT_NAN); > This is a constant, right? Mark it const and make the name upper case. Done - actually redundant - cruft from an earlier rev. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@476 PS3, Line 476: case TYPE_FLOAT: > nit: we usually indent the cases (see elsewhere in this file). Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@731 PS3, Line 731: bool inclusive_equality) { > Indentation of continuation is a bit inconsistent. We usually either have 4 Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@751 PS3, Line 751: if (!inclusive_equality) { > The house style is to put conditionals in a single line if they fit, i.e. Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@756 PS3, Line 756: local_val, "local_val_is_nan"); > Inconsistent indentation of continuation line. Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h@463 PS3, Line 463: /// values in 'expr_values' with nullness 'expr_values_null'. INCLUSIVE_EQUALITY broadens > Not a big deal, but you can prefetch these comments before uploading a patc Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@198 PS3, Line 198: auto > optional: We often prefer spelling out the type name if it's not too verbos Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@255 PS3, Line 255: (!INCLUSIVE_EQUALITY || > Indentation of the conditional here is a little inconsistent with other Imp This was intentional for readability, since compression here really does impede readability IMO. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@110 PS3, Line 110: /// that is considered representative of the same ambiguous value. > This should only be called is IsAmbiguous() returns true for the type, righ IsAmbiguous() returns true if a *particular value of a particular type* is ambiguous. "CanonicalValue" is the single canonical value for a given type. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h File be/src/runtime/raw-value.inline.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@52 PS3, Line 52: return false; > Not reachable Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@62 PS3, Line 62: return NULL; > Should this have a DCHECK(false)? Since it shouldn't be called for other ty Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@64 PS3, Line 64: return NULL; > Not reachable Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@172 PS3, Line 172: if (std::isnan(*v)) { > Could fit on one line. Done http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@183 PS3, Line 183: if (std::isnan(*v)) { > Could fit on one line. Done http://gerrit.cloudera.org:8080/#/c/11535/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test File testdata/workloads/functional-query/queries/QueryTest/exprs.test: http://gerrit.cloudera.org:8080/#/c/11535/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@12 PS3, Line 12: # GROUP BY of NaN values aggregates NaN's as one grouping > Can you also add an end-to-end test for float? Done http://gerrit.cloudera.org:8080/#/c/11535/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@13 PS3, Line 13: ; > Trailing semicolon not needed in .test Done http://gerrit.cloudera.org:8080/#/c/11535/3/testdata/workloads/functional-query/queries/QueryTest/exprs.test@13 PS3, Line 13:
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 3: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/884/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 3 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 01 Oct 2018 20:53:36 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 3: (20 comments) Mostly comments about style and testing. Some of the style comments probably seem pedantic since your code is perfectly readable but we've generally favoured keeping the style around indentation, etc, as consistent as possible. Usually I'd look for the style to be consistent with the existing code. It's also ok if you ran it through clang-format and it did something slightly inconsistent so long as it's readable (having formatting automated by a tool avoids a lot of these issues ). http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h File be/src/codegen/codegen-anyval.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h@236 PS3, Line 236: ensusre enusre http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.h@240 PS3, Line 240: Extra blank line. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc File be/src/codegen/codegen-anyval.cc: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@473 PS3, Line 473: static llvm::APFloat canonical_float_nan(RawValue::CANONICAL_FLOAT_NAN); This is a constant, right? Mark it const and make the name upper case. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@476 PS3, Line 476: case TYPE_FLOAT: nit: we usually indent the cases (see elsewhere in this file). http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@490 PS3, Line 490: ; Semicolon needed? http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@731 PS3, Line 731: bool inclusive_equality) { Indentation of continuation is a bit inconsistent. We usually either have 4 space indents for continuations or go with whatever clang-format chooses. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@751 PS3, Line 751: if (!inclusive_equality) { The house style is to put conditionals in a single line if they fit, i.e. if (!inclusive_equality) return cmp_raw; http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/codegen/codegen-anyval.cc@756 PS3, Line 756: local_val, "local_val_is_nan"); Inconsistent indentation of continuation line. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h@463 PS3, Line 463: /// values in 'expr_values' with nullness 'expr_values_null'. INCLUSIVE_EQUALITY broadens > line too long (91 > 90) Not a big deal, but you can prefetch these comments before uploading a patch: ./bin/jenkins/critique-gerrit-review.py --dryrun http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc File be/src/exec/hash-table.cc: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@198 PS3, Line 198: auto optional: We often prefer spelling out the type name if it's not too verbose so it's more obvious what the type is. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.cc@255 PS3, Line 255: (!INCLUSIVE_EQUALITY || Indentation of the conditional here is a little inconsistent with other Impala code, since we'd usually try to fit the clauses onto as few lines as possible. I don't feel strongly since this is perfectly readable but others might feel more strongly about consistency. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h File be/src/runtime/raw-value.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.h@110 PS3, Line 110: /// that is considered representative of the same ambiguous value. This should only be called is IsAmbiguous() returns true for the type, right? http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h File be/src/runtime/raw-value.inline.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@52 PS3, Line 52: return false; Not reachable http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@62 PS3, Line 62: return NULL; Should this have a DCHECK(false)? Since it shouldn't be called for other types. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@64 PS3, Line 64: return NULL; Not reachable http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@172 PS3, Line 172: if (std::isnan(*v)) { Could fit on one line. http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/runtime/raw-value.inline.h@183 PS3, Line 183: if (std::isnan(*v)) { Could fit on one line.
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h File be/src/exec/hash-table.h: http://gerrit.cloudera.org:8080/#/c/11535/3/be/src/exec/hash-table.h@463 PS3, Line 463: /// values in 'expr_values' with nullness 'expr_values_null'. INCLUSIVE_EQUALITY broadens line too long (91 > 90) -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 3 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 01 Oct 2018 19:59:58 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Michael Ho, Thomas Marshall, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#3). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/exprs.test 9 files changed, 154 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/3 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 3 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Michal Ostrowski Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/11535 ) Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. Patch Set 1: Build Successful https://jenkins.impala.io/job/gerrit-code-review-checks/841/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests. -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 1 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 27 Sep 2018 18:08:07 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6661 Make NaN values equal for grouping purposes.
Hello Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/11535 to look at the new patch set (#2). Change subject: IMPALA-6661 Make NaN values equal for grouping purposes. .. IMPALA-6661 Make NaN values equal for grouping purposes. Similar to the treatment of NULLs, we want to consider NaN values as equal when grouping. - When detecting a NaN in a set of row values, the NaN value must be converted to a canonical value - so that all NaN values have the same bit-pattern for hashing purposes. - When doing equality evaluation, floating point types must have additional logic to consider NaN values as equal. - Existing logic for handling NULLs in this way is appropriate for triggering this behavior for NaN values. - Relabel "force null equality" as "inclusive equality" to expand the scope of the concept to a more generic form that includes NaN. Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 --- M be/src/codegen/codegen-anyval.cc M be/src/codegen/codegen-anyval.h M be/src/exec/hash-table.cc M be/src/exec/hash-table.h M be/src/exec/hash-table.inline.h M be/src/runtime/raw-value.cc M be/src/runtime/raw-value.h M be/src/runtime/raw-value.inline.h M testdata/workloads/functional-query/queries/QueryTest/exprs.test 9 files changed, 150 insertions(+), 29 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/11535/2 -- To view, visit http://gerrit.cloudera.org:8080/11535 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086 Gerrit-Change-Number: 11535 Gerrit-PatchSet: 2 Gerrit-Owner: Michal Ostrowski Gerrit-Reviewer: Impala Public Jenkins