Michael Ho has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11535 )

Change subject: IMPALA-6661 Make NaN values equal for grouping purposes.
......................................................................


Patch Set 9:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h
File be/src/codegen/codegen-anyval.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@236
PS9, Line 236: ensure
ensures


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@239
PS9, Line 239:   /// value of the value.
May want to clarify that this is a no-op for types other than FLOAT and DOUBLE 
or if the value itself is not a Nan to begin with.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@240
PS9, Line 240: ConvertToCanonicalForm
ConvertNanToCanonicalForm();


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.h@249
PS9, Line 249: inclusive_equality
Warrants a quick comment.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc
File be/src/codegen/codegen-anyval.cc:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@476
PS9, Line 476: llvm::Value *
llvm::Value* . Same below.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@477
PS9, Line 477: NULL
Seems unnecessary to initialize this value here as it's set in all paths which 
can be taken below,


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@483
PS9, Line 483: cond
"is_nan" is a more fitting name.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@487
PS9, Line 487:     default:
             :       ;
May as well remove these lines.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@747
PS9, Line 747: llvm::Value *
llvm::Value* . Same below.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/codegen/codegen-anyval.cc@748
PS9, Line 748:       llvm::Value *cmp_raw = builder_->CreateFCmpOEQ(local_val, 
val, "cmp_raw");
             :       if (!inclusive_equality) return cmp_raw;
nit: seems slightly easier to follow:
   if (!inclusive_equality) {
       return builder_->CreateFCmpOEQ(local_val, val, "cmp_raw");
   }


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.h@464
PS9, Line 464: broadens the meaning of equality to null
means "NULL == NULL" and "Nan == Nan"


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@198
PS9, Line 198: const ColumnType &
nit: const ColumnType& expr_type


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@253
PS9, Line 253: const ColumnType &
nit: const Columntype& expr_type


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@257
PS9, Line 257: loc
huh ? Did you mean val ?

Also, the entire thing seems easier to parse if it's broken down:

  if (!RawValue::Eq(loc, val, expr_type)) {
      if (INCLUSIVE_EQUALITY) {
         // Check for the case of Nan == Nan
         bool loc_is_nan = RawValue::IsNan(loc, expr_type) ;
         bool val_is_nan = RawValue::IsNan(val, expr_type);
         if (loc_is_nan && val_is_nan) continue;
      }
      return false;
 }


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@801
PS9, Line 801: if (!build) {
Don't we need to check for inclusive_equality flag here ? Also, why is the 
conversion to canonical form limited to probe only ? (i.e. !build) ?


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1149
PS9, Line 1149:     if (inclusive_equality) {
              :       result.ConvertToCanonicalForm();
              :     }
nit: one line


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/exec/hash-table.cc@1179
PS9, Line 1179:
nit: extra blank line


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h
File be/src/runtime/raw-value.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@40
PS9, Line 40:   static const double CANONICAL_DOUBLE_NAN;
            :   static const float CANONICAL_FLOAT_NAN;
These deserve some comments on their purposes.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@107
PS9, Line 107: IsAmbiguous
IsNan() seems to be a more appropriate name. The name and the description of 
this function is a bit ambiguous (punt unintended).

The comment seems clearer to just say:

   Returns true if 'val' is of type float or double and it has a Nan value.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112
PS9, Line 112: const ColumnType &
nit: const ColumnType& type


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.h@112
PS9, Line 112: CanonicalValue
CanonicalNanValue().

The comment can be:
    Returns the canonical representation of Nan value corresponding to 'type'.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.inline.h
File be/src/runtime/raw-value.inline.h:

http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.inline.h@34
PS9, Line 34:
nit: blank line.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.inline.h@50
PS9, Line 50: DCHECK(false);
Calling this function with a type other than float or double seems legitimate. 
This DCHECK() seems unnecessary.


http://gerrit.cloudera.org:8080/#/c/11535/9/be/src/runtime/raw-value.inline.h@63
PS9, Line 63: NULL
nullptr


http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test
File testdata/workloads/functional-query/queries/QueryTest/exprs.test:

http://gerrit.cloudera.org:8080/#/c/11535/9/testdata/workloads/functional-query/queries/QueryTest/exprs.test@25
PS9, Line 25: group by Z
May warrant some test cases which group by multiple columns, Some columns which 
may evaluate to Nan and some don't.



--
To view, visit http://gerrit.cloudera.org:8080/11535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I996c4a2e1934fd887046ed0c55457b7285375086
Gerrit-Change-Number: 11535
Gerrit-PatchSet: 9
Gerrit-Owner: Michal Ostrowski <mostr...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Michal Ostrowski <mostr...@cloudera.com>
Gerrit-Reviewer: Thomas Marshall <thomasmarsh...@cmu.edu>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Tue, 09 Oct 2018 01:17:45 +0000
Gerrit-HasComments: Yes

Reply via email to