[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 15: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 15
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 04 Sep 2020 00:55:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

Testing:
Verified that java udf tests pass locally.

Benchmarks:
Used a UDF from TestUdf.java that adds three integers:

  create function tpch15_parquet.sum3(int, int, int) returns int
  location '/test-warehouse/impala-hive-udfs.jar'
  symbol='org.apache.impala.TestUdf';

Used the following query on the master branch and the change's branch:

  set num_nodes=1; set mt_dop=1;
  select min(tpch15_parquet.sum3(cast(l_orderkey as int),
cast(l_partkey as int), cast(l_suppkey as int)))
  from tpch15_parquet.lineitem;

Results averaged over 100 runs after warmup:
Master: 20.6346s, stddev: 0.3132411856765332
This change: 19.0256s, stddev: 0.42039019873436

This is a ~7.8% improvement.

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Reviewed-on: http://gerrit.cloudera.org:8080/16314
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/codegen/codegen-util.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
M testdata/workloads/functional-query/queries/QueryTest/java-udf.test
10 files changed, 454 insertions(+), 39 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 16
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 15: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 15
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 Sep 2020 19:39:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 14: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 14
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 Sep 2020 19:39:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 15:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6399/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 15
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 Sep 2020 19:39:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-03 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7083/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 14
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 Sep 2020 09:31:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-03 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 14:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16314/13/be/src/exprs/scalar-expr-evaluator-ir.cc
File be/src/exprs/scalar-expr-evaluator-ir.cc:

PS13:
> This file doesn't seem to be referenced anywhere. Did you mean to add it?
No, it was an experiment with cross-compiling GetValue and it seems I didn't 
reset git correctly. Thanks.


http://gerrit.cloudera.org:8080/#/c/16314/13/testdata/workloads/functional-query/queries/QueryTest/java-udf.test
File testdata/workloads/functional-query/queries/QueryTest/java-udf.test:

http://gerrit.cloudera.org:8080/#/c/16314/13/testdata/workloads/functional-query/queries/QueryTest/java-udf.test@321
PS13, Line 321: ScalarExprEvaluator
> ScalarExprEvaluator
Done



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 14
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 Sep 2020 09:12:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-03 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#14). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

Testing:
Verified that java udf tests pass locally.

Benchmarks:
Used a UDF from TestUdf.java that adds three integers:

  create function tpch15_parquet.sum3(int, int, int) returns int
  location '/test-warehouse/impala-hive-udfs.jar'
  symbol='org.apache.impala.TestUdf';

Used the following query on the master branch and the change's branch:

  set num_nodes=1; set mt_dop=1;
  select min(tpch15_parquet.sum3(cast(l_orderkey as int),
cast(l_partkey as int), cast(l_suppkey as int)))
  from tpch15_parquet.lineitem;

Results averaged over 100 runs after warmup:
Master: 20.6346s, stddev: 0.3132411856765332
This change: 19.0256s, stddev: 0.42039019873436

This is a ~7.8% improvement.

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/codegen-util.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
M testdata/workloads/functional-query/queries/QueryTest/java-udf.test
10 files changed, 454 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/14
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 14
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-03 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 13:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@295
PS12, Line 295:  llvm::Value* const child_is_null = 
child_wrapped.GetIsNull("child_is_null");
  :  llvm::Value* const child_is_null_i8 = 
builder->CreateZExtOrTrunc(
  :  child_is_null, codegen->i8_type(), "child_is_null_i8");
  :  builder->CreateCall(set_input_null_buff_elem_fn,
  :  {jni_ctx, codegen->GetI32Constant(i), 
child_is_null_i8});
  :  builder->CreateCondBr(child_is_null, 
next_eval_child_block, child_not_null_block);
  :
  :  // Child is not null.
  :  builder->SetInsertPoint(child_not_null_block);
  :  llvm::Value* const input_ptr = 
builder->CreateCall(get_input_val_buff_
> Hmm, actually I didn't realize that the branch is still needed, so I am not
It's a bit less code both in the .cc and in IR if we don't have an extra basic 
block so I'll keep it that way.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 03 Sep 2020 08:01:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 13:

(2 comments)

LGTM pending these 2 comments

http://gerrit.cloudera.org:8080/#/c/16314/13/be/src/exprs/scalar-expr-evaluator-ir.cc
File be/src/exprs/scalar-expr-evaluator-ir.cc:

PS13:
This file doesn't seem to be referenced anywhere. Did you mean to add it?


http://gerrit.cloudera.org:8080/#/c/16314/13/testdata/workloads/functional-query/queries/QueryTest/java-udf.test
File testdata/workloads/functional-query/queries/QueryTest/java-udf.test:

http://gerrit.cloudera.org:8080/#/c/16314/13/testdata/workloads/functional-query/queries/QueryTest/java-udf.test@321
PS13, Line 321: ScaparExprEvaluator
ScalarExprEvaluator



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 17:52:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-02 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 13: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@295
PS12, Line 295:  llvm::Value* const child_is_null = 
child_wrapped.GetIsNull("child_is_null");
  :  llvm::Value* const child_is_null_i8 = 
builder->CreateZExtOrTrunc(
  :  child_is_null, codegen->i8_type(), "child_is_null_i8");
  :  builder->CreateCall(set_input_null_buff_elem_fn,
  :  {jni_ctx, codegen->GetI32Constant(i), 
child_is_null_i8});
  :  builder->CreateCondBr(child_is_null, 
next_eval_child_block, child_not_null_block);
  :
  :  // Child is not null.
  :  builder->SetInsertPoint(child_not_null_block);
  :  llvm::Value* const input_ptr = 
builder->CreateCall(get_input_val_buff_
> Done
Hmm, actually I didn't realize that the branch is still needed, so I am not 
sure which one is better. Feel free to keep whatever you prefer.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 13:39:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-02 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7075/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 10:28:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-02 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 13:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc@266
PS10, Line 266:
> I understand that you didn't add any new code to test, I'm saying you need
I accept your point. I have added tests in 
testdata/workloads/functional-query/queries/QueryTest/java-udf.test with 
comments explaining them. Do you think they are ok?


http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@295
PS12, Line 295:  llvm::Value* const child_is_null = 
child_wrapped.GetIsNull("child_is_null");
  :  llvm::Value* const child_is_null_i8 = 
builder->CreateZExtOrTrunc(
  :  child_is_null, codegen->i8_type(), "child_is_null_i8");
  :  builder->CreateCall(set_input_null_buff_elem_fn,
  :  {jni_ctx, codegen->GetI32Constant(i), 
child_is_null_i8});
  :  builder->CreateCondBr(child_is_null, 
next_eval_child_block, child_not_null_block);
  :
  :  // Child is not null.
  :  builder->SetInsertPoint(child_not_null_block);
  :  llvm::Value* const input_ptr = 
builder->CreateCall(get_input_val_buff_
> optional: couldn't it be faster if the value of child_wrapped.GetIsNull("ch
Done


http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@311
PS12, Line 311:
> nit: extra ;
Done


http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@490
PS12, Line 490:
> Do we need to do this here, can't CallJavaAndStoreResult handle it, as jni_
Done


http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@508
PS12, Line 508:
  : BooleanVal HiveUdfCall::GetBooleanValInterpreted(
  : ScalarExprEvaluator* eval, const TupleRow* row) const {
  :   DCHECK_EQ(type_.type, TYPE_BOOLEAN);
> Can't we move these as members to JniContext instead of passing them during
Done



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 02 Sep 2020 10:06:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-09-02 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#13). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

Testing:
Verified that java udf tests pass locally.

Benchmarks:
Used a UDF from TestUdf.java that adds three integers:

  create function tpch15_parquet.sum3(int, int, int) returns int
  location '/test-warehouse/impala-hive-udfs.jar'
  symbol='org.apache.impala.TestUdf';

Used the following query on the master branch and the change's branch:

  set num_nodes=1; set mt_dop=1;
  select min(tpch15_parquet.sum3(cast(l_orderkey as int),
cast(l_partkey as int), cast(l_suppkey as int)))
  from tpch15_parquet.lineitem;

Results averaged over 100 runs after warmup:
Master: 20.6346s, stddev: 0.3132411856765332
This change: 19.0256s, stddev: 0.42039019873436

This is a ~7.8% improvement.

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/codegen-util.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
A be/src/exprs/scalar-expr-evaluator-ir.cc
M testdata/workloads/functional-query/queries/QueryTest/java-udf.test
11 files changed, 587 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/13
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 13
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-27 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc@266
PS10, Line 266: // TODO: Check if Timestamp works. It is currently not 
supported in Hive
> Also, I don't think we really add any new uncertainty about timestamps, it
I understand that you didn't add any new code to test, I'm saying you need to 
add a test for a more subtle reason.

Your argument that the patch is correct is (I think) that Java UDFs don't 
support date or timestamp and therefore we don't need code that handle that 
case. I think that argument is valid. I'd just like you to write or show me a 
test that proves that we can't create or execute a Java UDF with date or 
timestamp. I took a look just now and I can't find one.

Also if we have a test that proves it, we'll know if a later change breaks it.

Ideally we would have added this test earlier when the date or timestamp patch 
was added, but it looks like we didn't, so we should fix the gap.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Aug 2020 19:33:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-27 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 12: Code-Review+1

(5 comments)

Only nits and small improvement ideas, can go in from my side.

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.h
File be/src/exprs/hive-udf-call.h:

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.h@109
PS12, Line 109: ans
typo: and


http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@295
PS12, Line 295:  // Child is null.
  :  builder->SetInsertPoint(child_null_block);
  :  builder->CreateCall(set_input_null_buff_elem_fn,
  :  {jni_ctx, codegen->GetI32Constant(i), 
codegen->GetI8Constant(1)});
  :  builder->CreateBr(next_eval_child_block);
  :
  :  // Child is not null.
  :  builder->SetInsertPoint(child_not_null_block);
  :  builder->CreateCall(set_input_null_buff_elem_fn,
  :  {jni_ctx, codegen->GetI32Constant(i), 
codegen->GetI8Constant(0)});
optional: couldn't it be faster if the value of 
child_wrapped.GetIsNull("child_is_null") was passed directly to 
set_input_null_buff_elem_fn instead of branching?


http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@311
PS12, Line 311: ;
nit: extra ;


http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@490
PS12, Line 490:llvm::Value* const jni_env = 
builder.CreateCall(get_jni_env_fn, {jni_ctx}, "jni_env");
Do we need to do this here, can't CallJavaAndStoreResult handle it, as jni_ctx 
is already passed?


http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@508
PS12, Line 508:llvm::Value* const hdfs_location =
  :codegen->GetStringConstant(, fn_.hdfs_location);
  :llvm::Value* const symbol_name =
  :codegen->GetStringConstant(, 
fn_.scalar_fn.symbol);
Can't we move these as members to JniContext instead of passing them during 
function call? I think it is simpler to do this in "normal" code than here, and 
speed shouldn't matter as these are only used in the error branch.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 12
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Aug 2020 15:30:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-27 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@292
PS12, Line 292:  
builder->CreateCondBr(child_wrapped.GetIsNull("child_is_null"),
> In the interpreted code, NULL-ness is determined based on the value of 'eva
Sorry, calling eval->GetValue may not be a good idea, that calls expr.Get*Val 
functions which will check the existence of the codegend function pointer but 
we already have that, so this check is not needed.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 12
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Aug 2020 10:44:05 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-27 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/12/be/src/exprs/hive-udf-call.cc@292
PS12, Line 292:  
builder->CreateCondBr(child_wrapped.GetIsNull("child_is_null"),
In the interpreted code, NULL-ness is determined based on the value of 
'eval->GetValue(*GetChild(i), row);'. This is almost the same as calling 
'.GetIsNull' but in the case of Date, 'eval->GetValue' also contains a validity 
check.

Date is not supported for Java UDFs either, so this shouldn't make a difference 
now but if/when we start supporting Date it may be a problem. Also, keeping 
codegen code and interpreted code similar is a good idea in my opinion.

I think it's better to cross-compile eval->GetValue and use that in codegen. Do 
you agree?



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 12
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Aug 2020 09:55:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-27 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc@266
PS10, Line 266: // TODO: Check if Timestamp works. It is currently not 
supported in Hive
> We could but I would think that if we start supporting timestamp UDFs we wo
Also, I don't think we really add any new uncertainty about timestamps, it is 
the same with the interpreted version (see l. 95). I wrote the TODO more as a 
way to point out that we seem to handle timestamps in this class but they are 
not allowed in queries.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Aug 2020 09:28:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-27 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc@266
PS10, Line 266: // TODO: Check if Timestamp works. It is currently not 
supported in Hive
> Can we add a query that tries to create a Hive UDF with a timestamp argumen
We could but I would think that if we start supporting timestamp UDFs we would 
need to test them extensively anyway. But I can add one if you think it would 
be better.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 27 Aug 2020 09:21:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-26 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc@266
PS10, Line 266: // TODO: Check if Timestamp works. It is currently not 
supported in Hive
> I haven't found any. I think this function should work correctly for timest
Can we add a query that tries to create a Hive UDF with a timestamp argument, 
so to ensure that it fails and we don't need to handle it.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 26 Aug 2020 17:41:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 12:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7003/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 12
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 26 Aug 2020 15:20:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-26 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

Testing:
Verified that java udf tests pass locally.

Benchmarks:
Used a UDF from TestUdf.java that adds three integers:

  create function tpch15_parquet.sum3(int, int, int) returns int
  location '/test-warehouse/impala-hive-udfs.jar'
  symbol='org.apache.impala.TestUdf';

Used the following query on the master branch and the change's branch:

  set num_nodes=1; set mt_dop=1;
  select min(tpch15_parquet.sum3(cast(l_orderkey as int),
cast(l_partkey as int), cast(l_suppkey as int)))
  from tpch15_parquet.lineitem;

Results averaged over 100 runs after warmup:
Master: 20.6346s, stddev: 0.3132411856765332
This change: 19.0256s, stddev: 0.42039019873436

This is a ~7.8% improvement.

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/codegen-util.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
9 files changed, 458 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/12
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 12
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-26 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/11/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/11/be/src/exprs/hive-udf-call.cc@273
PS11, Line 273:  else {
  : llvm::Value* child_value = child->GetVal("child_value");
  : return codegen->GetPtrTo(builder, child_value, 
"child_val_ptr");
  :   }
I realised we don't need to branch here, ToNativePtr should work for all types. 
I'll remove this function.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 11
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 26 Aug 2020 14:54:43 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-26 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7000/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 11
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 26 Aug 2020 12:55:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-26 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

Testing:
Verified that java udf tests pass locally.

Benchmarks:
Used a UDF from TestUdf.java that adds three integers:

  create function tpch15_parquet.sum3(int, int, int) returns int
  location '/test-warehouse/impala-hive-udfs.jar'
  symbol='org.apache.impala.TestUdf';

Used the following query on the master branch and the change's branch:

  set num_nodes=1; set mt_dop=1;
  select min(tpch15_parquet.sum3(cast(l_orderkey as int),
cast(l_partkey as int), cast(l_suppkey as int)))
  from tpch15_parquet.lineitem;

Results averaged over 100 runs after warmup:
Master: 20.6346s, stddev: 0.3132411856765332
This change: 19.0256s, stddev: 0.42039019873436

This is a ~7.8% improvement.

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/codegen-util.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
9 files changed, 476 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/11
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 11
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-26 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 10:

(1 comment)

I started an exhaustive test, I agree that all types are tested in test_udfs.py.

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc@266
PS10, Line 266: // TODO: Check if Timestamp works. It is currently not 
supported in Hive
> Do we have a negative test calling a hive UDF with a timestamp that would s
I haven't found any. I think this function should work correctly for timestamps 
also but we can't test it.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 26 Aug 2020 12:31:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-25 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 10: Code-Review+2

(2 comments)

I'm happy with this modulo confirming testing.

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call-ir.cc
File be/src/exprs/hive-udf-call-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call-ir.cc@84
PS10, Line 84: if (type->type == TYPE_STRING) {
> I implemented string copying here instead of a separate cross-compiled 'Cop
Yeah, I see you're using ColumnType::ToIR() so the type should be propagated 
(assuming inlining, etc, but that's reasonable).

This makes sense to me.


http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call.cc@266
PS10, Line 266: // TODO: Check if Timestamp works. It is currently not 
supported in Hive
Do we have a negative test calling a hive UDF with a timestamp that would start 
failing if this changed?



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 25 Aug 2020 17:38:45 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-25 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6977/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 25 Aug 2020 08:03:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-25 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

TODO: Testing

Benchmarks:
Used a UDF from TestUdf.java that adds three integers:

  create function tpch15_parquet.sum3(int, int, int) returns int
  location '/test-warehouse/impala-hive-udfs.jar'
  symbol='org.apache.impala.TestUdf';

Used the following query on the master branch and the change's branch:

  set num_nodes=1; set mt_dop=1;
  select min(tpch15_parquet.sum3(cast(l_orderkey as int),
cast(l_partkey as int), cast(l_suppkey as int)))
  from tpch15_parquet.lineitem;

Results averaged over 100 runs after warmup:
Master: 20.6346s, stddev: 0.3132411856765332
This change: 19.0256s, stddev: 0.42039019873436

This is a ~7.8% improvement.

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/codegen-util.h
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
9 files changed, 476 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/10
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-25 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 10:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16314/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16314/8//COMMIT_MSG@11
PS8, Line 11: TODO: Testing
> I believe we should be covering all the data types and codegen enabled/disa
The java tests pass locally, I'll run an exhaustive Jenkins test if this passes 
the initial tests.


http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call-ir.cc
File be/src/exprs/hive-udf-call-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16314/10/be/src/exprs/hive-udf-call-ir.cc@84
PS10, Line 84: if (type->type == TYPE_STRING) {
I implemented string copying here instead of a separate cross-compiled 
'CopyStringValToResultAlloc' function. This way the code is simpler and we have 
one less IR function. On the other hand, the type check becomes a runtime check 
if LLVM doesn't optimise it out. Do you think it might have a performance 
impact? I think even if it is not optimised out it is a predictable branch and 
the cost should not be significant compared to the call to Java.


http://gerrit.cloudera.org:8080/#/c/16314/8/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/8/be/src/exprs/hive-udf-call.cc@179
PS8, Line 179:
> Cache
Done


http://gerrit.cloudera.org:8080/#/c/16314/8/be/src/exprs/hive-udf-call.cc@278
PS8, Line 278:
> I think this should return int64_t, since that's what we generally use for
I moved it to codegen-util.h, but I'm not sure it is very codegen-related..



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 10
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 25 Aug 2020 07:42:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-24 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 8:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/16314/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16314/8//COMMIT_MSG@11
PS8, Line 11: TODO: Testing
I believe we should be covering all the data types and codegen enabled/disabled 
with test_udf.py but you could double-check.


http://gerrit.cloudera.org:8080/#/c/16314/8//COMMIT_MSG@31
PS8, Line 31: This is a ~7.8% improvement.
Nice


http://gerrit.cloudera.org:8080/#/c/16314/8/be/src/exprs/hive-udf-call.h
File be/src/exprs/hive-udf-call.h:

http://gerrit.cloudera.org:8080/#/c/16314/8/be/src/exprs/hive-udf-call.h@146
PS8, Line 146: : executor(nullptr),
nit: initialise these inline where the initial values are constant -it's more 
concise and readable.


http://gerrit.cloudera.org:8080/#/c/16314/8/be/src/exprs/hive-udf-call.h@153
PS8, Line 153: funxtions
nit:functions


http://gerrit.cloudera.org:8080/#/c/16314/8/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/8/be/src/exprs/hive-udf-call.cc@179
PS8, Line 179: Chache
Cache


http://gerrit.cloudera.org:8080/#/c/16314/8/be/src/exprs/hive-udf-call.cc@278
PS8, Line 278: std::size_t GetMemcpySize(const PrimitiveType type) {
I think this should return int64_t, since that's what we generally use for byte 
sizes

Maybe this should be in codegen-util.h?



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 8
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 24 Aug 2020 21:03:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6972/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 8
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 24 Aug 2020 17:37:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-24 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

TODO: Testing

Benchmarks:
Used a UDF from TestUdf.java that adds three integers:

  create function tpch15_parquet.sum3(int, int, int) returns int
  location '/test-warehouse/impala-hive-udfs.jar'
  symbol='org.apache.impala.TestUdf';

Used the following query on the master branch and the change's branch:

  set num_nodes=1; set mt_dop=1;
  select min(tpch15_parquet.sum3(cast(l_orderkey as int),
cast(l_partkey as int), cast(l_suppkey as int)))
  from tpch15_parquet.lineitem;

Results averaged over 100 runs after warmup:
Master: 20.6346s, stddev: 0.3132411856765332
This change: 19.0256s, stddev: 0.42039019873436

This is a ~7.8% improvement.

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
8 files changed, 491 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/8
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 8
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6927/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 7
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 14 Aug 2020 09:45:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-14 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

TODO: Testing
TODO: Benchmarks

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
8 files changed, 471 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/7
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 7
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 6:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/6926/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 14 Aug 2020 08:25:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-14 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

TODO: Testing
TODO: Benchmarks

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
8 files changed, 470 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/6
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 5:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/6897/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 12 Aug 2020 16:43:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/5/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/5/be/src/exprs/hive-udf-call.cc@486
PS5, Line 486: /// call_java:; preds = 
%child_not_null14, %child_null15
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 12 Aug 2020 16:20:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-12 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

TODO: Testing
TODO: Benchmarks

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
8 files changed, 469 insertions(+), 38 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/5
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-12 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 4:

(1 comment)

Added benchmark results on Jira.

http://gerrit.cloudera.org:8080/#/c/16314/4/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/4/be/src/exprs/hive-udf-call.cc@288
PS4, Line 288: return builder->CreateCall(
> I think you should be able to use ToNativePtr() here instead of calling out
Done



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 12 Aug 2020 16:19:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 4:

I took a pass over this, just treating as a WIP. The overall approach seems 
fine. Will wait for testing and benchmarks before doing a more final review.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 11 Aug 2020 23:41:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16314/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16314/4//COMMIT_MSG@9
PS4, Line 9: Implementing codegen for HiveUdfCall.
I guess we still need to remove the final use of GetCodegenedComputeFnWrapper() 
after this patch, right


http://gerrit.cloudera.org:8080/#/c/16314/4/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/4/be/src/exprs/hive-udf-call.cc@288
PS4, Line 288: return builder->CreateCall(
I think you should be able to use ToNativePtr() here instead of calling out to 
these helpers.



-- 
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 11 Aug 2020 23:41:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-10 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16314/3/be/src/exprs/hive-udf-call.cc
File be/src/exprs/hive-udf-call.cc:

http://gerrit.cloudera.org:8080/#/c/16314/3/be/src/exprs/hive-udf-call.cc@66
PS3, Line 66: // TODO: We don't need to check for null, JniUtil::GetJNIEnv 
guarantees that it is a
: // valid pointer.
> Any arguments for keeping the check?
No, it looks like GetJNIEnv() aborts the process if it fails, so this can never 
be reached.



--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 10 Aug 2020 17:02:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-10 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6855/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 10 Aug 2020 14:35:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-10 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

TODO: Testing
TODO: Benchmarks

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
8 files changed, 361 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/4
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16314 )

Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6843/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 08 Aug 2020 19:12:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7658: Proper codegen for HiveUdfCall

2020-08-08 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16314


Change subject: IMPALA-7658: Proper codegen for HiveUdfCall
..

IMPALA-7658: Proper codegen for HiveUdfCall

Implementing codegen for HiveUdfCall.

TODO: Testing
TODO: Benchmarks

Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
8 files changed, 357 insertions(+), 26 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/14/16314/3
--
To view, visit http://gerrit.cloudera.org:8080/16314
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2f994dac550f297ed3c88491816403f237d4d747
Gerrit-Change-Number: 16314
Gerrit-PatchSet: 3
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Daniel Becker