Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5573: Add decimal codegen in text scanner
......................................................................


Patch Set 3:

(8 comments)

Code change looks good, just a few cleanup requests.

http://gerrit.cloudera.org:8080/#/c/7683/2//COMMIT_MSG
Commit Message:

Line 14: StringToDecimal out of LLVM IR.
Can you add a line break to separate the paragraphs to make it more readable.


PS2, Line 15: libUtil.so
nit: libUtil, since we use static linking too.


Line 19: 
Did you do an experiment to see how much scanning sped up for a larger table? 
tpch.lineitem has a few decimal columns. I often copy the data a few times for 
scanner tests like this to get good measurements.

  create table biglineitem as select * from lineitem;
  insert into biglineitem select * from lineitem;
  insert into biglineitem select * from lineitem;

I think you'll see the biggest speedup if you have a condition in the where 
clause, because codegen of that condition also gets disabled. E.g.

  select l_quantity, l_extendedprice, l_discount, l_tax from biglineitem where 
l_quantity > 100.0;


http://gerrit.cloudera.org:8080/#/c/7683/2/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

Line 134: 
> I'm sorry it works without this. I have no idea how I came to that conclusi
Good to know :) I was worried something subtle was going on. There are some 
places in the code where we resort to hacks to force modules to be linked into 
the output binary: 
https://github.com/apache/incubator-impala/blob/294d42adc117046f975665834af03ddaa53ec27e/be/src/exprs/scalar-expr-evaluator.cc#L428

Once the functions are linked in LLVM by default falls back to resolving 
functions in the binary's dynamic symbol table.


Line 134: 
> Without this it crashes with "Program used external function ... which coul
Maybe when you rebuilt Impala you hadn't updated the CMakeLists.txt, or the 
change hadn't taken affect. That would explain this.


http://gerrit.cloudera.org:8080/#/c/7683/2/be/src/exec/hdfs-scanner-ir.cc
File be/src/exec/hdfs-scanner-ir.cc:

Line 131:   if (*result != ParseResult::PARSE_SUCCESS) *result = 
ParseResult::PARSE_FAILURE;
> The IR generated by Clang still reads decimalblah and we will need a bitcas
Ah ok, makes sense.

I think it would also be ok to return Decimal*Value and bitcast 
unconditionally, but this works for me.


http://gerrit.cloudera.org:8080/#/c/7683/2/be/src/exec/text-converter.cc
File be/src/exec/text-converter.cc:

Line 234:           case 16:
> Done. This is copied from TextConverter::WriteSlot and there is an unnecess
Oh interesting :) Not sure what that was there.


http://gerrit.cloudera.org:8080/#/c/7683/3/be/src/exec/text-converter.cc
File be/src/exec/text-converter.cc:

PS3, Line 292: proper_slot
maybe "cast_slot". It's unclear what "proper" means.


-- 
To view, visit http://gerrit.cloudera.org:8080/7683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia65820e969d59094dc92d912a5279fa90f6b179d
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to