Tianyi Wang has uploaded a new patch set (#6).

Change subject: IMPALA-5573: Add decimal codegen in text scanner
......................................................................

IMPALA-5573: Add decimal codegen in text scanner

This patch adds decimal type codegen support in the text scanner.
Currently, codegen is disabled if there is a decimal column. With this
patch, StringParser::StringToDecimal is called from generated code. A new
file, util/string-parser.cc, is created and linked into libUtil. This file
contains proxy functions to StringToDecimal in order to keep
StringToDecimal out of LLVM IR.

In a benchmark query:

> select l_quantity, l_extendedprice, l_discount, l_tax from biglineitem where
> l_quantity > 100.0;

where biglineitem is tpch.lineitem repeated 6 times, the codegen version is
19% faster than the non-codegen version in scanning, and 8% faster in query
time. Codegen time in this simple case is 69ms.

Simple performance tests show that putting the parser in libUtil instead of
impala-sse.bc reduces codegen time by 2/3 in cases where only one decimal
column is parsed, while the scanning time is nearly the same.

Change-Id: Ia65820e969d59094dc92d912a5279fa90f6b179d
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/exec/hdfs-scanner-ir.cc
M be/src/exec/hdfs-scanner.cc
M be/src/exec/text-converter.cc
M be/src/util/CMakeLists.txt
A be/src/util/string-parser.cc
M be/src/util/string-parser.h
7 files changed, 129 insertions(+), 28 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/83/7683/6
--
To view, visit http://gerrit.cloudera.org:8080/7683
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia65820e969d59094dc92d912a5279fa90f6b179d
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
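
For readers unfamiliar with the proxy-function approach the commit message describes, the following is a minimal sketch of the general pattern, not the code in this patch. It assumes a header-only template parser and wraps it in small non-template functions with stable symbol names. Every name here (StringToDecimalSketch, StringToDecimal4Proxy, StringToDecimal8Proxy) is hypothetical, and the parser is a simplified stand-in that truncates extra fractional digits and does not check for overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a header-only template parser. It parses
// "[+-]digits[.digits]" into an unscaled integer with `scale` fractional
// digits. Simplifications: extra fractional digits are truncated and
// overflow is not detected.
template <typename T>
inline T StringToDecimalSketch(const char* s, int len, int scale, bool* ok) {
  assert(len >= 0 && scale >= 0);
  *ok = false;
  int i = 0;
  bool negative = false;
  if (len > 0 && (s[0] == '-' || s[0] == '+')) {
    negative = (s[0] == '-');
    i = 1;
  }
  T value = 0;
  int frac_digits = 0;
  bool seen_dot = false;
  bool seen_digit = false;
  for (; i < len; ++i) {
    const char c = s[i];
    if (c == '.' && !seen_dot) {
      seen_dot = true;
      continue;
    }
    if (c < '0' || c > '9') return 0;  // invalid character: parse fails
    seen_digit = true;
    if (seen_dot) {
      if (frac_digits == scale) continue;  // truncate digits beyond the scale
      ++frac_digits;
    }
    value = value * 10 + (c - '0');
  }
  if (!seen_digit) return 0;
  // Pad with zeros so the result always carries exactly `scale` digits.
  for (; frac_digits < scale; ++frac_digits) value *= 10;
  *ok = true;
  return negative ? -value : value;
}

// Proxy functions: non-template, with unmangled names. In a real build these
// would live in a .cc file compiled to native code, so that generated code
// only needs a declaration and a call, and the parser body itself never has
// to be emitted as IR and optimized at query time.
extern "C" int32_t StringToDecimal4Proxy(const char* s, int len, int scale,
                                         bool* ok) {
  return StringToDecimalSketch<int32_t>(s, len, scale, ok);
}

extern "C" int64_t StringToDecimal8Proxy(const char* s, int len, int scale,
                                         bool* ok) {
  return StringToDecimalSketch<int64_t>(s, len, scale, ok);
}
```

Under these assumptions, calling StringToDecimal8Proxy("12.5", 4, 2, &ok) yields the unscaled value 1250 with ok set to true. The trade-off is the one the commit message measures: the call into native code cannot be inlined into the generated scanner, but codegen time drops because the parser is not compiled per query.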
