[ https://issues.apache.org/jira/browse/IMPALA-9747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113335#comment-17113335 ]
Tim Armstrong commented on IMPALA-9747:
---------------------------------------

I think it's generally fine for the codegen code to call out to an interpreted codepath to handle cases like that; we do that for exprs with GetCodegendComputeFnWrapper(). I think the approach taken in the past of returning errors and trying to bail out of codegen was a bug factory and led to a bunch of problems like what you described.

> More fine-grained codegen for text file scanners
> ------------------------------------------------
>
>                 Key: IMPALA-9747
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9747
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Csaba Ringhofer
>            Assignee: Daniel Becker
>            Priority: Major
>
> Currently, if the materialization of any column cannot be codegend for some
> reason (e.g. it is CHAR(N)), then the whole codegen is cancelled for the text
> scanner, see:
> https://github.com/apache/impala/blob/b5805de3e65fd1c7154e4169b323bb38ddc54f4f/be/src/exec/text-converter.cc#L112
> https://github.com/apache/impala/blob/58273fff601dcc763ac43f7cc275a174a2e18b6b/be/src/exec/hdfs-scanner.cc#L342
> It would be much better to use the non-codegend path only for the problematic
> columns, use the codegend materialization for the rest, and always do
> conjunct evaluation with codegen.
> The codegend path orders slots based on the conjuncts that use them and
> evaluates each conjunct as soon as the slots it needs become available, so if
> the row is dropped then the rest of the slots do not need to be materialized.
> A simple solution would be to always do non-codegend slot materialization
> first so that those slots are ready if a conjunct needs them. Moving the
> columns that are not used by conjuncts to the end could be a further
> optimization.
> This came up during the materialization of BINARY columns, which need
> base64 decoding during materialization.
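
The slot ordering proposed in the issue description can be illustrated with a minimal C++ sketch. This is not Impala code: SlotInfo, MaterializationRank() and OrderSlots() are hypothetical names invented for the example, and the three-way ranking is one reading of the description (interpreted slots needed by conjuncts first, codegend slots needed by conjuncts next, everything no conjunct reads last).

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical slot description for illustration; not Impala's SlotDescriptor.
struct SlotInfo {
  std::string name;
  bool codegen_supported;  // false for types such as CHAR(N)
  bool used_by_conjunct;   // true if at least one conjunct reads this slot
};

// Assumed ranking, lower rank is materialized earlier:
//   0: interpreted (non-codegend) slots that a conjunct needs
//   1: codegend slots that a conjunct needs
//   2: slots no conjunct reads; they can be skipped if the row is dropped
int MaterializationRank(const SlotInfo& slot) {
  if (!slot.used_by_conjunct) return 2;
  return slot.codegen_supported ? 1 : 0;
}

// Returns the slots in the assumed materialization order. The stable sort
// keeps the original relative order of slots that share a rank.
std::vector<SlotInfo> OrderSlots(std::vector<SlotInfo> slots) {
  std::stable_sort(slots.begin(), slots.end(),
                   [](const SlotInfo& a, const SlotInfo& b) {
                     return MaterializationRank(a) < MaterializationRank(b);
                   });
  return slots;
}
```

A real implementation would additionally order the conjunct-used slots by which conjunct consumes them, as the existing codegend path is described as doing; the sketch only shows where the interpreted slots would be placed.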