[
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586996#comment-14586996
]
Prasanth Jayachandran commented on HIVE-10940:
----------------------------------------------
The patch mostly looks good, although it would be good to add some debug
logging after each of the null checks. Also, from a simple reference lookup, we
don't seem to be using the textual representation of the filter expression
anywhere, so I don't think we need to set it. If we do need the text
representation, there are methods in PlanUtils to produce it.
[~ashutoshc]/[~gopalv] Any idea why we set the filter expression in text form
in the job conf?
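
To illustrate the logging suggestion, here is a minimal sketch of a debug message after each null check. It is not the actual HiveInputFormat code: the class, the ScanDesc type, the FILTER_KEY property name, and the use of a plain Map in place of the job conf are all hypothetical, and java.util.logging is used only to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical sketch: none of these names are Hive classes or properties.
class PushFiltersLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(PushFiltersLoggingSketch.class.getName());

    // Assumed property name, for illustration only.
    static final String FILTER_KEY = "sketch.filter.expr.serialized";

    /** Stand-in for a table scan descriptor that may carry a serialized filter. */
    static final class ScanDesc {
        final String serializedFilter; // may be null

        ScanDesc(String serializedFilter) {
            this.serializedFilter = serializedFilter;
        }
    }

    /** Returns true if a filter was written to conf; logs the reason otherwise. */
    static boolean pushFilters(Map<String, String> conf, ScanDesc scan) {
        if (scan == null) {
            LOG.fine("Not pushing filters: no scan descriptor");
            return false;
        }
        if (scan.serializedFilter == null) {
            LOG.fine("Not pushing filters: scan descriptor has no filter expression");
            return false;
        }
        conf.put(FILTER_KEY, scan.serializedFilter);
        LOG.fine("Pushed serialized filter of length " + scan.serializedFilter.length());
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // FINE-level messages are hidden under the default logging configuration.
        System.out.println(pushFilters(conf, null));
        System.out.println(pushFilters(conf, new ScanDesc(null)));
        System.out.println(pushFilters(conf, new ScanDesc("abc")));
    }
}
```

With messages like these, an early return is distinguishable from a successful push when debug logging is enabled.
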
> HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader
> call
> ---------------------------------------------------------------------------------
>
> Key: HIVE-10940
> URL: https://issues.apache.org/jira/browse/HIVE-10940
> Project: Hive
> Issue Type: Bug
> Components: File Formats
> Affects Versions: 1.2.0
> Reporter: Gopal V
> Assignee: Sergey Shelukhin
> Attachments: HIVE-10940.patch
>
>
> {code}
> String filterText = filterExpr.getExprString();
> String filterExprSerialized = Utilities.serializeExpression(filterExpr);
> {code}
> The serializeExpression call initializes Kryo and produces a new packed
> object for every split, because it is reached on each call through
> HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters.
> Kryo is very slow at this for a large filter clause.
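
For illustration, the per-split cost described in the quoted issue can be avoided by serializing the expression once and reusing the result. The sketch below shows that general idea only and is not necessarily what the attached patch does. CachedFilterSketch is a hypothetical class, and the serializer is passed in as a plain function rather than calling Utilities.serializeExpression.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical holder, not a Hive class: caches the serialized form of a filter.
class CachedFilterSketch<E> {
    private final E expr;
    private final Function<E, String> serializer;
    private String serialized; // computed on first use, then reused

    CachedFilterSketch(E expr, Function<E, String> serializer) {
        this.expr = expr;
        this.serializer = serializer;
    }

    /** Serializes on the first call only; later calls return the cached string. */
    synchronized String getSerialized() {
        if (serialized == null) {
            serialized = serializer.apply(expr);
        }
        return serialized;
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        CachedFilterSketch<String> filter = new CachedFilterSketch<>(
            "(key > 10)",
            e -> {
                calls.incrementAndGet(); // stands in for the expensive Kryo work
                return "ser:" + e;
            });

        // Simulate one getRecordReader call per split.
        for (int split = 0; split < 3; split++) {
            filter.getSerialized();
        }
        System.out.println("serializer calls: " + calls.get()); // prints "serializer calls: 1"
    }
}
```

The design choice here is to attach the cached string to the object that owns the expression, so the cache lives exactly as long as the expression does.
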
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)