[ 
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14586996#comment-14586996
 ] 

Prasanth Jayachandran commented on HIVE-10940:
----------------------------------------------

Patch mostly looks good, although it would be good to add some debug logging 
after each of the null checks. Also, from a simple reference lookup, we don't 
seem to be using the textual representation of the filter expression anywhere, 
so I don't think we need to set it. If we do need a text representation, there 
are methods in PlanUtils to produce one.
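
To illustrate the logging suggestion, here is a minimal sketch of debug logging 
after each null check. It is not taken from the attached patch: the class name, 
the pushFilters signature, the plain Map standing in for the job conf, and the 
"sketch.filter.expr.serialized" key are all hypothetical, and it uses 
java.util.logging only to stay self-contained.

```java
import java.util.Map;
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical sketch; none of these names come from the Hive code base. */
public class PushFiltersLoggingSketch {
    private static final Logger LOG =
        Logger.getLogger(PushFiltersLoggingSketch.class.getName());

    /** Hypothetical conf key used only in this sketch. */
    public static final String FILTER_KEY = "sketch.filter.expr.serialized";

    /**
     * Pushes an already-serialized filter into the conf.
     * Returns true if the filter was set, false if a null check skipped it.
     */
    public static boolean pushFilters(Map<String, String> conf,
                                      String scanAlias,
                                      String serializedFilter) {
        if (scanAlias == null) {
            // Log why we bail out so a skipped pushdown is visible at debug level.
            LOG.fine("Not pushing filters: no table scan to push them to");
            return false;
        }
        if (serializedFilter == null) {
            LOG.log(Level.FINE,
                "Not pushing filters for {0}: filter expression is null", scanAlias);
            return false;
        }
        conf.put(FILTER_KEY, serializedFilter);
        LOG.log(Level.FINE, "Pushed filter for {0}", scanAlias);
        return true;
    }
}
```

Only the serialized form is written to the conf here; the text form is left out, 
in line with the point above that nothing appears to read it.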

[~ashutoshc]/[~gopalv] Any idea why we set the filter expression in text form 
in the job conf?

> HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader 
> call
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-10940
>                 URL: https://issues.apache.org/jira/browse/HIVE-10940
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 1.2.0
>            Reporter: Gopal V
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-10940.patch
>
>
> {code}
>     String filterText = filterExpr.getExprString();
>     String filterExprSerialized = Utilities.serializeExpression(filterExpr);
> {code}
> The serializeExpression call initializes Kryo and produces a new packed object 
> for every split. The call path is
> HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters,
> and Kryo is very slow at this for a large filter clause.
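
One way to avoid the repeated work described above is to serialize the filter 
once and reuse the result for every split. The sketch below shows that idea 
only; it is not necessarily what the attached patch does. ScanDesc, the 
Base64-based serializeExpression stand-in, the Map used as a job conf, and the 
conf key are hypothetical and do not use Hive or Kryo APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of caching a serialized filter; not Hive code. */
public class CachedFilterSketch {
    /** Counts how often the (stand-in) expensive serializer runs. */
    public static int serializeCalls = 0;

    /** Stand-in for an expensive serializer such as a Kryo-based one. */
    public static String serializeExpression(String filterExpr) {
        serializeCalls++;
        return Base64.getEncoder()
            .encodeToString(filterExpr.getBytes(StandardCharsets.UTF_8));
    }

    /** Hypothetical scan descriptor that caches its serialized filter. */
    public static final class ScanDesc {
        private final String filterExpr;
        private String serializedFilter; // filled in lazily, at most once

        public ScanDesc(String filterExpr) {
            this.filterExpr = filterExpr;
        }

        public synchronized String getSerializedFilter() {
            if (filterExpr == null) {
                return null;
            }
            if (serializedFilter == null) {
                serializedFilter = serializeExpression(filterExpr);
            }
            return serializedFilter;
        }
    }

    /** Called once per split; reuses the cached serialized form. */
    public static void pushFilters(Map<String, String> conf, ScanDesc scan) {
        String serialized = scan.getSerializedFilter();
        if (serialized == null) {
            return; // nothing to push
        }
        conf.put("sketch.filter.expr.serialized", serialized);
    }

    public static void main(String[] args) {
        ScanDesc scan = new ScanDesc("(key > 10) and (value like 'a%')");
        for (int split = 0; split < 1000; split++) {
            pushFilters(new HashMap<>(), scan); // fresh conf per record reader
        }
        System.out.println("serialize calls: " + serializeCalls); // prints "serialize calls: 1"
    }
}
```

The cache lives on the descriptor rather than in the conf because, in this 
sketch, each record reader gets its own conf while the descriptor is shared.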


