[jira] [Updated] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

2015-07-06 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10940:

Assignee: (was: Sergey Shelukhin)

 HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader 
 call
 -

 Key: HIVE-10940
 URL: https://issues.apache.org/jira/browse/HIVE-10940
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.2.0
Reporter: Gopal V
 Fix For: 2.0.0

 Attachments: HIVE-10940.01.patch, HIVE-10940.02.patch, 
 HIVE-10940.03.patch, HIVE-10940.patch


 {code}
 String filterText = filterExpr.getExprString();
 String filterExprSerialized = Utilities.serializeExpression(filterExpr);
 {code}
 Utilities.serializeExpression initializes Kryo and produces a new packed 
 object for every split.
 Call path: HiveInputFormat::getRecordReader -> pushProjectionAndFilters -> pushFilters.
 Kryo is very slow at this for a large filter clause.
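
 To illustrate the cost pattern described above, here is a minimal sketch of
 one generic way to avoid repeating an expensive serialization for every
 split: memoize the result per filter. This is not the approach taken by the
 attached patches and it does not use Hive's API. SerializedFilterCache, its
 get method, and the stand-in serializer are hypothetical names introduced
 only for this example; the stand-in plays the role of
 Utilities.serializeExpression.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/**
 * Hypothetical helper (not part of Hive): caches the serialized form of a
 * filter expression so the expensive serializer runs once per distinct key
 * instead of once per split.
 */
final class SerializedFilterCache<E> {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<E, String> serializer;

    SerializedFilterCache(Function<E, String> serializer) {
        this.serializer = serializer;
    }

    /** Returns the cached serialization for key, computing it on first use. */
    String get(String key, E expr) {
        return cache.computeIfAbsent(key, k -> serializer.apply(expr));
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        // Stand-in for an expensive serializer; it only counts invocations.
        SerializedFilterCache<String> cache = new SerializedFilterCache<>(e -> {
            calls.incrementAndGet();
            return "serialized(" + e + ")";
        });
        // Simulate 1000 splits that all push down the same filter.
        for (int split = 0; split < 1000; split++) {
            cache.get("(x > 10)", "(x > 10)");
        }
        System.out.println(calls.get()); // prints 1
    }
}
```

 In this sketch the filter text serves as the cache key, which assumes that
 two expressions with the same text serialize identically.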



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

2015-07-06 Thread Gopal V (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gopal V updated HIVE-10940:
---
Assignee: Gunther Hagleitner

 HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader 
 call
 -

 Key: HIVE-10940
 URL: https://issues.apache.org/jira/browse/HIVE-10940
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Gunther Hagleitner
 Fix For: 2.0.0

 Attachments: HIVE-10940.01.patch, HIVE-10940.02.patch, 
 HIVE-10940.03.patch, HIVE-10940.patch




[jira] [Updated] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

2015-06-30 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10940:
--
Attachment: HIVE-10940.02.patch

Patch .02 sets the expr as a physical optimization. This should avoid the 
overhead and do the work only after dynamic partition pruning (DPP) is done. 
I'm wondering whether I can then unset the filter altogether (in the table scan).

 HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader 
 call
 -

 Key: HIVE-10940
 URL: https://issues.apache.org/jira/browse/HIVE-10940
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Fix For: 2.0.0

 Attachments: HIVE-10940.01.patch, HIVE-10940.02.patch, 
 HIVE-10940.patch




[jira] [Updated] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

2015-06-30 Thread Gunther Hagleitner (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gunther Hagleitner updated HIVE-10940:
--
Attachment: HIVE-10940.03.patch

Thanks [~sershe]. Addressed comments in 03. I had forgotten to handle the 
filter object. Other than that, I've added the requested comment and deleted 
the copy/paste comment.

 HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader 
 call
 -

 Key: HIVE-10940
 URL: https://issues.apache.org/jira/browse/HIVE-10940
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Fix For: 2.0.0

 Attachments: HIVE-10940.01.patch, HIVE-10940.02.patch, 
 HIVE-10940.03.patch, HIVE-10940.patch




[jira] [Updated] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

2015-06-15 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10940:

Attachment: HIVE-10940.01.patch

 HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader 
 call
 -

 Key: HIVE-10940
 URL: https://issues.apache.org/jira/browse/HIVE-10940
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Attachments: HIVE-10940.01.patch, HIVE-10940.patch




[jira] [Updated] (HIVE-10940) HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader call

2015-06-12 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-10940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-10940:

Attachment: HIVE-10940.patch

trunk patch

 HiveInputFormat::pushFilters serializes PPD objects for each getRecordReader 
 call
 -

 Key: HIVE-10940
 URL: https://issues.apache.org/jira/browse/HIVE-10940
 Project: Hive
  Issue Type: Bug
  Components: File Formats
Affects Versions: 1.2.0
Reporter: Gopal V
Assignee: Sergey Shelukhin
 Attachments: HIVE-10940.patch

