[ 
https://issues.apache.org/jira/browse/DRILL-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242279#comment-16242279
 ] 

Arina Ielchiieva commented on DRILL-5106:
-----------------------------------------

The following improvements will be implemented in the scope of DRILL-5941:
a. the fileFormats set will be removed from SkipRecordsInspector;
b. the skip-header-count logic will be applied only once, during reader 
initialization;
c. when no footer needs to be skipped, records will be processed directly, 
without buffering them in a queue.
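
To illustrate points (b) and (c), here is a minimal sketch in Java. It is not 
the Drill implementation: the class SkipSketch, the method read, and its 
parameters are hypothetical names chosen for illustration, and records are 
assumed to be non-null. It skips the header once up front, streams records 
straight through when there is no footer, and uses a bounded queue only when 
trailing records have to be held back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical sketch; not the actual Drill SkipRecordsInspector. */
public class SkipSketch {

  /**
   * Returns the records left after dropping headerCount leading records
   * and footerCount trailing records. Records are assumed to be non-null.
   */
  public static <T> List<T> read(Iterator<T> records, int headerCount, int footerCount) {
    // (b) Skip the header once, up front, instead of re-checking per record.
    for (int i = 0; i < headerCount && records.hasNext(); i++) {
      records.next();
    }

    List<T> out = new ArrayList<>();

    // (c) No footer to skip: pass records straight through, no queue involved.
    if (footerCount <= 0) {
      while (records.hasNext()) {
        out.add(records.next());
      }
      return out;
    }

    // Footer case: hold back the last footerCount records in a bounded queue.
    Deque<T> pending = new ArrayDeque<>(footerCount + 1);
    while (records.hasNext()) {
      pending.addLast(records.next());
      if (pending.size() > footerCount) {
        // The oldest queued record can no longer be part of the footer.
        out.add(pending.removeFirst());
      }
    }
    // Whatever remains in the queue is the footer and is dropped.
    return out;
  }
}
```

The queue never holds more than footerCount + 1 records, so the buffering 
cost is paid only by tables that actually declare a footer.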

> Refactor SkipRecordsInspector to exclude check for predefined file formats
> --------------------------------------------------------------------------
>
>                 Key: DRILL-5106
>                 URL: https://issues.apache.org/jira/browse/DRILL-5106
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Storage - Hive
>    Affects Versions: 1.9.0
>            Reporter: Arina Ielchiieva
>            Assignee: Arina Ielchiieva
>            Priority: Minor
>
> After the changes introduced in DRILL-4982, SkipRecordsInspector is used only 
> for predefined formats (using hasHeaderFooter: false / true). However, 
> SkipRecordsInspector still has its own check for the formats to which the 
> skip strategy can be applied. The acceptable file formats are stored in 
> private final Set<Object> fileFormats, which is initialized in the 
> constructor and currently contains only one format, TextInputFormat. This 
> check is now redundant and may cause a hasHeaderFooter setting of true to be 
> ignored for any format other than Text.
> To do:
> 1. remove private final Set<Object> fileFormats
> 2. remove if block from SkipRecordsInspector.retrievePositiveIntProperty:
> {code}
> if (!fileFormats.contains(tableProperties.get(hive_metastoreConstants.FILE_INPUT_FORMAT))) {
>   return propertyIntValue;
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
