[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-6578: - Labels: orcfile (was: TODOC13 orcfile) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Fix For: 0.13.0 Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lefty Leverenz updated HIVE-6578: - Labels: TODOC13 orcfile (was: orcfile) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: TODOC13, orcfile Fix For: 0.13.0 Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sergey Shelukhin updated HIVE-6578: --- Resolution: Fixed Fix Version/s: 0.13.0 Status: Resolved (was: Patch Available) committed to trunk and 13 Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Fix For: 0.13.0 Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6578: - Attachment: HIVE-6578.4.patch.txt reuploading for jenkins Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch, HIVE-6578.4.patch.txt ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6578: - Attachment: HIVE-6578.4.patch Thanks [~leftylev] for the doc comment. Fixed it. Thanks [~sershe] for the code review. Addressed your concerns in this patch. Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch, HIVE-6578.4.patch ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6578: - Summary: Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command (was: Use ORC file footer statistics for analyze command) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6578: - Description: ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. (was: ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size.) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (HIVE-6578) Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command
[ https://issues.apache.org/jira/browse/HIVE-6578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth J updated HIVE-6578: - Attachment: HIVE-6578.3.patch Addressed [~sershe]'s review comments. Had offline discussion with [~owen.omalley] . Addressed Owen's comment as well. Following are the changes in this patch 1) Added new StatsProvidingRecordReader (similar to StatsProvidingRecordWriter) interface which will be used to get stats from the record reader. This works even if partition contains data written with different file format. 2) Removed ORC specific references from StatsNoJobTask. It should not work with any file formats that implement StatsProvidingRecordReader interface. Use ORC file footer statistics through StatsProvidingRecordReader interface for analyze command --- Key: HIVE-6578 URL: https://issues.apache.org/jira/browse/HIVE-6578 Project: Hive Issue Type: New Feature Affects Versions: 0.13.0 Reporter: Prasanth J Assignee: Prasanth J Labels: orcfile Attachments: HIVE-6578.1.patch, HIVE-6578.2.patch, HIVE-6578.3.patch ORC provides file level statistics which can be used in analyze partialscan and noscan cases to compute basic statistics like number of rows, number of files, total file size and raw data size. On the writer side, a new interface was added earlier (StatsProvidingRecordWriter) that exposed stats when writing a table. Similarly, a new interface StatsProvidingRecordReader can be added which when implemented should provide stats that are gathered by the underlying file format. -- This message was sent by Atlassian JIRA (v6.2#6252)