[ https://issues.apache.org/jira/browse/DRILL-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17678473#comment-17678473 ]
ASF GitHub Bot commented on DRILL-8390:
---------------------------------------

cgivre opened a new pull request, #2742:
URL: https://github.com/apache/drill/pull/2742

# [DRILL-8390](https://issues.apache.org/jira/browse/DRILL-8390): Minor Improvements to PDF Reader

## Description
This PR makes some minor improvements to the PDF reader, including:

* Fixes a minor bug where, in certain configurations, the first row of data was skipped.
* Fixes a minor bug where empty tables caused crashes when the spreadsheet extraction algorithm was used.
* Adds a `_table_count` metadata field.
* Adds a `_table_index` metadata field to reflect the current table.

## Documentation
See above. Updated README.

## Testing
Ran existing unit tests. Manually tested against customer data.

> Minor Improvements to PDF Reader
> --------------------------------
>
> Key: DRILL-8390
> URL: https://issues.apache.org/jira/browse/DRILL-8390
> Project: Apache Drill
> Issue Type: Improvement
> Components: Format - PDF
> Reporter: Charles Givre
> Assignee: Charles Givre
> Priority: Major
>
> This PR makes some minor improvements to the PDF reader, including:
> * Fixes a minor bug where, in certain configurations, the first row of data was skipped.
> * Fixes a minor bug where empty tables caused crashes when the spreadsheet extraction algorithm was used.
> * Adds a _table_count metadata field.
> * Adds a _table_index metadata field to reflect the current table.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)