[ https://issues.apache.org/jira/browse/DRILL-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16952925#comment-16952925 ]
ASF GitHub Bot commented on DRILL-7177:
---------------------------------------

arina-ielchiieva commented on pull request #1749: DRILL-7177: Format Plugin for Excel Files
URL: https://github.com/apache/drill/pull/1749#discussion_r335541229

 ##########
 File path: contrib/format-excel/README.md
 ##########
 @@ -0,0 +1,36 @@
 +# Excel Format Plugin
 +This plugin enables Drill to read Microsoft Excel files. This format is best used with Excel files that do not have extensive formatting, however it will work with formatted files, by allowing you to define a region within the file where the data is.
 +
 +The plugin will automatically evaluate cells which contain formulae.
 +
 +## Plugin Configuration
 +This plugin has several configuration variables which must be set in order to read Excel files effectively. Since Excel files often contain other elements besides data, you can use the configuration variables to define a region within your spreadsheet in which Drill should extract data. This is potentially useful if your spreadsheet contains a lot of formatting or other complications.
 +
 +* `headerRow`: Set to -1 if there are no column headers.
 +* `lastRow`: This defines the last row of your data. The default is an arbitrary large number. You only will need to set this if you want Drill to stop reading at a specific location.
 +* `sheetName`: This is the name of the sheet you want to query. This will default to the first sheet in the file if left undefined.
 +* `firstColumn`: If you want to define a region within a spreadsheet, this is the left-most column index. This is indexed from one. If set to `0` Drill will start at the left most column.
 +* `lastColumn`: If you want to define a region within a spreadsheet, this is the right-most column index. This is indexed from one. If set to `0` Drill will read all available columns. This is not inclusive, so if you ask for columns 2-5 you will get columns 2,3 and 4.
 +
 +## Usage
 +You can specify the configuration at runtime via the `table()` method or in the storage plugin configuration. For instance, if you just want to query an Excel file, you could execute the query as follows:
 +
 +```
 +SELECT <fields>
 +FROM dfs.`somefile.xlsx`
 +```
 +
 +If you wanted to query a different sheet other than the default, use the `table()` method as shown below:

 Review comment:
 ```suggestion
 If you wanted to query a different sheet other than the default, use the `table()` function as shown below:
 ```

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Format Plugin for Excel Files
> -----------------------------
>
> Key: DRILL-7177
> URL: https://issues.apache.org/jira/browse/DRILL-7177
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.17.0
> Reporter: Charles Givre
> Assignee: Charles Givre
> Priority: Major
> Labels: doc-impacting
> Fix For: 1.17.0
>
>
> This pull request adds the functionality which enables Drill to query
> Microsoft Excel files.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)