arina-ielchiieva commented on a change in pull request #1749: DRILL-7177: Format Plugin for Excel Files
URL: https://github.com/apache/drill/pull/1749#discussion_r339342000
##########
File path: contrib/format-excel/README.md
##########
@@ -0,0 +1,59 @@
+# Excel Format Plugin
+This plugin enables Drill to read Microsoft Excel files. This format is best used with Excel files that do not have extensive formatting, however it will work with formatted files, by allowing you to define a region within the file where the data is.
+
+The plugin will automatically evaluate cells which contain formulae.
+
+## Plugin Configuration
+This plugin has several configuration variables which must be set in order to read Excel files effectively. Since Excel files often contain other elements besides data, you can use the configuration variables to define a region within your spreadsheet in which Drill should extract data. This is potentially useful if your spreadsheet contains a lot of formatting or other complications.
+
+### Configuration Options:
+The most basic configuration is simply to add the following to a file based storage plugin:
+```
+"excel": {
+ "type": "excel"
+ }
+```
+The plugin has many other configuration options listed below:
+
+* `headerRow`: Set to `-1` if there are no column headers. Defaults to `0`. If the data does not have a header row, Drill will assign column names of `field_n` for each column
+. If the sheet starts with a series of empty rows, Drill will ignore these empty rows, so there is no need to set the `headerRow` in that case.
+* `lastRow`: This defines the last row of data. The default is 1048576 which is the theoretical row limit for Excel files. It is only necessary to set this if you want Drill to
+ stop reading at a specific location.
+* `sheetName`: This is the name of the sheet which Drill will query. This will default to the first sheet in the file if left undefined. Drill will throw an exception if the

Review comment:
@cgivre I still see two spaces between sentences, please fix.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.