[
https://issues.apache.org/jira/browse/PARQUET-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated PARQUET-1826:
------------------------------------
Labels: pull-request-available (was: )
> Document hadoop configuration options
> -------------------------------------
>
> Key: PARQUET-1826
> URL: https://issues.apache.org/jira/browse/PARQUET-1826
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Reporter: Gabor Szadovszky
> Assignee: Walid Gara
> Priority: Major
> Labels: pull-request-available
>
> The currently available hadoop configuration options are not documented
> properly. The only documentation we have is the javadoc comment and the
> implementation of
> [ParquetOutputFormat|https://github.com/apache/parquet-mr/blob/master/parquet-hadoop/src/main/java/org/apache/parquet/hadoop/ParquetOutputFormat.java].
> We shall investigate all the possible options and their usage/default values
> and document them properly in a way that is easily accessible to our users.
> I would suggest creating a `README.md` file in the sub-module
> [parquet-hadoop|https://github.com/apache/parquet-mr/tree/master/parquet-hadoop]
> that would describe the purpose of the module and would have a section that
> lists the possible hadoop configuration options. (Later on we shall extend
> this document with other descriptions about the purpose and usage of our
> library in the hadoop ecosystem. These efforts shall be covered by other
> jiras.)
> By adding the description to the source code, it would be easy to extend it
> with the new features we implement, so it would stay up to date for every release.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)