[ https://issues.apache.org/jira/browse/PARQUET-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17768418#comment-17768418 ]
ASF GitHub Bot commented on PARQUET-2347:
-----------------------------------------

amousavigourabi commented on code in PR #1141:
URL: https://github.com/apache/parquet-mr/pull/1141#discussion_r1335214709


##########
parquet-hadoop/src/main/java/org/apache/parquet/hadoop/InternalParquetRecordReader.java:
##########

@@ -167,13 +169,13 @@ public float getProgress() throws IOException, InterruptedException {

   public void initialize(ParquetFileReader reader, ParquetReadOptions options) {
     // copy custom configuration to the Configuration passed to the ReadSupport
-    Configuration conf = new Configuration();
-    if (options instanceof HadoopReadOptions) {
-      conf = ((HadoopReadOptions) options).getConf();
-    }
+    ParquetConfiguration conf = Objects.requireNonNull(options).getConfiguration();
     for (String property : options.getPropertyNames()) {
       conf.set(property, options.getProperty(property));
     }
+    for (Map.Entry<String, String> property : new Configuration()) {

Review Comment:
   On the Hadoop-specific stuff, I agree that it would be a bit silly. However, this class `InternalParquetRecordReader` is part of the read/write API that we're trying to address here. That the read/write API is part of parquet-hadoop is somewhat unfortunate, but changing it at this point would break stuff.

> Add interface layer between Parquet and Hadoop Configuration
> ------------------------------------------------------------
>
>                 Key: PARQUET-2347
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2347
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Atour Mousavi Gourabi
>            Priority: Minor
>
> Parquet relies heavily on a few Hadoop classes, such as its Configuration
> class, which is used throughout Parquet's reading and writing logic. If we
> include our own interface for this, this could potentially allow users to use
> Parquet's readers and writers without the Hadoop dependency later on.
> In order to preserve backward compatibility and avoid breaking downstream
> projects, the constructors and methods using Hadoop's Configuration should be
> preserved for the time being, though I would favour deprecation in the near
> future.
> This is part of an effort that has been [discussed on the dev mailing
> list|https://lists.apache.org/thread/4wl0l3d9dkpx4w69jx3rwnjk034dtqr8].



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
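For context, the interface layer this issue proposes can be sketched roughly as follows. `ParquetConfiguration` is the name visible in the diff above, but the method set shown here and the names `ConfigLike` and `MapConfig` are illustrative assumptions for this sketch, not the actual parquet-mr API.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Illustrative sketch of a configuration abstraction in the spirit of
// PARQUET-2347; the real ParquetConfiguration in parquet-mr differs in detail.
interface ConfigLike extends Iterable<Map.Entry<String, String>> {
  void set(String name, String value);

  String get(String name);
}

// A plain, Hadoop-free implementation backed by a HashMap. A Hadoop-backed
// sibling (hypothetical here) would instead delegate every call to an
// underlying org.apache.hadoop.conf.Configuration, which is how the
// existing Hadoop-based constructors and methods could keep working.
class MapConfig implements ConfigLike {
  private final Map<String, String> props = new HashMap<>();

  @Override
  public void set(String name, String value) {
    props.put(name, value);
  }

  @Override
  public String get(String name) {
    return props.get(name);
  }

  @Override
  public Iterator<Map.Entry<String, String>> iterator() {
    return props.entrySet().iterator();
  }
}
```

With an abstraction along these lines, code like the `initialize` method in the diff can copy properties through the interface without caring whether the backing store is Hadoop's `Configuration` or a plain map, which is what would eventually allow readers and writers to run without the Hadoop dependency.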