Mengran Lan created SPARK-45334:
-----------------------------------

             Summary: Remove misleading comment in parquetSchemaConverter
                 Key: SPARK-45334
                 URL: https://issues.apache.org/jira/browse/SPARK-45334
             Project: Spark
          Issue Type: Documentation
          Components: SQL
    Affects Versions: 3.5.0
            Reporter: Mengran Lan


While debugging a Parquet issue, I was reading the Spark code as a reference and 
found a misleading comment, which is still present in the latest version.
{code:java}
Types
  .buildGroup(repetition).as(LogicalTypeAnnotation.listType())
  .addField(Types
    .buildGroup(REPEATED)
    // "array" is the name chosen by parquet-hive (1.7.0 and prior version)
    .addField(convertField(StructField("array", elementType, nullable)))
    .named("bag"))
  .named(field.name)
{code}
The comment above is misleading, since Hive always uses "array_element" as the 
name.
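
For concreteness, the builder call above should produce a Parquet type along the following lines. This is only a sketch: the field name "f", the int32 element type, and the optional repetitions (i.e. a nullable field with nullable elements) are assumptions chosen for illustration, not values taken from the Spark source.
{code}
optional group f (LIST) {
  repeated group bag {
    optional int32 array;
  }
}
{code}
With Hive's naming as described above, the innermost field would be called "array_element" rather than "array"; the rest of the layout is expected to be the same.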

The comment was introduced by this PR [https://github.com/apache/spark/pull/14399], 
which relates to this issue: https://issues.apache.org/jira/browse/SPARK-16777

Furthermore, the parquet-hive module has been removed from the parquet-mr 
project https://issues.apache.org/jira/browse/PARQUET-1676 

I suggest removing this comment and will submit a PR later.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)