Alex Behm has posted comments on this change. ( http://gerrit.cloudera.org:8080/9534 )
Change subject: IMPALA-6240: [DOCS] Document PARQUET_ARRAY_RESOLUTION query option
......................................................................


Patch Set 2:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/9534/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9534/2//COMMIT_MSG@10
PS2, Line 10: Cherry-picks: not for 2.x
move above Change-Id


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml
File docs/topics/impala_parquet_array_resolution.xml:

http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@21
PS2, Line 21: <concept id="parquet_annotate_strings_utf8" rev="2.6.0 IMPALA-2069">
Wrong refs?


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@47
PS2, Line 47: order of the indexed-based resolution for nested arrays in Parquet.
I don't think "order" is correct here. "depth" would be more appropriate, but probably too confusing

suggestion:
... controls the behavior of the index-based field resolution for nested arrays in Parquet.


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@48
PS2, Line 48: </p>
Mention the relevant query option for name vs. index based resolution:
PARQUET_FALLBACK_SCHEMA_RESOLUTION=position


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@50
PS2, Line 50: <p> In Parquet, you can represent an array using a 2-level or 3-level
We should clearly state that the modern, standard representation is 3-level. The legacy 2-level scheme is supported for compatibility with older Parquet files.


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@51
PS2, Line 51: representation. However, there is no metadata within Parquet files to
However, there is no reliable metadata ...


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@81
PS2, Line 81: All of the above options resolves arrays encoded with a single level.
resolve


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@85
PS2, Line 85: A failure to resolve a schema path with a given array resolution policy
A failure to resolve a column/field reference in a query with a given array resolution ...


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@87
PS2, Line 87: A mismatch might be treated like a missing field, and it is not possible
... like a missing column (returns NULL values), and it is not possible ...


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@89
PS2, Line 89: field' cases.
'legitimately missing column'


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@93
PS2, Line 93: The name-based policy generally does not have the problem of ambiguous
Mention how the name based resolution is set:
PARQUET_FALLBACK_SCHEMA_RESOLUTION=name


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@116
PS2, Line 116: SingleFieldGroupInList {
Maybe you can give this a more documenty-name such as ParquetSchemaExampleA


http://gerrit.cloudera.org:8080/#/c/9534/2/docs/topics/impala_parquet_array_resolution.xml@165
PS2, Line 165: ThriftPrimitiveInList {
ParquetSchemaExampleA?
The "Thrift" stuff in the name might confuse readers


--
To view, visit http://gerrit.cloudera.org:8080/9534
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I12696b609609ea16c05d8b7e84b2bae0be6d6cb5
Gerrit-Change-Number: 9534
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni <arod...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Alex Rodoni <arod...@cloudera.com>
Gerrit-Reviewer: Greg Rahn <gr...@cloudera.com>
Gerrit-Reviewer: John Russell <jruss...@cloudera.com>
Gerrit-Comment-Date: Thu, 08 Mar 2018 05:19:07 +0000
Gerrit-HasComments: Yes