Zihao Ye has posted comments on this change. ( http://gerrit.cloudera.org:8080/22289 )
Change subject: IMPALA-12927: Support specifying format for reading JSON BINARY columns ...................................................................... Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/22289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/22289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@572 PS1, Line 572: > Sadly this is still not a proper solution, as different tables used in the I read that example, and propagating partition properties to the backend via HdfsStorageDescriptor is feasible. However, it seems that the members should remain immutable to avoid breaking the role of the INTERNER. Therefore, I am not sure if it is appropriate to include the jsonBinaryFormat_ as a property in it, because we might need to modify the jsonBinaryFormat_ based on query options during the planning phase. Of course, this format determination can also be done at the backend, in which case the jsonBinaryFormat_ in HdfsStorageDescriptor can be determined solely based on metadata. What do you think? http://gerrit.cloudera.org:8080/#/c/22289/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java: http://gerrit.cloudera.org:8080/#/c/22289/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@570 PS3, Line 570: String specificFormat = : tbl_.getMetaStoreTable().getParameters().get(JSON_BINARY_FORMAT); > I am concerned that we are not completely correct here. I briefly looked at the Hive code, and it does indeed merge table and SerDe properties for lookup, with SerDe properties having higher priority and able to override table properties with the same name. Currently, it seems that Impala only considers SerDe properties. Perhaps we need to modify HdfsStorageDescriptor.fromStorageDescriptor() to respect table properties as well? However, this modification should probably be beyond the scope of this patch, and it would be more reasonable to address it within IMPALA-13748. -- To view, visit http://gerrit.cloudera.org:8080/22289 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idf61fa3afc0f33caa63fbc05393e975733165e82 Gerrit-Change-Number: 22289 Gerrit-PatchSet: 3 Gerrit-Owner: Zihao Ye <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Zihao Ye <[email protected]> Gerrit-Comment-Date: Mon, 24 Feb 2025 09:05:17 +0000 Gerrit-HasComments: Yes
