Zihao Ye has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22289 )

Change subject: IMPALA-12927: Support specifying format for reading JSON BINARY 
columns
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/22289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/22289/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@572
PS1, Line 572:
> Sadly this is still not a proper solution, as different tables used in the
I read that example, and propagating partition properties to the backend via 
HdfsStorageDescriptor is feasible. However, it seems that the members should 
remain immutable to avoid breaking the role of the INTERNER. Therefore, I am 
not sure if it is appropriate to include the jsonBinaryFormat_ as a property in 
it, because we might need to modify the jsonBinaryFormat_ based on query 
options during the planning phase. Of course, this format determination can 
also be done at the backend, in which case the jsonBinaryFormat_ in 
HdfsStorageDescriptor can be determined solely based on metadata. What do you 
think?


http://gerrit.cloudera.org:8080/#/c/22289/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/22289/3/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@570
PS3, Line 570:     String specificFormat =
             :         
tbl_.getMetaStoreTable().getParameters().get(JSON_BINARY_FORMAT);
> I am concerned that we are not completely correct here.
I briefly looked at the Hive code, and it does indeed merge table and SerDe 
properties for lookup, with SerDe properties having higher priority and able to 
override table properties with the same name. Currently, it seems that Impala 
only considers SerDe properties. Perhaps we need to modify 
HdfsStorageDescriptor.fromStorageDescriptor() to respect table properties as 
well? However, this modification should probably be beyond the scope of this 
patch, and it would be more reasonable to address it within IMPALA-13748.



--
To view, visit http://gerrit.cloudera.org:8080/22289
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idf61fa3afc0f33caa63fbc05393e975733165e82
Gerrit-Change-Number: 22289
Gerrit-PatchSet: 3
Gerrit-Owner: Zihao Ye <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Zihao Ye <[email protected]>
Gerrit-Comment-Date: Mon, 24 Feb 2025 09:05:17 +0000
Gerrit-HasComments: Yes

Reply via email to