Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22456 )

Change subject: IMPALA-13739: Part2: Introduce IcebergFileDescriptor
......................................................................


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/22456/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/22456/3//COMMIT_MSG@8
PS3, Line 8:
> I think it would be good to shortly describe 'IcebergFileDescriptor' here,
Done


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@710
PS3, Line 710: Hdfs
> In the review of Part1 there was a comment (https://gerrit.cloudera.org/#/c
That was a different story in Part1: FileDescriptor had parts that were 
relevant to Iceberg tables too, so that was another reason we couldn't call it 
HdfsFileDescriptor. With this change, technically we could call FileDescriptor 
HdfsFileDescriptor because the Iceberg part is removed now. The reason I didn't 
do that is because then IcebergFileDescriptor would derive from 
HdfsFileDescriptor that doesn't seem a good alignment between classes. The most 
optimal design would be to have an abstract FileDescriptor class that has 2 
implementations IcebergFileDescriptor and HdfsFileDescriptor. However, 
currently the HdfsFileDescriptor class would be an empty one, so I didn't see 
the point implementing it.

Getting back to the names of these functions. I figured that calling 
FeIcebergTable.getFileDescriptor() in this form is misleading because callers 
would expect this to return IcebergFileDescriptors, however it doesn't fo that 
and return the parent FileDescriptor, that is in fact a FileDescriptor for the 
HDFS world. I wanted to avoid this comfusion by explicitly say that this is not 
returning Iceberg FDs and instead return the one for HDFS tables (without the 
Iceberg part.)

Hope this makes sense.


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/FileDescriptor.java
File fe/src/main/java/org/apache/impala/catalog/FileDescriptor.java:

http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/FileDescriptor.java@56
PS3, Line 56: private f
> We already have a getFbFileDescriptor() method which could be used by subcl
Done


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java
File fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java:

http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@59
PS3, Line 59: private s
> This could remain private.
Done


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/IcebergContentFileStore.java@297
PS3, Line 297: hdfs
> Maybe 'rawFileDesc' to avoid reference to HDFS?
This is what it is, it's an Hdfs FileDescriptor that we get from the 
'hdfsFileDescMap'.


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/IcebergFileDescriptor.java
File fe/src/main/java/org/apache/impala/catalog/IcebergFileDescriptor.java:

http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/IcebergFileDescriptor.java@32
PS3, Line 32:
> Now in this new class it is always Iceberg metadata, isn't it?
Done


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/IcebergFileDescriptor.java@33
PS3, Line 33: FbFileMetadata
> Couldn't we use FbIcebergMetadata directly?
Yes, we could. I planned to do that in a separate change to keep this on only 
for extracting stuff from FileDescriptor into IcebergFileDescriptor without 
changing the Fb representation.


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/IcebergFileDescriptor.java@83
PS3, Line 83:     if (!(obj instanceof IcebergFileDescriptor)) return false;
> If 'obj' is NULL, the "instanceof" operator will also return false.
Done


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/catalog/IcebergFileDescriptor.java@90
PS3, Line 90: &&
> As far as I understand we can rely on these arrays being the same objects f
Well, this is existing code, just moved from one place to another. 
Additionally, similar functions for FileDescriptor (the HDFS part) are also 
called the same


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@27
PS3, Line 27:
> Unused import.
Done


http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java:

http://gerrit.cloudera.org:8080/#/c/22456/3/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@742
PS3, Line 742: hdfs
> 'rawFileDesc'?
see my other comment somewhere else.
Additionally, we eventually might want to get to the point of making 
FileDrecriptor abstract and have IcebergFileDescriptor introduced now and 
HdfsFileDescriptor introduced later on (if we see the point of doing so).

For me 'raw' wouldn't make more clarity here about what this FileDescriptor is.



--
To view, visit http://gerrit.cloudera.org:8080/22456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id08baf4db32197bab74178777c4ba3fec98c2451
Gerrit-Change-Number: 22456
Gerrit-PatchSet: 4
Gerrit-Owner: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: Gabor Kaszab <gaborkas...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Comment-Date: Tue, 11 Feb 2025 09:14:08 +0000
Gerrit-HasComments: Yes

Reply via email to