Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19119 )

Change subject: IMPALA-11646 IMPALA-11562: Fix 
test_unsupported_text_compression in s3
......................................................................


Patch Set 3:

> Patch Set 3:
>
> > Patch Set 3: Code-Review+1
> >
> > Thanks for the prompt fix Michael! I do not have any other comment.
> >
> > I just like to check if my understanding in the following is correct after 
> > reading your fix and the corresponding functions.
> >
> > When the underlying storage is HDFS and we call 
> > filesystem_client.create_file() in test_unsupported_text_compression(), 
> > DelegatingHdfsClient::create_file() is called. In this case, 
> > DelegatingHdfsClient::_normalize_path() will be invoked. _normalize_path() 
> > removes the leading slash in the given path if there is any and then 
> > webhdfs_client.create_file() will be used to create the file.
> >
> > In your patch for IMPALA-11562, you removed the leading slash before 
> > calling filesystem_client.create_file(), which still works when the 
> > underlying storage is HDFS but doesn't work in s3 because 
> > HadoopFsCommandLineClient("S3") behaves a bit differently, i.e., it expects 
> > that leading slash, and thus this test failed.
>
> My understanding above is not completely accurate. IMPALA-11562 added that 
> _normalize_path() in DelegatingHdfsClient. Thus 
> test_unsupported_text_compression() has to be changed accordingly to pass the 
> test in s3. IMPALA-11562 did not change test_unsupported_text_compression().
>
> Now I seem to understand this part of the Python test more. :-)

Yes, tests were written to remove the leading slash, because 
DelegatingHdfsClient expected a path without slash, while the 
HadoopFsCommandLineClient would normalize the path by adding a leading slash if 
it was missing.

For the Ozone work I started also using URIs with a schema, which would have 
complicated the HadoopFsCommandLineClient logic. I decided to invert what 
"normal" is, so you should now use absolute paths and DelegatingHdfsClient will 
normalize the path (by removing the leading slash) to fit the expectations of 
PyWebHdfsClient.


--
To view, visit http://gerrit.cloudera.org:8080/19119
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I74724b2d5af0bd879d92d5e5febe92d01c972525
Gerrit-Change-Number: 19119
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Fang-Yu Rao <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Comment-Date: Wed, 12 Oct 2022 16:17:59 +0000
Gerrit-HasComments: No

Reply via email to