Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16318 )

Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
......................................................................


Patch Set 13:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/16318/13/be/src/runtime/hdfs-fs-cache.cc
File be/src/runtime/hdfs-fs-cache.cc:

http://gerrit.cloudera.org:8080/#/c/16318/13/be/src/runtime/hdfs-fs-cache.cc@103
PS13, Line 103:       if (options != nullptr && !options->empty()) {
              :         for (auto option : *options) {
              :           hdfsBuilderConfSetStr(
              :               hdfs_builder, option.first.c_str(), 
option.second.c_str());
              :         }
              :       }
> does this actually work? were you able to confirm that non-default configs
Modified.


http://gerrit.cloudera.org:8080/#/c/16318/13/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/16318/13/be/src/runtime/io/disk-io-mgr.h@370
PS13, Line 370:     REMOTE_DFS_DISK_FILE_OPER_OFFSET,
              :     REMOTE_S3_DISK_FILE_OPER_OFFSET,
> you cover this a bit in the commit message, but is the reason they were add
The purpose of the "DISK_FILE_OPER" queues is to isolate the file operations 
from normal read/write IO operations. It may increase the efficiency of file 
operations and provide a separate control of the threads working on file 
operation jobs.
Have added some comments in the code and the design document "Queue" chapter.


http://gerrit.cloudera.org:8080/#/c/16318/13/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/16318/13/be/src/runtime/io/disk-io-mgr.cc@30
PS13, Line 30: #include "runtime/tmp-file-mgr-internal.h"
> Why are we including the tmp-file-mgr here? I see a lot of references to th
Thank you for the comment. Have decoupled the use of TmpFile from disk-io-mgr. 
Created a DiskFile in the io layer. The tmp-file-mgr would pass the DiskFiles 
to the disk-io-mgr for IO operations.


http://gerrit.cloudera.org:8080/#/c/16318/13/be/src/runtime/io/disk-io-mgr.cc@241
PS13, Line 241: goto end;
> I think we generally try to avoid 'goto' statements because they make the c
Done


http://gerrit.cloudera.org:8080/#/c/16318/13/be/src/runtime/io/request-ranges.h
File be/src/runtime/io/request-ranges.h:

http://gerrit.cloudera.org:8080/#/c/16318/13/be/src/runtime/io/request-ranges.h@113
PS13, Line 113:     READ,
              :     WRITE,
              :     FETCH,
              :     UPLOAD,
> not sure I understand the difference between all of these. fetch seems like
FETCH and UPLOAD are for the operations of the entire file, while READ and 
WRITE are related to a certain range of a file (page). Have changed the name to 
FILE_FETCH and FILE_UPLOAD, and added some comments in the code.



--
To view, visit http://gerrit.cloudera.org:8080/16318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89
Gerrit-Change-Number: 16318
Gerrit-PatchSet: 13
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Thu, 08 Oct 2020 16:51:01 +0000
Gerrit-HasComments: Yes

Reply via email to