Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16318 )

Change subject: IMPALA-9867: Add Support for Spilling to S3: Milestone 1
......................................................................


Patch Set 19:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/hdfs-file-writer.cc
File be/src/runtime/io/hdfs-file-writer.cc:

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/hdfs-file-writer.cc@38
PS19, Line 38:   if (expected_local_) {
Can we remove "expected_local_" altogether from this class?


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/hdfs-file-writer.cc@68
PS19, Line 68:     range->SetOffset(written_bytes_);
written_bytes_ is updated when 'UpdateWrittenSize' is called in next line. 
Should we set Offset to "range->len()" instead?


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/hdfs-file-writer.cc@77
PS19, Line 77:   DCHECK(hdfs_file_ != nullptr);
Should we also acquire lock_ here?


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-system.cc
File be/src/runtime/io/local-file-system.cc:

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-system.cc@101
PS19, Line 101:   if (bytes_read < length) {
I am not sure if this is always the case? Do we always allocate buffer the same 
length as the file size. If yes, then this check works.


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.h
File be/src/runtime/io/local-file-writer.h:

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.h@48
PS19, Line 48:   Status WritePadding(WriteRange* range, int64_t padding_size);
Since we are adding padding to the file, how do we determine how much to read 
from the file i.e., when reading how to differentiate between actual data vs 
padded data? Could it not cause wrong result if we end up consuming padded data?


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.cc
File be/src/runtime/io/local-file-writer.cc:

http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.cc@51
PS19, Line 51:   range->SetOffset(written_bytes_);
Similar comment as before. I don't think `written_bytes_` has the updated value 
for this particular write.


http://gerrit.cloudera.org:8080/#/c/16318/19/be/src/runtime/io/local-file-writer.cc@87
PS19, Line 87: Status LocalFileWriter::WriteOne(WriteRange* write_range) {
Maybe add a comment, explaining why the lock_ is not acquired in this function.



--
To view, visit http://gerrit.cloudera.org:8080/16318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I419b1d5dbbfe35334d9f964c4b65e553579fdc89
Gerrit-Change-Number: 16318
Gerrit-PatchSet: 19
Gerrit-Owner: Yida Wu <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Yida Wu <[email protected]>
Gerrit-Comment-Date: Fri, 30 Oct 2020 20:05:12 +0000
Gerrit-HasComments: Yes

Reply via email to