Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17979 )

Change subject: IMPALA-10791 Add batching reading for remote temporary files
......................................................................


Patch Set 1:

(18 comments)

Just try to understand this piece of work. Sounds promising.

http://gerrit.cloudera.org:8080/#/c/17979/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17979/1//COMMIT_MSG@38
PS1, Line 38: now the system
You mean Impala or OS?. Is it a guarantee?


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h
File be/src/runtime/io/disk-file.h:

http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@47
PS1, Line 47: DONE
nit. WRITTEN probably is better.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@55
PS1, Line 55: DISABLED
nit. Could be UNINIT too?


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@62
PS1, Line 62: poool
nit.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@70
PS1, Line 70:  bool IsDisabled() {
            :     std::unique_lock<SpinLock> l(lock_);
            :     return IsDisabledLocked(l);
            :   }
            :
            :   bool IsDisabledLocked(const std::unique_lock<SpinLock>& lock) {
            :     DCHECK(lock.mutex() == &lock_ && lock.owns_lock());
            :     return status_ == MemBlockStatus::DISABLED;
            :   }
Can this pair of methods to be simplified as follows?

 bool IsDisabled() {
    std::unique_lock<SpinLock> l(lock_);
    return IsInStateLocked(l, MemBlockStatus::DISABLED);
  }

  bool IsInStateLocked(const std::unique_lock<SpinLock>& lock, enum x) {
    DCHECK(lock.mutex() == &lock_ && lock.owns_lock());
    return status_ == x;
  }

In this way, other similar methods can call IsInStateLocked() directly.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@193
PS1, Line 193: lock_
A little bit confusing when read code which applies lock_. since there are two 
lock_ in two classes.

Suggest to name the lock_ here to memory_block_lock_.

I also wonder if we can just use the lock in disk file instead.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@292
PS1, Line 292:   std::lock_guard<SpinLock> l(lock_);
             :     return actual_file_size_;
actual_file_size_ is AtomicInt64. Do we need to lock with lock_ here first?


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@379
PS1, Line 379: num_of_read_buffer
nit. num_of_read_buffers.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@381
PS1, Line 381: equals
nit equal


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@391
PS1, Line 391: DCHECK_LT(buffer_idx, num_of_read_buffer());
             :     DCHECK_GE(buffer_idx, 0);
Should call DCheckReadBufferIdx().


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@393
PS1, Line 393:  std::lock_guard<SpinLock> lock(lock_);
should we lock it first before check?


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@415
PS1, Line 415:  DCheckReadBufferIdx(buffer_idx);
             :     std::lock_guard<SpinLock> lock(lock_);
Lock before check?


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.h@509
PS1, Line 509: lock_
Suggest to rename as disk_file_lock_.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.cc
File be/src/runtime/io/disk-file.cc:

http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/io/disk-file.cc@123
PS1, Line 123:  *allocated = true;
If the memory is deleted, the state in its control block should become reserved 
at most.

Suggest to also update its control block in this method.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/tmp-file-mgr-internal.h
File be/src/runtime/tmp-file-mgr-internal.h:

http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/tmp-file-mgr-internal.h@233
PS1, Line 233: of
nit off.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/tmp-file-mgr-internal.h@234
PS1, Line 234: is
nit if.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/tmp-file-mgr-internal.h@283
PS1, Line 283: SetToDelete
nit. May name as SetToDeleteFlag() since the argument can be false which 
indicates the files are not deleted.


http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/tmp-file-mgr.h
File be/src/runtime/tmp-file-mgr.h:

http://gerrit.cloudera.org:8080/#/c/17979/1/be/src/runtime/tmp-file-mgr.h@245
PS1, Line 245:  Return if batching read for remote temporary files is enabled.
nit. Indicate ...

May rename as CanRemoteBatchingRead()



--
To view, visit http://gerrit.cloudera.org:8080/17979
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dcc5d0881ffaeff09c5c514306cd668373ad31b
Gerrit-Change-Number: 17979
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu <wydbaggio...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Wed, 27 Oct 2021 18:34:58 +0000
Gerrit-HasComments: Yes

Reply via email to