Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13178 )

Change subject: IMPALA-4658: Potential race if compiler reorders ReachedLimit() 
usage.
......................................................................


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/13178/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13178/4//COMMIT_MSG@9
PS4, Line 9: The ExecNode::num_rows_returned_ member is shared by the scan node
           : (thread) and the related scanner threads. There is a potential 
race here
           : since the compiler is free to reorder the load/store of
           : num_rows_returned_ member. Also, num_rows_returned_ is not 
accessed in
           : a thread safe manner.
Can this actually cause problems? I think that all scanners read 
num_rows_returned_ in order to stop earlier if no more rows are needed, but it 
is not a problem if they return some extra row (the ScannerNode will ignore 
them), it is just wasted work.


http://gerrit.cloudera.org:8080/#/c/13178/4//COMMIT_MSG@17
PS4, Line 17: The getters/modifiers for single threaded code-paths
            : are: rows_returned(), ReachedLimit(), SetNumRowsReturned(),
            : IncrementNumRowsReturned(), and DecrementNumRowsReturned(). 
Thread safe
            : counterparts are: rows_returned_shared(), ReachedLimitShared(),
            : IncrementNumRowsReturnedShared(), and 
DecrementNumRowsReturnedShared().
nit: this could be more readable if the functions were written to separate lines


http://gerrit.cloudera.org:8080/#/c/13178/4//COMMIT_MSG@33
PS4, Line 33: Some serious thoughts were given to other approaches such as
            : updating the atomic num_rows_returned_ variable outside of a 
loop, but
            : such a change would have been very extensive and error prone.
Did you think about creating a separate variable like limit_reached_shared_, 
and using only that one in a synchronized way? Scanners are only interested 
about whether the limit is reached, and IncrementNumRowsShared() could update 
it if needed, leading to less synchronized writes.


http://gerrit.cloudera.org:8080/#/c/13178/4//COMMIT_MSG@56
PS4, Line 56: .
nit: please wrap commit message at 72 chars


http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/exec-node.h
File be/src/exec/exec-node.h:

http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/exec-node.h@211
PS4, Line 211: !IsScanNode() || type() == TPlanNodeType::HBASE_SCAN_NODE
             :         || type() == TPlanNodeType::DATA_SOURCE_NODE
I would prefer to move this logic to a virtual function like 
IsScanNodeNonTaskBasedMultiThreadingSupport() or 
LimitCheckedFromMultipleThreads(). (the _MT naming makes it hard to find a nice 
name) It will make it easier to add more scanners without multi-threading 
support in the future.


http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/exec-node.h@401
PS4, Line 401: int64_t num_rows_returned_;
Can you add a comment about this variable's usage? It is not too typical that a 
variable is used both in an atomic and a non-atomic way, and readers may think 
that the non-atomic access is done by mistake.

+ 1 reason for highlighting this is that it may make sense to collect "shared" 
variables and change layout to move them to a separate cache line to reduce the 
effect of atomic operations.


http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/hdfs-scan-node-mt.cc
File be/src/exec/hdfs-scan-node-mt.cc:

http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/hdfs-scan-node-mt.cc@114
PS4, Line 114: IncrementNumRowsReturnedShared
The comment states that we use Shared versions in MT nodes for the sake of 
simplicity. Can you add a comment here too? I am afraid about readers coming to 
wrong conclusions because of the unnecessary synchronization.


http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/hdfs-scan-node.cc@134
PS4, Line 134:     // Update the number of materialized rows now instead of 
when they are materialized.
             :     // This means that scanners might process and queue up more 
rows than are necessary
             :     // for the limit case but we want to avoid the synchronized 
writes to
             :     // num_rows_returned_.
             :     IncrementNumRowsReturnedShared(row_batch->num_rows());
The comment + the Shared function seem weird together - the comment says that 
we avoid synchronized writes, then we do a synchronized write.


http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/hdfs-scan-node.cc@144
PS4, Line 144:       DecrementNumRowsReturnedShared(num_rows_over);
> Is this logic racy in that multiple threads could see that they pushed the
I do not understand the reason for doing this "increment, then decrement if it 
is more than the limit" thing. Couldn't we just increment with limit_ == -1 ? 
row_batch->num_rows() : min(row_batch->num_rows(), limit_)? Most nodes seem to 
use a similar logic.

Creating a function for this logic would be nice, e.g. bool 
ExecNode::CheckLimitAndTruncateRowBatchIfNeeded(RowBatch* row_batch, bool* eos).


http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/sort-node.cc
File be/src/exec/sort-node.cc:

http://gerrit.cloudera.org:8080/#/c/13178/4/be/src/exec/sort-node.cc@138
PS4, Line 138:   IncrementNumRowsReturned(row_batch->num_rows());
             :   if (ReachedLimit()) {
             :     row_batch->set_num_rows(row_batch->num_rows() - 
(rows_returned() - limit_));
             :     *eos = true;
             :   }
             :
             :   COUNTER_SET(rows_returned_counter_, rows_returned());
Is it normal that we can leave rows_returned_counter_ > limit_? This seems to 
be the only node where it is not capped at limit_.



--
To view, visit http://gerrit.cloudera.org:8080/13178
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cbbfad80f7ab87dd6f192a24e2c68f7c66b047e
Gerrit-Change-Number: 13178
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Abhishek Rawat <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Thu, 02 May 2019 18:32:26 +0000
Gerrit-HasComments: Yes

Reply via email to