[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 4: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

Fix multiple bugs that could occur when a block was unpinned then pinned
again while the write was in flight. There were two problems:

1. A block's buffer could be transferred while a write is in flight,
  leaving the block in an invalid state. The fix is to wait for
  the in-flight write to complete before transferring the buffer.
2. On certain code paths in WriteComplete(), condition variables weren't
  signalled, leading to threads waiting for write completion not being
  woken up. The fix is to clarify when condition variables will be
  signalled and ensure that the appropriate condition variables are
  always signalled when the write completes.

Testing:
Added a targeted unit test that exercises these code paths using a
debug option that controls timing of writes.

Reran the stress test configuration that reproducibly triggered the
bug: TPC-H query 18 on a release build with a single impalad.
It succeeded.

Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Reviewed-on: http://gerrit.cloudera.org:8080/3832
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/buffered-block-mgr.h
3 files changed, 121 insertions(+), 27 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3832/2//COMMIT_MSG
Commit Message:

Line 28: It succeeded.
> I'm fine with the unit test coverage, but just thinking that if this worklo
Agreed, I think it's timing-dependent though so I couldn't see a way to make it 
reproducible - I think the main thing would be to run the stress test on a 
variety of data sizes and hardware configs.

I plan to port many of the BufferedBlockMgr unit tests to the new buffer pool 
so we should have coverage of this interesting sequence of operations.


-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 4: Code-Review+2

rebase, carry +2

-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3832/2//COMMIT_MSG
Commit Message:

Line 28: It succeeded.
> Hard to say - I only saw it in a specific configuration I was running local
I'm fine with the unit test coverage, but just thinking that if this workload 
provides an interesting stress on spilling, then it'd be nice to carry forward 
to the buffer pool.  But I suppose we don't have any reason to believe it would 
stress the new code in the same way.


-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-12 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 2:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/3832/2//COMMIT_MSG
Commit Message:

Line 14:   the in-flight write to complete before transferring the buffer.
> TransferBlock() will either unpin or delete.  In the delete case, WriteComp
The invariant is !in_write_ || buffer_desc_ != NULL

The problem is in the delete case if src->in_write_ is true, then it will grab 
the buffer_desc_ from src, leaving it in the invalid state where in_write_ is 
true and buffer_desc_ == NULL.

Your comment makes me think that there may be a similar problem in the unpin 
case that is also solved by this fix. I don't think we ever unpin a block twice 
so we wouldn't have hit it though.


Line 17:   woken up. The fix is to clarify when condition variables will be
> was this manifesting in a hang?
Yep the unit test hung in many cases and crashed in others. I didn't see that 
in Impala.


PS2, Line 19: right
> write
Done


Line 20:   notify_all() in case there are multiple threads waiting.
> is using notify_all() necessary, or just done for consistency?
It isn't really necessary in some cases now that I've reasoned through it. I 
reverted those changes and updated the header comments to better explain the 
invariants.


Line 28: It succeeded.
> is this a configuration that will get coverage at some point during release
Hard to say - I only saw it in a specific configuration I was running locally 
on a specific scale factor with a release build and the PHJ refactor patch. We 
haven't seen it in other stress tests. Maybe with the PHJ refactor patch we'll 
see it on other stress tests.

I'm pretty happy with the coverage that the unit test provides.


http://gerrit.cloudera.org:8080/#/c/3832/2/be/src/runtime/buffered-block-mgr.cc
File be/src/runtime/buffered-block-mgr.cc:

Line 929: block->write_complete_cv_.notify_all();
> seems okay to use notify_all(), but in practice why would there be multiple
I reverted this change and updated the header comment because, as you pointed 
out, it isn't possible.


Line 976: buffer_available_cv_.notify_all();
I'm more convinced now that this change was unnecessary. I was thinking that it 
would be a problem if a waiter woke up then decided not to grab the buffer, but 
that shouldn't be possible. 

I reverted this change and updated the header comment.


http://gerrit.cloudera.org:8080/#/c/3832/2/be/src/runtime/buffered-block-mgr.h
File be/src/runtime/buffered-block-mgr.h:

Line 449:   /// needs to not have been taken when this function is called.
> looks like this comment needs updating if we'll always wait for in-flight w
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#3).

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..

IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

Fix multiple bugs that could occur when a block was unpinned then pinned
again while the write was in flight. There were two problems:

1. A block's buffer could be transferred while a write is in flight,
  leaving the block in an invalid state. The fix is to wait for
  the in-flight write to complete before transferring the buffer.
2. On certain code paths in WriteComplete(), condition variables weren't
  signalled, leading to threads waiting for write completion not being
  woken up. The fix is to clarify when condition variables will be
  signalled and ensure that the appropriate condition variables are
  always signalled when the write completes.

Testing:
Added a targeted unit test that exercises these code paths using a
debug option that controls timing of writes.

Reran the stress test configuration that reproducibly triggered the
bug: TPC-H query 18 on a release build with a single impalad.
It succeeded.

Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
---
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/buffered-block-mgr.h
3 files changed, 121 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/3832/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-11 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/3832/2//COMMIT_MSG
Commit Message:

Line 14:   the in-flight write to complete before transferring the buffer.
TransferBlock() will either unpin or delete.  In the delete case, 
WriteComplete() should finish the delete, so we should be okay, right?  In the 
unpin case, it looks like we will hit the DCHECK in WriteUnpinnedBlock() that 
!block->in_write_.  Is that what's happening? Or is there some other corruption?


Line 17:   woken up. The fix is to clarify when condition variables will be
was this manifesting in a hang?


PS2, Line 19: right
write


Line 20:   notify_all() in case there are multiple threads waiting.
is using notify_all() necessary, or just done for consistency?


Line 28: It succeeded.
is this a configuration that will get coverage at some point during releases 
(i.e. do we have good regression coverage)?


http://gerrit.cloudera.org:8080/#/c/3832/2/be/src/runtime/buffered-block-mgr.cc
File be/src/runtime/buffered-block-mgr.cc:

Line 929: block->write_complete_cv_.notify_all();
seems okay to use notify_all(), but in practice why would there be multiple 
threads waiting on this cv?  It's only used by TransferBlock() and two threads 
shouldn't be trying to do that, right?


http://gerrit.cloudera.org:8080/#/c/3832/2/be/src/runtime/buffered-block-mgr.h
File be/src/runtime/buffered-block-mgr.h:

Line 449:   /// needs to not have been taken when this function is called.
looks like this comment needs updating if we'll always wait for in-flight 
writes.


-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

2016-08-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in 
flight.
..

IMPALA-3936: BufferedBlockMgr fixes for Pin() while write in flight.

Fix multiple bugs that could occur when a block was unpinned then pinned
again while the write was in flight. There were two problems:

1. A block's buffer could be transferred while a write is in flight,
  leaving the block in an invalid state. The fix is to wait for
  the in-flight write to complete before transferring the buffer.
2. On certain code paths in WriteComplete(), condition variables weren't
  signalled, leading to threads waiting for write completion not being
  woken up. The fix is to clarify when condition variables will be
  signalled and ensure that the appropriate condition variables are
  always signalled when the right completes. Also consistently use
  notify_all() in case there are multiple threads waiting.

Testing:
Added a targeted unit test that exercises these code paths using a
debug option that controls timing of writes.

Reran the stress test configuration that reproducibly triggered the
bug: TPC-H query 18 on a release build with a single impalad.
It succeeded.

Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
---
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-block-mgr.cc
M be/src/runtime/buffered-block-mgr.h
3 files changed, 114 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/3832/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3832
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4be4fad8e6f2303db19ea1e2bd0f13523781ae8e
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong