[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-12-01 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 12: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-12-01 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 12: Code-Review+2

Rebase. Promote +1 to +2.

-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 11: Code-Review+1

(1 comment)

Carry +1

http://gerrit.cloudera.org:8080/#/c/5250/11/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

Line 83: per_host_mem_usage_(NULL),
later patchset depended on this being NULL but it was left uninitialized which 
can sometimes lead to a crash, so fixing this.


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Hello Henry Robinson, Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5250

to look at the new patch set (#11).

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..

IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

The PlanFragmentExecutor has some state shared between the main
execution thread and the periodic reporting thread that isn't
synchronized properly.  IMPALA-4504 describes one such problem, and that
bug was introduced in an attempt to fix another similar race.

Let's just simplify all of this and remove this shared state.  Instead,
the profile thread will always be responsible for sending periodic
'!done' messages, and the main execution thread will always be
responsible for sending the final 'done' message (after joining the
periodic thread).

This will allow for even more simplification, in particular the
interaction between FragementExecState and PlanFragementExecutor, but
I'm not doing that now as to avoid more conflicts with the MT work.

Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
4 files changed, 67 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/11
-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

PS9, Line 94: prepared_promise_.Set(status);
> How about moving this above line 93, so anyone waiting on it won't have to 
Yeah, I thought the semantics of the promise were clearer to do it last (it's a 
barrier to the completion of Prepare()), but I don't feel too strongly about it 
so I've reversed it.


http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

PS9, Line 109:  
> "Open() fails"
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Hello Henry Robinson, Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5250

to look at the new patch set (#10).

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..

IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

The PlanFragmentExecutor has some state shared between the main
execution thread and the periodic reporting thread that isn't
synchronized properly.  IMPALA-4504 describes one such problem, and that
bug was introduced in an attempt to fix another similar race.

Let's just simplify all of this and remove this shared state.  Instead,
the profile thread will always be responsible for sending periodic
'!done' messages, and the main execution thread will always be
responsible for sending the final 'done' message (after joining the
periodic thread).

This will allow for even more simplification, in particular the
interaction between FragementExecState and PlanFragementExecutor, but
I'm not doing that now as to avoid more conflicts with the MT work.

Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
4 files changed, 63 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/10
-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 9: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

PS9, Line 94: prepared_promise_.Set(status);
How about moving this above line 93, so anyone waiting on it won't have to wait 
for the report RPC to complete?


http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

PS9, Line 109:  
"Open() fails"


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5250/6/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

Line 93:   if (!status.ok()) FragmentComplete(status);
> Yeah, but that's also how the old code worked.  It's benign since in case o
Changed this up a bit so we still call FragmentComplete() here.


http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

Line 488:   DCHECK(!report_thread_active_);
this patch was mostly a rebase, but also added these DCHECKs.


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Hello Henry Robinson, Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5250

to look at the new patch set (#9).

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..

IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

The PlanFragmentExecutor has some state shared between the main
execution thread and the periodic reporting thread that isn't
synchronized properly.  IMPALA-4504 describes one such problem, and that
bug was introduced in an attempt to fix another similar race.

Let's just simplify all of this and remove this shared state.  Instead,
the profile thread will always be responsible for sending periodic
'!done' messages, and the main execution thread will always be
responsible for sending the final 'done' message (after joining the
periodic thread).

This will allow for even more simplification, in particular the
interaction between FragementExecState and PlanFragementExecutor, but
I'm not doing that now as to avoid more conflicts with the MT work.

Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
4 files changed, 62 insertions(+), 120 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/9
-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 8: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5250

to look at the new patch set (#8).

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..

IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

The PlanFragmentExecutor has some state shared between the main
execution thread and the periodic reporting thread that isn't
synchronized properly.  IMPALA-4504 describes one such problem, and that
bug was introduced in an attempt to fix another similar race.

Let's just simplify all of this and remove this shared state.  Instead,
the profile thread will always be responsible for sending periodic
'!done' messages, and the main execution thread will always be
responsible for sending the final 'done' message (after joining the
periodic thread).

This will allow for even more simplification, in particular the
interaction between FragementExecState and PlanFragementExecutor, but
I'm not doing that now as to avoid more conflicts with the MT work.

Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
4 files changed, 59 insertions(+), 120 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/8
-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5250/7/be/src/service/fragment-exec-state.cc
File be/src/service/fragment-exec-state.cc:

PS7, Line 51: prepare_promise_
> Can't we just use the PFEs prepared_promise_ ? They both track the same val
Yes, if we expose a timeout interface from PFE. Given this is preexisting and 
these classes will probably need major refactoring for MT soon, I'd rather not 
muck with it now.


PS7, Line 62: DCHECK(status.ok() || done);  // if !status.ok() => done
> Redundant with SendReport()
Yes, but it is a precondition to both, and since this is the dcheck that caught 
the bug in the first place, I'm inclined to leave it.  (there's no reason there 
can't be other callers to this method, and SendReport() doesn't always call 
this-- well it does after the coordinator fragment change, but the code hasn't 
been fully cleaned up to reflect that yet).


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5250/6/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

Line 93:   if (!status.ok() && !report_status_cb_.empty()) {
> There's a bug here. If PrepareInternal() fails after AcquireThreadToken(), 
Yeah, but that's also how the old code worked.  It's benign since in case of 
failure the resource pool will go away shortly, but I agree it's best to 
address this (and it's the original reason why I made Exec() always call 
FragmentComplete(), whereas it used to not on failure).


PS6, Line 94: // FragmentComplete() depends on Prepare() having executed 
successfully, so must
: // call report_status_cb_ directly if Prepare() failed.
> Is this only because of the usage of per_host_mem_usage_ in SendReport()? I
Also the profile() is in an "undefined" state.  Let me fix this up.


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/5250/6/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

Line 93:   if (!status.ok() && !report_status_cb_.empty()) {
There's a bug here. If PrepareInternal() fails after AcquireThreadToken(), the 
token will never be released.


PS6, Line 94: // FragmentComplete() depends on Prepare() having executed 
successfully, so must
: // call report_status_cb_ directly if Prepare() failed.
Is this only because of the usage of per_host_mem_usage_ in SendReport()? If 
so, you can just check if that is 'nullptr' before using it there.

Or we can call ReleaseThreadToken() here. Whatever seems better.


http://gerrit.cloudera.org:8080/#/c/5250/7/be/src/service/fragment-exec-state.cc
File be/src/service/fragment-exec-state.cc:

PS7, Line 51: prepare_promise_
Can't we just use the PFEs prepared_promise_ ? They both track the same value.


PS7, Line 62: DCHECK(status.ok() || done);  // if !status.ok() => done
Redundant with SendReport()


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 7:

> Uploaded patch set 7.

Fixed a comment about cancellation which, I don't think is true. And added 
comments to document the lifecycle requirements.

-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5250

to look at the new patch set (#7).

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..

IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

The PlanFragmentExecutor has some state shared between the main
execution thread and the periodic reporting thread that isn't
synchronized properly.  IMPALA-4504 describes one such problem, and that
bug was introduced in an attempt to fix another similar race.

Let's just simplify all of this and remove this shared state.  Instead,
the profile thread will always be responsible for sending periodic
'!done' messages, and the main execution thread will always be
responsible for sending the final 'done' message (after joining the
periodic thread).

This will allow for even more simplification, in particular the
interaction between FragementExecState and PlanFragementExecutor, but
I'm not doing that now as to avoid more conflicts with the MT work.

Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
4 files changed, 58 insertions(+), 118 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/7
-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 5:

(5 comments)

Henry, could you take a look at patchset 6? The changes to address your 
comments were non-trivial to please re-review.

http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

PS4, Line 294: VLOG_QUERY << "Open(): instance_id=" << 
runtime_state_->fragment_instance_id();
 :   S
> nit: one line
Done


PS4, Line 298: status;
> here it seems clearer to return status.
Oops, yes, done.


PS4, Line 331: {
> I'm still of the opinion that this logic is better placed in FragmentExecSt
I see what you mean now. Yeah, made that change, and also moved the callback 
invocation for when prepare fails into this class, so that at least it's 
consistent. I still think we should remove the callback indirection and 
simplify more, but agree this is a better intermediate state.


PS4, Line 447: 
> the
Done


http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

PS4, Line 249:  
> Nit: extra space.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5250

to look at the new patch set (#6).

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..

IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

The PlanFragmentExecutor has some state shared between the main
execution thread and the periodic reporting thread that isn't
synchronized properly.  IMPALA-4504 describes one such problem, and that
bug was introduced in an attempt to fix another similar race.

Let's just simplify all of this and remove this shared state.  Instead,
the profile thread will always be responsible for sending periodic
'!done' messages, and the main execution thread will always be
responsible for sending the final 'done' message (after joining the
periodic thread).

This will allow for even more simplification, in particular the
interaction between FragementExecState and PlanFragementExecutor, but
I'm not doing that now as to avoid more conflicts with the MT work.

Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
4 files changed, 50 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/6
-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-30 Thread Dan Hecht (Code Review)
Hello Henry Robinson,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/5250

to look at the new patch set (#5).

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..

IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

The PlanFragmentExecutor has some state shared between the main
execution thread and the periodic reporting thread that isn't
synchronized properly.  IMPALA-4504 describes one such problem, and that
bug was introduced in an attempt to fix another similar race.

Let's just simplify all of this and remove this shared state.  Instead,
the profile thread will always be responsible for sending periodic
'!done' messages, and the main execution thread will always be
responsible for sending the final 'done' message (after joining the
periodic thread).

This will allow for even more simplification, in particular the
interaction between FragementExecState and PlanFragementExecutor, but
I'm not doing that now as to avoid more conflicts with the MT work.

Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
M be/src/service/fragment-exec-state.cc
M be/src/service/fragment-exec-state.h
4 files changed, 46 insertions(+), 113 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 4: Code-Review+1

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

PS4, Line 294: if (!status.ok()) {
 :
nit: one line


PS4, Line 298: opened_promise_.Get()
here it seems clearer to return status.


PS4, Line 331: RETURN_IF_ERROR(opened_promise_.Get());
I'm still of the opinion that this logic is better placed in FragmentExecState, 
e.g.:

  if (fragment->Open().OK()) {
fragment->Exec();
  }
  fragment->Close();

and then here you could just DCHECK(opened_promise.IsSet() && 
opened_promise_.Get().OK()); which is a simpler lifecycle invariant. That's in 
keeping with how the PFE lifecycle is managed now, and it seems to make sense 
to have all that logic in one place.

For example, we don't have the same logic in Open() wrt prepare_promise_, which 
we should do for consistency with this method as implemented here.


PS4, Line 447: that
the


Line 489: 
how about DCHECK(!report_thread_active_) here? I'd like to get some guarantee 
that FragmentComplete() was called in all executions it should have been before 
Close().


http://gerrit.cloudera.org:8080/#/c/5250/3/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

PS3, Line 66: setting that flag to 0 disables periodic reporting altogether
> Isn't that what the next sentence is saying? Or am I misunderstanding your 
Oh yeah. I was thrown off by 'disables periodic reporting altogether', which 
while correct was confusing.


http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

PS4, Line 249:  
Nit: extra space.


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 4:

Henry, I tried doing some of the refactoring we discussed, but they don't seem 
like clear improvements to me because of how intertwined FES and PFE currently 
are.  For example, FES::ReportStatusCb() accesses PFE state directly and calls 
PFE methods like PFE::Cancel(), and so executing that code from either inside 
or after PFE::Close() seems confusing at best.

I do agree further cleanup here between FES and PFE classes is needed, but I 
think without a more wholistic cleanup we don't get anywhere better.  And I 
think the cleanup needs to be driven by the needs of MT or we will just redo it 
shortly.  So, I'd like to move forward with the fix in current form, which I 
think is still a nice improvement over the existing code since we get rid of 
some unnecessary thread-shared state and complexity.  Okay with you?

-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5250/3/be/src/runtime/plan-fragment-executor.cc
File be/src/runtime/plan-fragment-executor.cc:

PS3, Line 421: id PlanFragmentExecutor::Sen
nit: This is in 2 places now. Here and ReportStatusCb().


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..


Patch Set 3:

(1 comment)

Patchset 4 addresses the comments that don't involve refactoring, in case we 
want to use this option. I'll try some minor refactoring in the next patchset.

http://gerrit.cloudera.org:8080/#/c/5250/3/be/src/runtime/plan-fragment-executor.h
File be/src/runtime/plan-fragment-executor.h:

PS3, Line 66: setting that flag to 0 disables periodic reporting altogether
> Update comment to say that at least one report (the final one) will always 
Isn't that what the next sentence is saying? Or am I misunderstanding your 
comment?


-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

2016-11-29 Thread Dan Hecht (Code Review)
Dan Hecht has uploaded a new patch set (#4).

Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status 
reporting
..

IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting

The PlanFragmentExecutor has some state shared between the main
execution thread and the periodic reporting thread that isn't
synchronized properly.  IMPALA-4504 describes one such problem, and that
bug was introduced in an attempt to fix another similar race.

Let's just simplify all of this and remove this shared state.  Instead,
the profile thread will always be responsible for sending periodic
'!done' messages, and the main execution thread will always be
responsible for sending the final 'done' message (after joining the
periodic thread).

This will allow for even more simplification, in particular the
interaction between FragementExecState and PlanFragementExecutor, but
I'm not doing that now as to avoid more conflicts with the MT work.

Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
---
M be/src/runtime/plan-fragment-executor.cc
M be/src/runtime/plan-fragment-executor.h
2 files changed, 41 insertions(+), 96 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5250
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Dan Hecht 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson