[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 12: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 12: Code-Review+2 Rebase. Promote +1 to +2. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 12 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 11: Code-Review+1 (1 comment) Carry +1 http://gerrit.cloudera.org:8080/#/c/5250/11/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 83: per_host_mem_usage_(NULL), later patchset depended on this being NULL but it was left uninitialized which can sometimes lead to a crash, so fixing this. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#11). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 67 insertions(+), 121 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/11 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 11 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: PS9, Line 94: prepared_promise_.Set(status); > How about moving this above line 93, so anyone waiting on it won't have to Yeah, I thought the semantics of the promise were clearer to do it last (it's a barrier to the completion of Prepare()), but I don't feel too strongly about it so I've reversed it. http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: PS9, Line 109: > "Open() fails" Done -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#10). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 63 insertions(+), 121 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/10 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 10 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Henry Robinson has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 9: Code-Review+1 (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: PS9, Line 94: prepared_promise_.Set(status); How about moving this above line 93, so anyone waiting on it won't have to wait for the report RPC to complete? http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: PS9, Line 109: "Open() fails" -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 9: (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/6/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 93: if (!status.ok()) FragmentComplete(status); > Yeah, but that's also how the old code worked. It's benign since in case o Changed this up a bit so we still call FragmentComplete() here. http://gerrit.cloudera.org:8080/#/c/5250/9/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 488: DCHECK(!report_thread_active_); this patch was mostly a rebase, but also added these DCHECKs. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#9). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 62 insertions(+), 120 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/9 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 9 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 8: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#8). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 59 insertions(+), 120 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/8 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 8 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 7: (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/7/be/src/service/fragment-exec-state.cc File be/src/service/fragment-exec-state.cc: PS7, Line 51: prepare_promise_ > Can't we just use the PFEs prepared_promise_ ? They both track the same val Yes, if we expose a timeout interface from PFE. Given this is preexisting and these classes will probably need major refactoring for MT soon, I'd rather not muck with it now. PS7, Line 62: DCHECK(status.ok() || done); // if !status.ok() => done > Redundant with SendReport() Yes, but it is a precondition to both, and since this is the dcheck that caught the bug in the first place, I'm inclined to leave it. (there's no reason there can't be other callers to this method, and SendReport() doesn't always call this-- well it does after the coordinator fragment change, but the code hasn't been fully cleaned up to reflect that yet). -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 6: (2 comments) http://gerrit.cloudera.org:8080/#/c/5250/6/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 93: if (!status.ok() && !report_status_cb_.empty()) { > There's a bug here. If PrepareInternal() fails after AcquireThreadToken(), Yeah, but that's also how the old code worked. It's benign since in case of failure the resource pool will go away shortly, but I agree it's best to address this (and it's the original reason why I made Exec() always call FragmentComplete(), whereas it used to not on failure). PS6, Line 94: // FragmentComplete() depends on Prepare() having executed successfully, so must : // call report_status_cb_ directly if Prepare() failed. > Is this only because of the usage of per_host_mem_usage_ in SendReport()? I Also the profile() is in an "undefined" state. Let me fix this up. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/5250/6/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: Line 93: if (!status.ok() && !report_status_cb_.empty()) { There's a bug here. If PrepareInternal() fails after AcquireThreadToken(), the token will never be released. PS6, Line 94: // FragmentComplete() depends on Prepare() having executed successfully, so must : // call report_status_cb_ directly if Prepare() failed. Is this only because of the usage of per_host_mem_usage_ in SendReport()? If so, you can just check if that is 'nullptr' before using it there. Or we can call ReleaseThreadToken() here. Whatever seems better. http://gerrit.cloudera.org:8080/#/c/5250/7/be/src/service/fragment-exec-state.cc File be/src/service/fragment-exec-state.cc: PS7, Line 51: prepare_promise_ Can't we just use the PFEs prepared_promise_ ? They both track the same value. PS7, Line 62: DCHECK(status.ok() || done); // if !status.ok() => done Redundant with SendReport() -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 7: > Uploaded patch set 7. Fixed a comment about cancellation which, I don't think is true. And added comments to document the lifecycle requirements. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#7). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 58 insertions(+), 118 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/7 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 5: (5 comments) Henry, could you take a look at patchset 6? The changes to address your comments were non-trivial to please re-review. http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: PS4, Line 294: VLOG_QUERY << "Open(): instance_id=" << runtime_state_->fragment_instance_id(); : S > nit: one line Done PS4, Line 298: status; > here it seems clearer to return status. Oops, yes, done. PS4, Line 331: { > I'm still of the opinion that this logic is better placed in FragmentExecSt I see what you mean now. Yeah, made that change, and also moved the callback invocation for when prepare fails into this class, so that at least it's consistent. I still think we should remove the callback indirection and simplify more, but agree this is a better intermediate state. PS4, Line 447: > the Done http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: PS4, Line 249: > Nit: extra space. Done -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#6). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 50 insertions(+), 113 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/6 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Hello Henry Robinson, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/5250 to look at the new patch set (#5). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h M be/src/service/fragment-exec-state.cc M be/src/service/fragment-exec-state.h 4 files changed, 46 insertions(+), 113 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/5 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Henry Robinson has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 4: Code-Review+1 (7 comments) http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: PS4, Line 294: if (!status.ok()) { : nit: one line PS4, Line 298: opened_promise_.Get() here it seems clearer to return status. PS4, Line 331: RETURN_IF_ERROR(opened_promise_.Get()); I'm still of the opinion that this logic is better placed in FragmentExecState, e.g.: if (fragment->Open().OK()) { fragment->Exec(); } fragment->Close(); and then here you could just DCHECK(opened_promise.IsSet() && opened_promise_.Get().OK()); which is a simpler lifecycle invariant. That's in keeping with how the PFE lifecycle is managed now, and it seems to make sense to have all that logic in one place. For example, we don't have the same logic in Open() wrt prepare_promise_, which we should do for consistency with this method as implemented here. PS4, Line 447: that the Line 489: how about DCHECK(!report_thread_active_) here? I'd like to get some guarantee that FragmentComplete() was called in all executions it should have been before Close(). http://gerrit.cloudera.org:8080/#/c/5250/3/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: PS3, Line 66: setting that flag to 0 disables periodic reporting altogether > Isn't that what the next sentence is saying? Or am I misunderstanding your Oh yeah. I was thrown off by 'disables periodic reporting altogether', which while correct was confusing. http://gerrit.cloudera.org:8080/#/c/5250/4/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: PS4, Line 249: Nit: extra space. -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 4: Henry, I tried doing some of the refactoring we discussed, but they don't seem like clear improvements to me because of how intertwined FES and PFE currently are. For example, FES::ReportStatusCb() accesses PFE state directly and calls PFE methods like PFE::Cancel(), and so executing that code from either inside or after PFE::Close() seems confusing at best. I do agree further cleanup here between FES and PFE classes is needed, but I think without a more wholistic cleanup we don't get anywhere better. And I think the cleanup needs to be driven by the needs of MT or we will just redo it shortly. So, I'd like to move forward with the fix in current form, which I think is still a nice improvement over the existing code since we get rid of some unnecessary thread-shared state and complexity. Okay with you? -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/5250/3/be/src/runtime/plan-fragment-executor.cc File be/src/runtime/plan-fragment-executor.cc: PS3, Line 421: id PlanFragmentExecutor::Sen nit: This is in 2 places now. Here and ReportStatusCb(). -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Sailesh Mukil Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has posted comments on this change. Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. Patch Set 3: (1 comment) Patchset 4 addresses the comments that don't involve refactoring, in case we want to use this option. I'll try some minor refactoring in the next patchset. http://gerrit.cloudera.org:8080/#/c/5250/3/be/src/runtime/plan-fragment-executor.h File be/src/runtime/plan-fragment-executor.h: PS3, Line 66: setting that flag to 0 disables periodic reporting altogether > Update comment to say that at least one report (the final one) will always Isn't that what the next sentence is saying? Or am I misunderstanding your comment? -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting
Dan Hecht has uploaded a new patch set (#4). Change subject: IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting .. IMPALA-4504: fix races in PlanFragmentExecutor regarding status reporting The PlanFragmentExecutor has some state shared between the main execution thread and the periodic reporting thread that isn't synchronized properly. IMPALA-4504 describes one such problem, and that bug was introduced in an attempt to fix another similar race. Let's just simplify all of this and remove this shared state. Instead, the profile thread will always be responsible for sending periodic '!done' messages, and the main execution thread will always be responsible for sending the final 'done' message (after joining the periodic thread). This will allow for even more simplification, in particular the interaction between FragementExecState and PlanFragementExecutor, but I'm not doing that now as to avoid more conflicts with the MT work. Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e --- M be/src/runtime/plan-fragment-executor.cc M be/src/runtime/plan-fragment-executor.h 2 files changed, 41 insertions(+), 96 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/5250/4 -- To view, visit http://gerrit.cloudera.org:8080/5250 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I052b7b4fabb341ad27ad294cd5b0a53728d87d0e Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Dan HechtGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Henry Robinson