[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 8: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 15 Feb 2018 21:58:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-15 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..

IMPALA-6416: extend Thread::Create to track instance id

This commit builds upon IMPALA-3703. Each thread that
was created through Thread::Create() has a ThreadDebugInfo
object on the stack frame of Thread::SuperviseThread().
This object has stack allocated char buffers that can be
read during a debug session even if we only have minidumps.

However, with the old solution ThreadDebugInfo::instance_id
was set manually for each thread. It is too easy to forget
to set instance_id every time we create a new thread.

This commit has the assumption that if a thread has an
instance id associated, then the threads spawned by it will
always work on the same instance id. In Thread::StartThread
the parent thread passes its ThreadDebugInfo object to
its child who copies the instance id, and also stores the
name and system thread id of its parent.

This means if we set ThreadDebugInfo::instance_id in some
"root thread", then all descendant threads will annotate
themselves with the instance id automatically. Since threads
also record the name (and a system thread id) of their parent,
it might be also possible to reconstruct the thread creation
graph.

With GDB I tested if it copies the instance id at every
place where we previously needed to set it manually.

I added an automated test to thread-debug-info-test.cc

Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Reviewed-on: http://gerrit.cloudera.org:8080/9053
Reviewed-by: Dan Hecht 
Tested-by: Impala Public Jenkins
---
M be/src/common/thread-debug-info-test.cc
M be/src/common/thread-debug-info.cc
M be/src/common/thread-debug-info.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/util/thread.cc
M be/src/util/thread.h
8 files changed, 75 insertions(+), 25 deletions(-)

Approvals:
  Dan Hecht: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 9
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 8: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 8
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 15 Feb 2018 18:14:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-15 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 15 Feb 2018 18:14:38 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-15 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 7:

Fixed the clang-tidy warnings.


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 15 Feb 2018 13:28:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-15 Thread Zoltan Borok-Nagy (Code Review)
Hello Lars Volker, Tim Armstrong, Dan Hecht, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9053

to look at the new patch set (#7).

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..

IMPALA-6416: extend Thread::Create to track instance id

This commit builds upon IMPALA-3703. Each thread that
was created through Thread::Create() has a ThreadDebugInfo
object on the stack frame of Thread::SuperviseThread().
This object has stack allocated char buffers that can be
read during a debug session even if we only have minidumps.

However, with the old solution ThreadDebugInfo::instance_id
was set manually for each thread. It is too easy to forget
to set instance_id every time we create a new thread.

This commit has the assumption that if a thread has an
instance id associated, then the threads spawned by it will
always work on the same instance id. In Thread::StartThread
the parent thread passes its ThreadDebugInfo object to
its child who copies the instance id, and also stores the
name and system thread id of its parent.

This means if we set ThreadDebugInfo::instance_id in some
"root thread", then all descendant threads will annotate
themselves with the instance id automatically. Since threads
also record the name (and a system thread id) of their parent,
it might be also possible to reconstruct the thread creation
graph.

With GDB I tested if it copies the instance id at every
place where we previously needed to set it manually.

I added an automated test to thread-debug-info-test.cc

Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
---
M be/src/common/thread-debug-info-test.cc
M be/src/common/thread-debug-info.cc
M be/src/common/thread-debug-info.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/util/thread.cc
M be/src/util/thread.h
8 files changed, 75 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9053/7
--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 7
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 6: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/1938/


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 15 Feb 2018 00:56:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/1938/


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:11:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 6
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:11:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 5: Code-Review+2

Thanks, I think this makes sense.


-- 
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 21:10:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Zoltan Borok-Nagy (Code Review)
Hello Lars Volker, Tim Armstrong, Dan Hecht,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9053

to look at the new patch set (#5).

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..

IMPALA-6416: extend Thread::Create to track instance id

This commit builds upon IMPALA-3703. Each thread that
was created through Thread::Create() has a ThreadDebugInfo
object on the stack frame of Thread::SuperviseThread().
This object has stack allocated char buffers that can be
read during a debug session even if we only have minidumps.

However, with the old solution ThreadDebugInfo::instance_id
was set manually for each thread. It is too easy to forget
to set instance_id every time we create a new thread.

This commit has the assumption that if a thread has an
instance id associated, then the threads spawned by it will
always work on the same instance id. In Thread::StartThread
the parent thread passes its ThreadDebugInfo object to
its child who copies the instance id, and also stores the
name and system thread id of its parent.

This means if we set ThreadDebugInfo::instance_id in some
"root thread", then all descendant threads will annotate
themselves with the instance id automatically. Since threads
also record the name (and a system thread id) of their parent,
it might be also possible to reconstruct the thread creation
graph.

With GDB I tested if it copies the instance id at every
place where we previously needed to set it manually.

I added an automated test to thread-debug-info-test.cc

Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
---
M be/src/common/thread-debug-info-test.cc
M be/src/common/thread-debug-info.cc
M be/src/common/thread-debug-info.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/util/thread.cc
M be/src/util/thread.h
8 files changed, 74 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9053/5
--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-14 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@115
PS4, Line 115: system_thread_id_
> Are you saying that thread_debug_info_ pointer may point to stray
 > memory?

Yes, I think that can potentially happen.

 > If that's the case, maybe we shouldn't keep that pointer
 > but instead have a way to traverse TDIs and find it if still
 > exists?

We can use the system thread id to find the parent thread and then its TDI if 
it still exists.

I also added a copy of the parent thread name. Even if the parent thread exits, 
at least we will now what was its name.
We can also check if parent thread name equals to the TDI's thread name that we 
found via the system thread id. If they're not the same, we'll know that the 
parent thread was re-cycled in some way.


http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@119
PS4, Line 119:   boost::thread::id boost_thread_id_;
 :   int64_t system_thread_id_ = 0;
> I'm not sure what you mean by "switch to the parent thread
 > quickly".

I was just thinking of a GDB session. GDB prints out the system thread id for 
each thread, so we can traverse the threads quickly if we know the ID. It can 
be also useful if we write a script for that and we don't want to iterate 
through all the threads.

 > I don't think I fully understand the tradeoffs.  Just seems
 > confusing to have multiple IDs and I don't understand how each of
 > these fields helps in debugging.  Perhaps some comments attached to
 > them explaining how one can use them for debugging would help?

I removed boost::thread::id, and added the parent's thread name. I think it is 
more clear, and knowing the parent thread name can be useful on its own. And we 
can also use it to verify the parent thread.



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 14 Feb 2018 17:32:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-12 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@115
PS4, Line 115: system_thread_id_
> I think it is possible that the parent TDI object gets destroyed by the tim
Are you saying that thread_debug_info_ pointer may point to stray memory?  If 
that's the case, maybe we shouldn't keep that pointer but instead have a way to 
traverse TDIs and find it if still exists?


http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@119
PS4, Line 119:   boost::thread::id boost_thread_id_;
 :   int64_t system_thread_id_ = 0;
> boost::thread::id is less prone to being recycled, but I think this propert
I'm not sure what you mean by "switch to the parent thread quickly". 

I don't think I fully understand the tradeoffs.  Just seems confusing to have 
multiple IDs and I don't understand how each of these fields helps in 
debugging.  Perhaps some comments attached to them explaining how one can use 
them for debugging would help?



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 12 Feb 2018 19:45:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-12 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@115
PS4, Line 115: system_thread_id_
> if that matches thread_debug_info_->system_thread_id_, why store it here al
I think it is possible that the parent TDI object gets destroyed by the time 
the minidump is generated, but the parent thread is still running.

Maybe the parent thread is a member of a thread pool, and it didn't wait for 
its child thread to finish and now it is already working on something else. 
From the system thread id at least we'll know which thread pool created the 
child thread.


http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@119
PS4, Line 119:   boost::thread::id boost_thread_id_;
 :   int64_t system_thread_id_ = 0;
> are these related? are they both needed for debugging or is it possible to
boost::thread::id is less prone to being recycled, but I think this property 
doesn't add much value currently.

If we had stored the boost::thread::id in ParentInfo, then we could have some 
use of it. E.g. if the parent TDI is invalid we could use the system thread id 
to switch to the parent thread quickly (if it's still running), then with some 
luck we can check the boost thread id (maybe with the help of a new TDI), and 
from that we could tell if this thread is really the same thread that created 
the child.

So, we have two options here, delete the use of boost::thread::id, or use it 
more extensively by adding it to ParentInfo as well. Do you have a preference?



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 12 Feb 2018 13:35:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-09 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@115
PS4, Line 115: system_thread_id_
if that matches thread_debug_info_->system_thread_id_, why store it here also?


http://gerrit.cloudera.org:8080/#/c/9053/4/be/src/common/thread-debug-info.h@119
PS4, Line 119:   boost::thread::id boost_thread_id_;
 :   int64_t system_thread_id_ = 0;
are these related? are they both needed for debugging or is it possible to 
standardize on one?



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 09 Feb 2018 21:47:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 4: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info.h@114
PS3, Line 114: const ThreadDebugInfo* thread_debug_info_ = nullptr;
> Could you elaborate more on that?
You're right, I overlooked that the strings have a fixed length.



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 09 Feb 2018 17:00:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-09 Thread Zoltan Borok-Nagy (Code Review)
Hello Lars Volker, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9053

to look at the new patch set (#4).

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..

IMPALA-6416: extend Thread::Create to track instance id

This commit builds upon IMPALA-3703. Each thread that
was created through Thread::Create() has a ThreadDebugInfo
object on the stack frame of Thread::SuperviseThread().
This object has stack allocated char buffers that can be
read during a debug session even if we only have minidumps.

However, with the old solution ThreadDebugInfo::instance_id
was set manually for each thread. It is too easy to forget
to set instance_id every time we create a new thread.

This commit has the assumption that if a thread has an
instance id associated, then the threads spawned by it will
always work on the same instance id. In Thread::StartThread
the parent thread passes its ThreadDebugInfo object to
its child who copies the instance id, and also stores a
pointer and system thread id to its parent.

This means if we set ThreadDebugInfo::instance_id in some
"root thread", then all descendant threads will annotate
themselves with the instance id automatically. Since threads
also record a pointer (and a system thread id) to their parent,
it might be also possible to reconstruct the thread creation
graph.

With GDB I tested if it copies the instance id at every
place where we previously needed to set it manually.

I added an automated test to thread-debug-info-test.cc

Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
---
M be/src/common/thread-debug-info-test.cc
M be/src/common/thread-debug-info.cc
M be/src/common/thread-debug-info.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/util/thread.cc
M be/src/util/thread.h
8 files changed, 82 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9053/4
--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-09 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 3:

(4 comments)

thanks!

http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info-test.cc
File be/src/common/thread-debug-info-test.cc:

http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info-test.cc@70
PS3, Line 70: S
> nit: lowercase S
woops :) Done.


http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info-test.cc@72
PS3, Line 72:   // Child should set its parent_thread_name_ to the parent's 
thread_name_
> Stale comment?
yes, deleted it and added comments about the new behavior


http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info-test.cc@75
PS3, Line 75: parent
> parent_name for consistency with parent_* below? That would make the EXPECT
Renamed it to parent_name.
Also renamed 'child' below to 'child_name'.


http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info.h@114
PS3, Line 114: const ThreadDebugInfo* thread_debug_info_ = nullptr;
> Nit: I'm not sure how helpful it would be, but you could order everything i
Could you elaborate more on that?
To my understanding all data member has fix width. However, we could think of 
the fix-sized arrays in a way that they contain variable-length data. But, they 
are (instance_id_ and thread_name_) already at the end of the TDI class.



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 09 Feb 2018 16:47:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-07 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 3:

(4 comments)

Mostly looks good to me, had a few minor comments.

http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info-test.cc
File be/src/common/thread-debug-info-test.cc:

http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info-test.cc@70
PS3, Line 70: S
nit: lowercase S


http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info-test.cc@72
PS3, Line 72:   // Child should set its parent_thread_name_ to the parent's 
thread_name_
Stale comment?


http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info-test.cc@75
PS3, Line 75: parent
parent_name for consistency with parent_* below? That would make the EXPECT_EQ 
easier to read, too.


http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/3/be/src/common/thread-debug-info.h@114
PS3, Line 114: const ThreadDebugInfo* thread_debug_info_ = nullptr;
Nit: I'm not sure how helpful it would be, but you could order everything in 
this class and the parent to have fixed width data come before variable length 
data. That way it is possible to pry it out of the binary with a debugger. Feel 
free to ignore if you think that's unnecessary.



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Wed, 07 Feb 2018 18:29:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-02 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@79
PS2, Line 79: ExtractInfoFromParent
> Extract... reads like something will be returned. How about "SetParentInfo"
yeah I like that. Done.


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@80
PS2, Line 80: if (parent != nullptr) {
> We usually early return instead of scoping the whole function, i.e. if (par
Done


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@93
PS2, Line 93: t
> nit: capital T
Done


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@121
PS2, Line 121:   struct ParentTdi {
> Can you add a comment to this struct, too? The test has a ThreadDebugInfo p
Added comment, and renamed it to "ParentInfo".


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@124
PS2, Line 124: char thread_name_[THREAD_NAME_SIZE] = {};
> I wonder if we can streamline the selective duplication of the parent threa
I chose the first alternative, i.e. to store the system thread id and a pointer 
to the parent TDI.

If it is a common case that the parent exits, we can switch to the second 
alternative, but also adding a pointer to the parent for convenience.


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@128
PS2, Line 128:
> nit: I think this newline and the next one don't add much to readability.
Done


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/util/thread.h
File be/src/util/thread.h:

http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/util/thread.h@191
PS2, Line 191: const ThreadDebugInfo* parent_thread_info
> This should go to the front (behind category or functor) now since it's str
oh I see. I placed it behind functor.



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 02 Feb 2018 14:38:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-02 Thread Zoltan Borok-Nagy (Code Review)
Hello Lars Volker, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9053

to look at the new patch set (#3).

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..

IMPALA-6416: extend Thread::Create to track instance id

This commit builds upon IMPALA-3703. Each thread that
was created through Thread::Create() has a ThreadDebugInfo
object on the stack frame of Thread::SuperviseThread().
This object has stack allocated char buffers that can be
read during a debug session even if we only have minidumps.

However, with the old solution ThreadDebugInfo::instance_id
was set manually for each thread. It is too easy to forget
to set instance_id every time we create a new thread.

This commit has the assumption that if a thread has an
instance id associated, then the threads spawned by it will
always work on the same instance id. In Thread::StartThread
the parent thread passes its ThreadDebugInfo object to
its child who copies the instance id and also saves its
parent's name.

This means if we set ThreadDebugInfo::instance_id in some
"root thread", then all descendant threads will annotate
themselves with the instance id automatically. Since threads
also record their parent's name, it might be also possible
to reconstruct the thread creation graph.

With GDB I tested if it copies the instance id at every
place where we previously needed to set it manually.

I added an automated test to thread-debug-info-test.cc

Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
---
M be/src/common/thread-debug-info-test.cc
M be/src/common/thread-debug-info.cc
M be/src/common/thread-debug-info.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/util/thread.cc
M be/src/util/thread.h
8 files changed, 81 insertions(+), 25 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9053/3
--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-01 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 2:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@79
PS2, Line 79: ExtractInfoFromParent
Extract... reads like something will be returned. How about "SetParentInfo"?


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@80
PS2, Line 80: if (parent != nullptr) {
We usually early return instead of scoping the whole function, i.e. if (parent 
== nullptr) return;


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@93
PS2, Line 93: t
nit: capital T


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@121
PS2, Line 121:   struct ParentTdi {
Can you add a comment to this struct, too? The test has a ThreadDebugInfo 
parent_tdi. Maybe "ParentThreadInfo" or "ParentInfo" would be more clear?


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@124
PS2, Line 124: char thread_name_[THREAD_NAME_SIZE] = {};
I wonder if we can streamline the selective duplication of the parent thread's 
data here.

Could we instead store two things: a numerical ID for minidump based analysis 
that allows to find the next thread to inspect. This would likely be the system 
thread id. Secondly we could store a pointer to the parent thread's info. 
However, that would break if the parent exits before the child.

As an alternative, can we wrap the relevant fields in some internal struct and 
then have two members, "S thread_info;" and "S parent_info;"? This way we keep 
all data twice, but we know for sure that we can traverse the thread ancestry 
two levels deep, even if the parent has exited.


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/common/thread-debug-info.h@128
PS2, Line 128:
nit: I think this newline and the next one don't add much to readability.


http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/util/thread.h
File be/src/util/thread.h:

http://gerrit.cloudera.org:8080/#/c/9053/2/be/src/util/thread.h@191
PS2, Line 191: const ThreadDebugInfo* parent_thread_info
This should go to the front (behind category or functor) now since it's 
strictly an input parameter and we order them as inputs, then output.



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 01 Feb 2018 22:54:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-01 Thread Zoltan Borok-Nagy (Code Review)
Hello Lars Volker, Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/9053

to look at the new patch set (#2).

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..

IMPALA-6416: extend Thread::Create to track instance id

This commit builds upon IMPALA-3703. Each thread that
was created through Thread::Create() has a ThreadDebugInfo
object on the stack frame of Thread::SuperviseThread().
This object has stack allocated char buffers that can be
read during a debug session even if we only have minidumps.

However, with the old solution ThreadDebugInfo::instance_id
was set manually for each thread. It is too easy to forget
to set instance_id every time we create a new thread.

This commit has the assumption that if a thread has an
instance id associated, then the threads spawned by it will
always work on the same instance id. In Thread::StartThread
the parent thread passes its ThreadDebugInfo object to
its child who copies the instance id and also saves its
parent's name.

This means if we set ThreadDebugInfo::instance_id in some
"root thread", then all descendant threads will annotate
themselves with the instance id automatically. Since threads
also record their parent's name, it might be also possible
to reconstruct the thread creation graph.

With GDB I tested if it copies the instance id at every
place where we previously needed to set it manually.

I added an automated test to thread-debug-info-test.cc

Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
---
M be/src/common/thread-debug-info-test.cc
M be/src/common/thread-debug-info.cc
M be/src/common/thread-debug-info.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/util/thread.cc
M be/src/util/thread.h
8 files changed, 99 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9053/2
--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-02-01 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 1:

(6 comments)

Thanks for the comments.
I checked the callsites of Thread::Create() and the name is always set 
explicitly. If it weren't, then Thread::SuperviseThread() would create a unique 
thread name with the system thread id.
Therefore, as far as parent threads outlive their child threads we can restore 
the full thread trees.
I think this "2 gen by Instance Worker Foo" might be more relevant for the 
general purpose thread pools where we just submit work items and don't set any 
names at all.

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/common/thread-debug-info.h@70
PS1, Line 70:   void ExtractInfoFromParent(const ThreadDebugInfo* parent) {
> Have you considered storing the parent thread ID (possibly both system and
Yeah it's a good idea. I created a nested struct called ParentTdi for the 
parent fields, to keep things organized.


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/common/thread-debug-info.h@71
PS1, Line 71: DCHECK(parent != nullptr);
> Is there a benefit of not allowing nullptr vs. allowing and ignoring it? Th
No, I think we can allow and ignore nullptrs.


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/exec/hdfs-scan-node.cc@a353
PS1, Line 353:
> My understanding is that this is no longer needed because the information i
Yes. I checked all these places with GDB to test if instance id propagates to 
the child TDI.


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.h
File be/src/util/thread.h:

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.h@183
PS1, Line 183:   static void SuperviseThread(const string& name, const string& 
category,
> Our convention is usually to keep the std:: namespace prefix in header file
Oops, I think I've copied the new signature from the .cc

I got wonder how did it compile, then I realized that I've included 
common/names.h in thread-debug-info.h. Removed the include.


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.h@185
PS1, Line 185: parent_thread_info
> The comment should explain what this parameter does (and whether it can be
Added comment, and also made the parameter a pointer to const.


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.cc
File be/src/util/thread.cc:

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.cc@350
PS1, Line 350:   if (parent_thread_info) 
thread_debug_info.ExtractInfoFromParent(parent_thread_info);
> We usually make the comparison to != nullptr explicit to make it more clear
Removed the check since ExtractInfoFromParent() now accepts nullptrs as well. 
And since it is an inline function we don't save the cost of a function call 
with this check.



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Thu, 01 Feb 2018 13:28:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-01-24 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9053 )

Change subject: IMPALA-6416: extend Thread::Create to track instance id
..


Patch Set 1:

(6 comments)

I left some inline comments. I feel it would be better to find a way to 
propagate thread names indefinitely, compared to propagating them for one 
generation (which is my understanding of this patch). Currently it allows us to 
remove the call to the explicit setter in the scan node, but one generation 
further the same mechanism will not work anymore. Maybe we could think of 
deriving names automatically (if they are not explicitly set), e.g. by 
prepending the parent name with an increasing counter. If "Instance Worker Foo" 
creates a child, then its name could be "1 gen by Instance Worker Foo", and if 
this child spawns a thread w/out an explicit name, it could be called "2 gen by 
Instance Worker Foo" or similar. This way we could programatically capture full 
thread trees.

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/common/thread-debug-info.h
File be/src/common/thread-debug-info.h:

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/common/thread-debug-info.h@70
PS1, Line 70:   void ExtractInfoFromParent(const ThreadDebugInfo* parent) {
Have you considered storing the parent thread ID (possibly both system and 
boost) in the ThreadDebugInfo, too? While every thread had its own Info it 
seemed superflous to store it again, but for the parent threads it might be 
helpful, especially when thinking of programmatic ways to analyze the thread 
graph.


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/common/thread-debug-info.h@71
PS1, Line 71: DCHECK(parent != nullptr);
Is there a benefit of not allowing nullptr vs. allowing and ignoring it? The 
latter could make the calling code a bit more concise, but I may be missing 
something.

If you keep the requirement that parent != nullptr, you could consider making 
this a static factory that creates and pre-populates a TDI.


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/exec/hdfs-scan-node.cc@a353
PS1, Line 353: 
My understanding is that this is no longer needed because the information is 
now kept in the parent TDI. Is this correct?


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.h
File be/src/util/thread.h:

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.h@183
PS1, Line 183:   static void SuperviseThread(const string& name, const string& 
category,
Our convention is usually to keep the std:: namespace prefix in header files 
but drop it in .cc files.


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.h@185
PS1, Line 185: parent_thread_info
The comment should explain what this parameter does (and whether it can be 
null), especially since we usually pass input parameters as const& and output 
parameters as non-const ptrs.


http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.cc
File be/src/util/thread.cc:

http://gerrit.cloudera.org:8080/#/c/9053/1/be/src/util/thread.cc@350
PS1, Line 350:   if (parent_thread_info) 
thread_debug_info.ExtractInfoFromParent(parent_thread_info);
We usually make the comparison to != nullptr explicit to make it more clear 
what is happening.



--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 25 Jan 2018 00:24:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6416: extend Thread::Create to track instance id

2018-01-18 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/9053


Change subject: IMPALA-6416: extend Thread::Create to track instance id
..

IMPALA-6416: extend Thread::Create to track instance id

This commit builds upon IMPALA-3703. Each thread that
was created through Thread::Create() has a ThreadDebugInfo
object on the stack frame of Thread::SuperviseThread().
This object has stack allocated char buffers that can be
read during a debug session even if we only have minidumps.

However, with the old solution ThreadDebugInfo::instance_id
was set manually for each thread. It is too easy to forget
to set instance_id every time we create a new thread.

This commit has the assumption that if a thread has an
instance id associated, then the threads spawned by it will
always work on the same instance id. In Thread::StartThread
the parent thread passes its ThreadDebugInfo object to
its child who copies the instance id and also saves its
parent's name.

This means if we set ThreadDebugInfo::instance_id in some
"root thread", then all descendant threads will annotate
themselves with the instance id automatically. Since threads
also record their parent's name, it might be also possible
to reconstruct the thread creation graph.

With GDB I tested if it copies the instance id at every
place where we previously needed to set it manually.

I added an automated test to thread-debug-info-test.cc

Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
---
M be/src/common/thread-debug-info-test.cc
M be/src/common/thread-debug-info.cc
M be/src/common/thread-debug-info.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/hdfs-scan-node.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/util/thread.cc
M be/src/util/thread.h
8 files changed, 65 insertions(+), 28 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/53/9053/1
--
To view, visit http://gerrit.cloudera.org:8080/9053
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I27de2962cf0b224c17b685d77dcba3bf2e9db187
Gerrit-Change-Number: 9053
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy