[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

The hotspot JVM also uses SIGUSR1 internally. However the documentation
explains, that existing signal handlers will be transparently wrapped by
the JVM and no spurious signals should be received by the daemon signal
handler:
http://www.oracle.com/technetwork/java/javase/signals-139944.html

Example: killall -SIGUSR1 catalogd

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Reviewed-on: http://gerrit.cloudera.org:8080/3312
Reviewed-by: Lars Volker 
Tested-by: Internal Jenkins
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 74 insertions(+), 10 deletions(-)

Approvals:
  Lars Volker: Looks good to me, approved
  Internal Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-09 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 9: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 9: Code-Review+2

+2 from Dan. Thanks for the reviews.

-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-09 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3312/8/be/src/util/minidump.cc
File be/src/util/minidump.cc:

Line 94: static void SetupSignalHandler() {
> DCHECK(minidump_exception_handler != NULL);
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-09 Thread Lars Volker (Code Review)
Hello David Knupp, Dan Hecht, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3312

to look at the new patch set (#9).

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..

IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

The hotspot JVM also uses SIGUSR1 internally. However the documentation
explains, that existing signal handlers will be transparently wrapped by
the JVM and no spurious signals should be received by the daemon signal
handler:
http://www.oracle.com/technetwork/java/javase/signals-139944.html

Example: killall -SIGUSR1 catalogd

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 74 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/3312/9
-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 9
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-08 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3312/8/be/src/util/minidump.cc
File be/src/util/minidump.cc:

Line 94: static void SetupSignalHandler() {
DCHECK(minidump_exception_handler != NULL);


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-08 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 8: Code-Review+1

(1 comment)

I think someone else (other than me) needs to give a +2 because of the .cc 
file, yes?

http://gerrit.cloudera.org:8080/#/c/3312/7/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

Line 117:   def wait_for_num_processes(self, daemon, num_expected, timeout=60):
> Yes, my original assumption was that the time we need to write all minidump
Sounds good to me.


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-08 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 7:

(2 comments)

Thanks for the reviews, please see PS8.

http://gerrit.cloudera.org:8080/#/c/3312/7/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

Line 117:   def wait_for_num_processes(self, daemon, num_expected, timeout=60):
> This method seems to be doing a bit more than it needs to, which I'm mainly
Yes, my original assumption was that the time we need to write all minidumps 
should never exceed 5 seconds (which was an more of an  educated guess). The 
number of expected processes is at most cluster_size, which is the number of 
local impalads running on the same machine. At 100MB/sec and 2.5MB per file we 
should be able to write about 200 minidumps in 5 seconds, so that looks safe to 
me. However there could be unknown things affecting this time.

Anyways, now there's a wait loop in place. Should I modify it to scale the 
timeout with num_expected. If you're ok I'd leave it at 5sec, as I'd be very 
interested to have pop up as a failure if it ever takes longer than this.


PS7, Line 173: 3
> This could still be replaced with 'cluster_size', right?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-08 Thread Lars Volker (Code Review)
Hello David Knupp, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3312

to look at the new patch set (#8).

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..

IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

The hotspot JVM also uses SIGUSR1 internally. However the documentation
explains, that existing signal handlers will be transparently wrapped by
the JVM and no spurious signals should be received by the daemon signal
handler:
http://www.oracle.com/technetwork/java/javase/signals-139944.html

Example: killall -SIGUSR1 catalogd

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 73 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/3312/8
-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 8
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-08 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3312/7/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

Line 117:   def wait_for_num_processes(self, daemon, num_expected, timeout=60):
This method seems to be doing a bit more than it needs to, which I'm mainly 
deriving from the fact that it raises two different kinds of exceptions (a 
RuntimeException, and an AssertionError) for essentially the same reason -- 
finding the wrong number of processes.

Since you're already calling calling this method within the context of an 
assert statement (down in your test function) I would consider simplifying 
wait_for_num_processes() so that it merely returns the number of processes it 
finds after trying for the allotted timeout. Then you could give the *caller* 
the responsibility of deciding what to do with that information, e.g., in your 
test, you could do something like:


   assert self.wait_for_num_processes(daemon='impalad', 
num_expected=cluster_size) == cluster_size


Also, for what it's worth, this solution kind of misses the intent of my 
original comment/question, since it basically still just introduces a static 
sleep time (although it does have the possibility of returning sooner if the 
terminating condition of the while loop is met.)

My original question (which no doubt stems from my inexperience with all of the 
testing permutations here) was really more about finding out whether a single 
static timeout would be valid in all possible test scenarios. In other words, 
will we be running this test on varying-sized clusters, or clusters that might 
be deployed in the cloud, versus locally, versus as a mini-cluster on a single 
machine? Does the size of the cluster or where it's deployed even matter? I 
didn't know.

For what it's worth, that aspect could be addressed easily by making the 
default timeout a factor of cluster_size if None is given, e.g.:


  def wait_for_num_processes(self, daemon, num_expected, timeout=None):
  if timeout is None:
  timeout = 5 * num_expected
  end = time.time() + timeout
  etc...


...or something similar.

I'm sorry if this all became more complicated than it deserved to be.


PS7, Line 173: 3
This could still be replaced with 'cluster_size', right?


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-08 Thread Lars Volker (Code Review)
Hello David Knupp, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3312

to look at the new patch set (#7).

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..

IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

The hotspot JVM also uses SIGUSR1 internally. However the documentation
explains, that existing signal handlers will be transparently wrapped by
the JVM and no spurious signals should be received by the daemon signal
handler:
http://www.oracle.com/technetwork/java/javase/signals-139944.html

Example: killall -SIGUSR1 catalogd

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 78 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/3312/7
-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 7
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-08 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3312/6/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS6, Line 121: while time.time() < end:
This loop also needs to sleep some instead of spinning.


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 6: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 6:

Please ignore PS5 and see PS6 instead.

-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3312/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS3, Line 149: time.sleep(5)
> Oops -- sorry! I didn't see Tim's comment. If there's consensus that adding
I added a timed loop in PS5. It was a bit of a hassle, as the breakpad forks to 
write minidumps on request, so there were extra processes in the way, even when 
the files had already started to show up. I ended up waiting on the number or 
processes instead, and found IMPALA-3898 on the way. :)


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread Lars Volker (Code Review)
Hello David Knupp, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3312

to look at the new patch set (#6).

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..

IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

The hotspot JVM also uses SIGUSR1 internally. However the documentation
explains, that existing signal handlers will be transparently wrapped by
the JVM and no spurious signals should be received by the daemon signal
handler:
http://www.oracle.com/technetwork/java/javase/signals-139944.html

Example: killall -SIGUSR1 catalogd

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 76 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/3312/6
-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread Lars Volker (Code Review)
Hello David Knupp, Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3312

to look at the new patch set (#5).

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..

IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

The hotspot JVM also uses SIGUSR1 internally. However the documentation
explains, that existing signal handlers will be transparently wrapped by
the JVM and no spurious signals should be received by the daemon signal
handler:
http://www.oracle.com/technetwork/java/javase/signals-139944.html

Example: killall -SIGUSR1 catalogd

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 77 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/3312/5
-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3312/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS3, Line 149: time.sleep(5)
> If it's not worth the effort, then I agree - even if it's not ideal, leave 
Oops -- sorry! I didn't see Tim's comment. If there's consensus that adding a 
loop is the right thing to do, then that's obviously fine with me too.


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 4: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/3312/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS3, Line 143: assert self.count_all_minidumps() == 0
> We use self.tmp_dir to write the minidumps to, so it should always be empty
I think it's probably fine to leave as is -- I do see that most of the tests in 
this module follow this same pattern.

For what it's worth, in my previous experience, I've tended to see assertions 
used for failures to meet the passing criteria of the test expectations, and 
other kinds of exceptions for setup / framework / environment errors that had 
nothing to do with the product under test. (e.g. the test framework I worked 
with in the past had its own custom exceptions library for this kind of thing.)

As a general practice, this check probably is a little more defensive than 
necessary. Assuming the call to tempfile.mkdtemp() works, there's no reason to 
believe it would have anything other than 0 files in it. If there were a 
problem creating the temp dir, one presumes that the tempfile module would 
itself throw an exception before this line ever executes.


PS3, Line 149: time.sleep(5)
> I agree with you, this is not optimal. For killing the processes we already
If it's not worth the effort, then I agree - even if it's not ideal, leave it 
as is until it poses a problem.


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3312/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS3, Line 149: time.sleep(5)
> I agree with you, this is not optimal. For killing the processes we already
I think David makes a good point. I think it's less effort to add the retry 
loop now than it is to debug a flaky test in the future, especially if we run 
into problems with code coverage.


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 3:

(3 comments)

Thank you David for the review. Let me know if those need more explanation in 
the comments.

http://gerrit.cloudera.org:8080/#/c/3312/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS3, Line 143: assert self.count_all_minidumps() == 0
> Is raising an AssertionError the right thing to do if count_all_minidumps()
We use self.tmp_dir to write the minidumps to, so it should always be empty at 
the beginning of a test. I thought of these assertions more as a way to 
document what the expectations of the tests are, plus a last line of defense in 
case something goes wrong, or someone disables using a temporary directory. 
They should never fail, and if they do we can investigate why. Should I add a 
comment or a log message to explain this?


PS3, Line 149: time.sleep(5)
> Bearing in mind that I don't know really anything about how long it takes t
I agree with you, this is not optimal. For killing the processes we already 
have a timed wait loop, and they can sometimes take a long time to terminate, 
especially when writing core dumps, too. Minidumps of short-lived daemons are 
usually around 2.5MB large, so I expect them to be written fairly quickly and 
would be surprised if it ever took longer than 5 seconds. I also thought about 
adding a wait loop here, but then I'd set it to 5 seconds still, just so we get 
notified in case this takes longer. Therefore I figured it's not worth the 
effort for now, and that we should add a loop if 5s ever becomes a problem. Do 
you think I should elaborate on this in a comment?


PS3, Line 152: assert len(self.cluster.impalads) == 3
> Should we always assume that the number of impalads is 3? Since you saved t
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-07 Thread Lars Volker (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/3312

to look at the new patch set (#4).

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..

IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

The hotspot JVM also uses SIGUSR1 internally. However the documentation
explains, that existing signal handlers will be transparently wrapped by
the JVM and no spurious signals should be received by the daemon signal
handler:
http://www.oracle.com/technetwork/java/javase/signals-139944.html

Example: killall -SIGUSR1 catalogd

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 53 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/3312/4
-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-06 Thread David Knupp (Code Review)
David Knupp has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3312/3/tests/custom_cluster/test_breakpad.py
File tests/custom_cluster/test_breakpad.py:

PS3, Line 143: assert self.count_all_minidumps() == 0
Is raising an AssertionError the right thing to do if count_all_minidumps() is 
not 0 at the very start of the test? Does that just mean that there are 
minidump artifacts left over from a previous test before this one has even 
started? If that's so, it doesn't seem to really be a case of SIGUSR1 failing 
to write minidumps.

Perhaps there should be some kind cleanup process we could implement instead -- 
to clear out any *.dmp files before calling start_cluster()? If that were 
written as, say, a local test fixture or setup method, then perhaps all of the 
tests in this module could use it?


PS3, Line 149: time.sleep(5)
Bearing in mind that I don't know really anything about how long it takes to 
write a minidump, I'm always a bit leery when I see a hard-coded sleep in a 
test. It generally tends to be something of a test anti-pattern. E.g., if we 
ever start to run tests on a much larger cluster, would 5 still suffice? Or is 
5 just such an imaginably long period time that it's guaranteed to always work?

Would it make sense to move the various asserts involving count_minidumps() 
here -- since this is where we expect them to be created -- and instead use 
some kind of loop to keep checking until self.count_minidumps('impalad') 
returns cluster_size, setting some limit to the number of iterations before we 
force the test to fail?

Just asking, by the way. Maybe that's more trouble than it's worth. I'm 
perfectly happy to accept that 5 is the right answer.


PS3, Line 152: assert len(self.cluster.impalads) == 3
Should we always assume that the number of impalads is 3? Since you saved the 
cluster_size to a variable in an earlier line before sending the SIGUSR1, 
wouldn't it work here to assert that len(self.cluster.impalads) is the same 
after the refresh as whatever cluster_size was before?


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: David Knupp 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/3312/2//COMMIT_MSG
Commit Message:

Line 13: Example: killall -SIGUSR1 catalogd
> True. This has happened to me multiple times now. Any ideas what might caus
Not sure. I don't use the gerrit hook so I haven't seen the problem.


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-06 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-01 Thread Lars Volker (Code Review)
Lars Volker has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 2:

(1 comment)

Thanks for the review. I went over the java documentation, too, and could not 
find any reason why we should receive spurious signals. I added the link to the 
commit message. Please see PS3.

http://gerrit.cloudera.org:8080/#/c/3312/2//COMMIT_MSG
Commit Message:

Line 13: Example: killall -SIGUSR1 catalogd
> The Change-Id got put in a weird place
True. This has happened to me multiple times now. Any ideas what might cause 
this to happen?


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-07-01 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#3).

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..

IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

The hotspot JVM also uses SIGUSR1 internally. However the documentation
explains, that existing signal handlers will be transparently wrapped by
the JVM and no spurious signals should be received by the daemon signal
handler:
http://www.oracle.com/technetwork/java/javase/signals-139944.html

Example: killall -SIGUSR1 catalogd

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 53 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/3312/3
-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-06-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..


Patch Set 2:

(1 comment)

It looks like there's some possible interaction with the JVM, which uses the 
signal for implementing a thread interrupt method:

http://www.oracle.com/technetwork/java/javase/signals-139944.html

Based on my reading of the docs, it sounds like we should be ok (the JVM should 
not send us spurious signals SIGUSR1 signals on Linux), but could you also 
double-check the docs?

http://gerrit.cloudera.org:8080/#/c/3312/2//COMMIT_MSG
Commit Message:

Line 13: Example: killall -SIGUSR1 catalogd
The Change-Id got put in a weird place


-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-06-06 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new patch set (#2).

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..

IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Example: killall -SIGUSR1 catalogd
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 50 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/3312/2
-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 2
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker 


[Impala-CR](cdh5-trunk) IMPALA-3677: Write minidump on SIGUSR1

2016-06-04 Thread Lars Volker (Code Review)
Lars Volker has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3312

Change subject: IMPALA-3677: Write minidump on SIGUSR1
..

IMPALA-3677: Write minidump on SIGUSR1

Sending SIGUSR1 to any of the impala daemons (catalogd, impalad,
statestored) will trigger a minidump write.

Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Example: killall -SIGUSR1 catalogd
---
M be/src/util/minidump.cc
M tests/custom_cluster/test_breakpad.py
2 files changed, 42 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/12/3312/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3312
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I40149e48e391451de21a5c8bda18e2307fc89513
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker