Mike Percy has posted comments on this change. Change subject: Add Google Breakpad support to Kudu ......................................................................
Patch Set 3: (46 comments) http://gerrit.cloudera.org:8080/#/c/5620/3//COMMIT_MSG Commit Message: Line 7: Add Google Breakpad support to Kudu > What does breakpad support mean for regular core dumps? Are they no longer Done PS3, Line 12: part > Nit: is part Done PS3, Line 17: critical crash information > Since your audience here is Kudu developers, it would be nice to add a touc It's well documented; I'll add a link to the docs. Line 21: the default log directory for a given Kudu daemon process. > Why was the log directory chosen as a suitable default home? Added a note to the commit message. Line 36: > It would be good to include a section on how-to go about debugging the mini I added a link to the breakpad docs, which explain the process. But we should also follow up with formal used documentation about use of breakpad in Kudu for the next release. http://gerrit.cloudera.org:8080/#/c/5620/3/src/kudu/integration-tests/minidump_generation-itest.cc File src/kudu/integration-tests/minidump_generation-itest.cc: PS3, Line 21: #include <glog/logging.h> > nit: is it needed? Not anymore. Removed Line 21: #include <glog/logging.h> > nit: consider adding <gtest/gtest.h> I don't think we need that. We get it from the base class. PS3, Line 22: #include <glog/stl_logging.h> > nit: is it needed? Removed PS3, Line 24: #include "kudu/gutil/strings/util.h" > nit: is it needed? Removed PS3, Line 26: #include "kudu/util/logging.h" > nit: is it needed? Removed Line 28: #include "kudu/util/subprocess.h" > nit: consider including "kudu/util/monotime.h" (for MonoDelta) Done Line 66: > You might want to add a test for writing minidumps on SIGUSR1, especially s Good idea. Done. http://gerrit.cloudera.org:8080/#/c/5620/3/src/kudu/master/master_main.cc File src/kudu/master/master_main.cc: PS3, Line 60: argv[0] > nit: consider adding a var for argv[0] and using it around instead. Especially since it is read-only, I personally find passing in argv[0] to make the intention clearer. http://gerrit.cloudera.org:8080/#/c/5620/3/src/kudu/util/env_posix.cc File src/kudu/util/env_posix.cc: Line 1246: return IOError(Substitute("Unable to canonicalize $0", path), errno); > Maybe this was intended for the path_util patch? Doesn't really matter thou It could go either way. It doesn't really belong to anything, it's just a minor improvement. http://gerrit.cloudera.org:8080/#/c/5620/3/src/kudu/util/minidump.cc File src/kudu/util/minidump.cc: Line 48: DEFINE_bool(enable_minidumps, true, "Whether to enable minidump collection upon process crash"); > Should also tag it with evolving, I think. Done Line 50: TAG_FLAG(enable_minidumps, runtime); > How is this safe for changing at runtime? If the initial value was false, R Yes, that changed during development. Removed. PS3, Line 56: nly be written when a daemon exits " : "unexpectedly, > This is actually not true anymore (but used to be), even for Impala, since Thanks for the catch. Done PS3, Line 57: Set to empty to " : "disable writing minidump files. > Agreed with Alexey; having two ways to disable minidumps (enable_minidumps= You guys bring up a good point. I took your advice, Adar, thanks for the suggestion. PS3, Line 62: Set to 0 to keep all minidump files > Shouldn't we allow 0 for "delete all outstanding minidumps at startup", and After discussing this with Lars and due to plans for adding a minidump reaper thread, if we changed to the suggested semantics then enable_minidumps=true with max_minidumps=0 would be nonintuitive and appear broken, since the deleter thread would remove the dumps created via SIGUSR1. So I think it's better to keep this flag the way it is. Line 76: static google_breakpad::ExceptionHandler* minidump_exception_handler = nullptr; > Our convention is to prefix global variables with g_. Done PS3, Line 84: by breakpad > Nit: drop this; it's implied by "Callback for breakpad". Done PS3, Line 85: It logs the event before breakpad crashes : // the process > Oh, so breakpad crashes the process now? Previously glog was doing that for I hadn't considered this, but it turned out to be an issue. Per IMPALA-3656, Breakpad is supposed to work with glog 0.3.4 but I had to backport a patch from glog master to get CHECK() to both display a stack trace and generate a minidump. I'll have to check with Lars about whether they had to do the same backport. In any case, was able to address the previous issues with this. Now, we get both a minidump and a stack trace on a CHECK() or a fatal signal. PS3, Line 97: sizeof(msg) / sizeof(msg[0]) - 1; > If we're ok calling into my_strlen() for the path, let's use it here too fo Done PS3, Line 99: libc > Would be good to say _why_ we shouldn't call into libc here. Done Line 118: minidump_exception_handler->WriteMinidump(FLAGS_minidump_path, DumpCallback, nullptr); > You sure this is safe to call from a signal handler? The docs (https://chro This is the USR1 signal handler. I'll rename it and add some comments to clarify. Line 137: Env* env = Env::Default(); > You can get this from the caller. Done PS3, Line 139: // Search for minidumps. There could be multiple minidumps for a single second. : multimap<int, string> timestamp_to_path; > FWIW, I think the approach used in the glog cleanup thread is nicer: a vect sgtm, done Line 147: unique_ptr<SequentialFile> in; > I think this is overkill. We don't verify that files that with glog-looking This code was borrowed from Impala, and I tend to agree with you that simpler is probably better. I reused the logic from the glog cleaner thread. Line 158: constexpr int kHeaderSize = sizeof(MDRawHeader); > So breakpad doesn't namespace its classes either? Would have expected this yup. not namespaced. PS3, Line 160: Slice > Do we explicitly want to include #include "kudu/util/slice.h" for this ? Done PS3, Line 164: Attempting to delete it > ok that makes sense, perhaps we could just reorder the error as "failed to I simplified this whole bit of logic. Now, if we can't delete the excess files on startup, it will be a fatal error. Otherwise, it generates a warning that blocks rotation of additional files. Line 203: static bool registered = false; > Maybe you can use a google once on minidump_exception_handler instead? I thought about it (this code came from Impala), and we could, but this is process-wide initialization stuff, so we don't need to worry about thread safety. Line 207: if (!FLAGS_enable_minidumps || FLAGS_minidump_path.empty()) return Status::OK(); > If you skip registering the signal handler for SIGUSR1 here and someone sen Thanks for the catch. Fixed. PS3, Line 215: if (!env->FileExists(FLAGS_minidump_path)) { : RETURN_NOT_OK(env->CreateDir(FLAGS_minidump_path)); : } > Maybe CreateDirRecursively() here, so that it probably handles symlinks and I intended to use CreateDirRecursively() here, which is why I wrote it, but Todd had a problem with it in a previous review. I tend to agree with you on this. I'll ask Todd to chime back in here. PS3, Line 221: Impala > I think the intent was to explain that this behavior (in kudu/util/minidump Adar is correct PS3, Line 225: // Create the directory if it is not there. The minidump doesn't get written if there is : // no directory. : if (!env->FileExists(FLAGS_minidump_path)) { : RETURN_NOT_OK(env->CreateDir(FLAGS_minidump_path)); : } > Here too? Maybe combine them into one CreateDirRecursively() call, which is See above; that's how I originally wrote it in CR1 PS3, Line 242: because usually only process crashes will trigger the : // creation of new minidump files (they are also created on SIGUSR1). > As you say, SIGUSR1 breaks that guarantee. What about doing what we did for Done PS3, Line 252: << "." > Why the period at the end? Seems unusual for a Kudu LOG statement. Done PS3, Line 259: desc, FilterCallback, DumpCallback, nullptr, true, -1); > Would be nice to reformat this so that one arg is on each line, then you ca Done http://gerrit.cloudera.org:8080/#/c/5620/3/src/kudu/util/minidump.h File src/kudu/util/minidump.h: Line 28: Status RegisterMinidump(const std::string& cmd_line_path); > If this is just a way to get the executable's argv[0], you can use gflags' Done and done. http://gerrit.cloudera.org:8080/#/c/5620/3/thirdparty/build-definitions.sh File thirdparty/build-definitions.sh: Line 563: make -j$PARALLEL install > Add $EXTRA_MAKEFLAGS too; see above. Done Line 573: xargs perl -p -i -e '@pre = qw(client common google_breakpad processor third_party); for $p (@pre) { s{#include "$p/}{#include "breakpad/$p/}; }' > Could you add perl to REQUIRED_TOOLS in preflight.py? This is the second us Done http://gerrit.cloudera.org:8080/#/c/5620/3/thirdparty/build-thirdparty.sh File thirdparty/build-thirdparty.sh: PS3, Line 284: F_ALL > Agreed; did you mean F_UNINSTRUMENTED here and F_TSAN below? Yeah, bad rebase. Fixed. http://gerrit.cloudera.org:8080/#/c/5620/3/thirdparty/download-thirdparty.sh File thirdparty/download-thirdparty.sh: Line 244: > nit: an extra-line, seems like a random change Done http://gerrit.cloudera.org:8080/#/c/5620/3/thirdparty/scripts/make-breakpad-src-archive.sh File thirdparty/scripts/make-breakpad-src-archive.sh: Line 37: VERSION=${1:-$(git rev-parse HEAD)} > Nit: call this REVISION to be consistent with LSS_REVISION (and to make it Done Line 41: git archive --format=tar --prefix=$NAME/ $VERSION | (cd .. && tar xf -) > Is this a fancy way to copy and rename the tree to another location? Yeah... we export only checked-in files, without the git metadata, and construct the expected directory structure so we can run the build. Then we tar it up. I'll add some comments. -- To view, visit http://gerrit.cloudera.org:8080/5620 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I495695cc38b75377f20b0497093a81ed5baa887f Gerrit-PatchSet: 3 Gerrit-Project: kudu Gerrit-Branch: master Gerrit-Owner: Mike Percy <[email protected]> Gerrit-Reviewer: Adar Dembo <[email protected]> Gerrit-Reviewer: Alexey Serbin <[email protected]> Gerrit-Reviewer: Dinesh Bhat <[email protected]> Gerrit-Reviewer: Kudu Jenkins Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Mike Percy <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-HasComments: Yes
