Lars Volker has posted comments on this change. Change subject: IMPALA-2686: Add breakpad crash handler to all daemons ......................................................................
Patch Set 12: (12 comments) Thanks for the review, please see PS13. http://gerrit.cloudera.org:8080/#/c/2028/12/be/src/common/global-flags.cc File be/src/common/global-flags.cc: Line 74: Leave > "Leave" implies that's the default. Maybe "Set to empty.." Good point, done. http://gerrit.cloudera.org:8080/#/c/2028/12/be/src/util/minidump.cc File be/src/util/minidump.cc: Line 49: stdout > and stderr Done Line 57: stdout > stdout/stderr Done Line 57: INFO > INFO/ERROR Done Line 59: snprintf > technically, even snprintf() can allocate memory (and I think it does on so Yes, on this one it shouldn't. I haven't read through all of its implementation, but it looked like it would allocate in scenarios with variable-width characters involved. I switched to calling write several times. Line 89: if (err) continue; > what cases might err by non-zero? This calls status() internally, which will call stat() eventually. Hence all error cases for stat() apply. In any of stat()'s error cases my assumption was that something outside of our control went awry, either a user messing with the files that we wrote sometime in the past, e.g. changing file permissions, or the filesystem went bad. I added a comment to explain in more detail. Should we do anything else in these cases? Here's the full list from man(2) stat: EACCES Search permission is denied for one of the directories in the path prefix of path. (See also path_resolution(7).) EBADF fd is bad. EFAULT Bad address. ELOOP Too many symbolic links encountered while traversing the path. ENAMETOOLONG path is too long. ENOENT A component of path does not exist, or path is an empty string. ENOMEM Out of memory (i.e., kernel memory). ENOTDIR A component of the path prefix of path is not a directory. EOVERFLOW path or fd refers to a file whose size, inode number, or number of blocks can? not be represented in, respectively, the types off_t, ino_t, or blkcnt_t. This error can occur when, for example, an application compiled on a 32-bit platform without -D_FILE_OFFSET_BITS=64 calls stat() on a file whose size exceeds (1<<31)-1 bytes. Line 111: boost::filesystem::remove(minidump_path, err); > why do we use boost here but std::remove on line 97. what's the difference? This one returns an error we want to handle, while std::remove() returns a numeric error code, that we just can ignore. I changed the call in line 97 to use boost::filesystem::remove as well. Line 131: boost::filesystem::remove(to_delete->second, err); > same question Done Line 144: assert > DCHECK Done Line 157: create_directories > FileSystemUtil::CreateDirectory() ? FileSystemUtil::CreateDirectory() will remove the target directory if it exists, which is the opposite of what we want here. I could go and add a flag there to disable this logic, but all that'd be left is the call to create_directories() and wrapping the result in a Status(). After talking to Tim we agreed that we should just use create_directories() here. However I think the name of FileSystemUtil::CreateDirectory() is misleading and should be changed, even more so as it is not clear that it will remove the directory if it already exists. To prevent future confusion I pushed a change to rename the method and added you as a reviewer. http://gerrit.cloudera.org:8080/#/c/2028/12/tests/custom_cluster/test_breakpad.py File tests/custom_cluster/test_breakpad.py: Line 76: time.sleep(1) > can't this be flaky? The cluster that's being killed here had just been started and does not execute any queries, so I'd expect it to die within a second. So far I haven't run into any issues here. Any other reason it could be flaky? Line 100: self.assert_impalad_log_contains('INFO', 'Wrote minidump to ', expected_count=1) > maybe factor these out and also check 'ERROR' for the message. Done -- To view, visit http://gerrit.cloudera.org:8080/2028 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7a37a38488716ffe34296f3490ae291bbb7228d6 Gerrit-PatchSet: 12 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Lars Volker <[email protected]> Gerrit-Reviewer: Casey Ching <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Silvius Rus <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
