Taras Bobrovytsky has posted comments on this change. Change subject: Add Breakpad minidump collection script ......................................................................
Patch Set 4: (13 comments) http://gerrit.cloudera.org:8080/#/c/2997/4/bin/collect_minidumps.py File bin/collect_minidumps.py: Line 15: # This script is to be called by Cloudera Manager to collect Breakpad minidump files up to > Can you provide a reasonable example invokation of the script that CM might Done Line 17: # We try to fit as many files as possible into the tarball until a limit is reached. > until a size limit is reached. Done Line 27: '''This is a generic class that makes a tarball out of a files in the source_dir. > typo: all files in the source_dir? Not sure if "all" is the right word either. I just removed it. Line 30: may be deleted or overwritten. Max_result_size is the maximum allowed size of the > Why "may"? Seems like we should know exactly what will happen, so just say Done Line 36: def __init__(self, > might be good to add some logging in a few places for easier debugging in c You mean printing something at each step. Maybe we can discuss it offline? Line 61: num_files = num_files or len(self.file_list) Do we need to worry about the case if the directory somehow has hundreds of files and terabytes of data (this script might take a very long time and create huge intermediate files)? Also do we need to worry about permissions? Line 75: files with modified date not in the eligible range. It is assumed that source_dir > eligible range -> desired time range Done Line 115: msg = 'No files found in the source_directory.' > include source directory in error message (for better debugging) Done Line 120: msg = 'Success, archived all {0} files.' > include src dir in all msg here Done Line 166: parser.add_option('--conf_dir', default='/tmp/impala-conf', > Is the default reasonable? Is there a case where the options are really pre Done Line 171: parser.add_option('--max_result_size', default=40*1024*1024, type='int', > max_output_size? Done Line 177: parser.add_option('--start_time', default=None, type='int', > nit: move start time before end time Done Line 179: parser.add_option('--output_file_path', default='/tmp/minidump_package.tar.gz', > Why not just 'minidump_package.tar.gz' as the default? Storing in /tmp seem Done. I guess that makes sense, but I think it's also not necessarily a good idea to store it in the same directory as this script (which is what you're suggesting). -- To view, visit http://gerrit.cloudera.org:8080/2997 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I85b3643133e28eca07507ac2a79acbf73128456f Gerrit-PatchSet: 4 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Ishaan Joshi <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: [email protected] Gerrit-HasComments: Yes
