Taras Bobrovytsky has posted comments on this change.

Change subject: Add Breakpad minidump collection script
......................................................................


Patch Set 4:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/2997/4/bin/collect_minidumps.py
File bin/collect_minidumps.py:

Line 15: # This script is to be called by Cloudera Manager to collect Breakpad 
minidump files up to
> Can you provide a reasonable example invokation of the script that CM might
Done


Line 17: # We try to fit as many files as possible into the tarball until a 
limit is reached.
> until a size limit is reached.
Done


Line 27:   '''This is a generic class that makes a tarball out of a files in 
the source_dir.
> typo: all files in the source_dir?
Not sure if "all" is the right word either. I just removed it.


Line 30:   may be deleted or overwritten. Max_result_size is the maximum 
allowed size of the
> Why "may"? Seems like we should know exactly what will happen, so just say 
Done


Line 36:   def __init__(self,
> might be good to add some logging in a few places for easier debugging in c
You mean printing something at each step. Maybe we can discuss it offline?


Line 61:     num_files = num_files or len(self.file_list)
Do we need to worry about the case if the directory somehow has hundreds of 
files and terabytes of data (this script might take a very long time and create 
huge intermediate files)? Also do we need to worry about permissions?


Line 75:     files with modified date not in the eligible range. It is assumed 
that source_dir
> eligible range -> desired time range
Done


Line 115:       msg = 'No files found in the source_directory.'
> include source directory in error message (for better debugging)
Done


Line 120:       msg = 'Success, archived all {0} files.'
> include src dir in all msg here
Done


Line 166:   parser.add_option('--conf_dir', default='/tmp/impala-conf',
> Is the default reasonable? Is there a case where the options are really pre
Done


Line 171:   parser.add_option('--max_result_size', default=40*1024*1024, 
type='int',
> max_output_size?
Done


Line 177:   parser.add_option('--start_time', default=None, type='int',
> nit: move start time before end time
Done


Line 179:   parser.add_option('--output_file_path', 
default='/tmp/minidump_package.tar.gz',
> Why not just 'minidump_package.tar.gz' as the default? Storing in /tmp seem
Done. I guess that makes sense, but I think it's also not necessarily a good 
idea to store it in the same directory as this script (which is what you're 
suggesting).


-- 
To view, visit http://gerrit.cloudera.org:8080/2997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I85b3643133e28eca07507ac2a79acbf73128456f
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Ishaan Joshi <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: [email protected]
Gerrit-HasComments: Yes

Reply via email to