Alex Behm has posted comments on this change.

Change subject: Add Breakpad minidump collection script
......................................................................


Patch Set 4:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/2997/4/bin/collect_minidumps.py
File bin/collect_minidumps.py:

Line 15: # This script is to be called by Cloudera Manager to collect Breakpad 
minidump files up to
Can you provide a reasonable example invokation of the script that CM might do?


Line 17: # We try to fit as many files as possible into the tarball until a 
limit is reached.
until a size limit is reached.


Line 27:   '''This is a generic class that makes a tarball out of a files in 
the source_dir.
typo: all files in the source_dir?


Line 30:   may be deleted or overwritten. Max_result_size is the maximum 
allowed size of the
Why "may"? Seems like we should know exactly what will happen, so just say 
"will be deleted and re-created."


Line 36:   def __init__(self,
might be good to add some logging in a few places for easier debugging in case 
something goes wrong


Line 75:     files with modified date not in the eligible range. It is assumed 
that source_dir
eligible range -> desired time range

The sentence "It is assumed that" is a little confusing because it does not say 
what happens when a non-file entry is encountered. Why not just say something 
like "Directories and other non-files are ignored."


Line 115:       msg = 'No files found in the source_directory.'
include source directory in error message (for better debugging)


Line 120:       msg = 'Success, archived all {0} files.'
include src dir in all msg here


Line 166:   parser.add_option('--conf_dir', default='/tmp/impala-conf',
Is the default reasonable? Is there a case where the options are really present 
(apart from testing)?

I'm thinking we should make this param required with no default.


Line 171:   parser.add_option('--max_result_size', default=40*1024*1024, 
type='int',
max_output_size?

you have an --output_file_path param and I think we should either use "result" 
our "output" consistently


Line 177:   parser.add_option('--start_time', default=None, type='int',
nit: move start time before end time


Line 179:   parser.add_option('--output_file_path', 
default='/tmp/minidump_package.tar.gz',
Why not just 'minidump_package.tar.gz' as the default? Storing in /tmp seems 
like an odd choice.


-- 
To view, visit http://gerrit.cloudera.org:8080/2997
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I85b3643133e28eca07507ac2a79acbf73128456f
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Ishaan Joshi <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: [email protected]
Gerrit-HasComments: Yes

Reply via email to