Lars Volker has posted comments on this change.

Change subject: IMPALA-3489: Add script to extract breakpad symbols from 
binaries
......................................................................


Patch Set 4:

(28 comments)

http://gerrit.cloudera.org:8080/#/c/2961/4/bin/dump_breakpad_symbols.py
File bin/dump_breakpad_symbols.py:

Line 1: #!/usr/bin/env python
> Is there a reason not to use impala-python instead?
The script didn't need anything impala specific (at least until now, before 
addressing the comments) so I thought it would be nice for it not to depend on 
the impala env.

I changed this, however I also needed to add python-magic (the package) to our 
requirements.

All in all I would still prefer the script to be as self-contained as possible. 
Let me know which way of handling the dependency (python-magic) seems 
preferable to you.


Line 17: # in Google breakpad
> Nit: Breakpad
Done


Line 19: #  - Scan an impala build dir for ELF files
> Nit: Impala
Done


Line 22: #  - Extract an rpm and corresponding debuginfo rpm file
> This is specific to Impala / CDH RPMs, right? Please clear this up since de
Done


Line 29: #  - Goodle breakpad, either installed via the impala toolchain or 
separately
> s/Goodle/Google/
Done


Line 37: #   ./dump_breakpad_symbols -d /tmp/syms -r 
tmp/impala-2.5.0+cdh5.7.0+0-1.cdh5.7.0.p0.147.el6.x86_64.rpm -s 
tmp/impala-debuginfo-2.5.0+cdh5.7.0+0-1.cdh5.7.0.p0.147.el6.x86_64.rpm
> Very long line. Please wrap with \ chars.
Done, however I still needed the leading #, so copy and paste won't work 
anymore.


Line 47: #   $IMPALA_TOOLCHAIN/breakpad-*/bin/minidump_stackwalk 
/tmp/impala-minidumps/impalad/03c0ee26-bfd1-cf3e-43fa49ca-1a6aae25.dmp /tmp/syms
> Another long line.
Done


Line 61: Binary = namedtuple("Binary", "path, debug_path")
> This is more than just a "Binary", right? It contains a mapping of a binary
Sure. How about BinaryDebugInfo? BinaryDebugTuple? BinaryDebugPair?


Line 64: def log(msg):
       :   """Log a message to stdout."""
       :   print msg
> Do you need this? Why not use the logging module? A lot of our python scrip
Done


Line 71:   sys.stderr.write("ERROR: %s\n" % msg)
> If you used the logging module, this could be an error-level message.
Done


Line 76: breakpad
> nit: capital B
Done


Line 86:    dump_syms = glob.glob(os.path.join(toolchain, "breakpad-*", "bin", 
"dump_syms"))
> This seems risky. A long-lived working copy of Impala will have a toolchain
Done, see top comment.


Line 100: Path to a directory containing results
        :       from an impala build, e.g. be/build/debug
> I think this will make the help message look strange, "from an impala..." w
Seems to work fine. My guess would be argparse calls textwrap.dedent() on the 
string. Here's how it looks for me:

  -b BUILD_DIR, --build_dir BUILD_DIR
                        Path to a directory containing results from an Impala
                        build, e.g. be/build/debug


Line 112:   # Post processing checks
> Maybe also verify that, for example, if binary_files were specified then st
Done


Line 126:     if e.errno != errno.EEXIST:
> I think you should also check that path is a directory. Maybe rewrite like 
Done


Line 136: def is_file(path):
> is_regular_file ?
Done


Line 142:   """Check whether 'path' is an ELF file."""
> How about adding the is_file check here?
Done


Line 167:   result = []
> Not used.
Done


Line 176:     binary_base = os.path.join(tmp_dir, 'usr', 'lib', 'impala')
        :     debuginfo_base = os.path.join(tmp_dir, 'usr', 'lib', 'debug', 
'usr', 'lib', 'impala')
> It might be good to make 'usr/lib/impala' and 'usr/lib/debug/usr/lib/impala
Done


Line 193:   if (args.binary_files):
> I don't believe the parens are necessary.
Done


Line 195:   if args.stdin_files:
> It seems like this chain could all have elif instead.
Done


Line 201:   return []
> The script should not get to this point, maybe some kind of error message s
Done


Line 227:     with open(tmp_file, 'r') as f:
> The tmp_fd is never closed before you reopen the same file.
One the temporary file object in 218 is destroyed the underlying fd is closed: 
http://stackoverflow.com/questions/7114059/does-closing-a-file-opened-with-os-fdopen-close-the-os-level-fd

I ran the test program from there to confirm.


Line 230:       _, _, _, id, binary = header.split(' ')
        :       out_path = os.path.join(out_dir, binary, id)
> The use of the 'id' identifier here technically masks the Python builtin fu
Yes. I wasn't sure what our policies are on this. Renamed it.


Line 235:   except:
> At the very least this should be:
Done. I learned about BaseException on the way. :)


Line 237:     os.remove(tmp_file)
> Catch EnvironmentError here and...I dunno, print an error and move on?
Done


Line 238:     raise
> Assuming you make the L235 change...
Done


Line 246:   success = True
> maybe change this to status = 0, then after the loop you can do: sys.exit(s
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/2961
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3ee0972efcb50609407b04cd6f4309b244a84861
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker <[email protected]>
Gerrit-Reviewer: Bharath Vissapragada <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-HasComments: Yes

Reply via email to