David Knupp has uploaded a new patch set (#3).

Change subject: IMPALA-2013: Issue Hbase queries individually during data-load.
......................................................................

IMPALA-2013: Issue Hbase queries individually during data-load.

Loading data into HBase has traditionally been a bit flaky, with problems
being hard to diagnose from existing logs. I think this is at least partly
because we have been relying on a command file to send queries to the HBase
shell. When a series of queries is sent in a file, the HBase shell does not
check for errors or halt after each query.

From https://hbase.apache.org/book.html#_read_hbase_shell_commands_from_a_command_file

  "There is no way to programmatically check each individual command for
  success or failure. Also, though you see the output for each command, the
  commands themselves are not echoed to the screen so it can be difficult to
  line up the command with its output."

Even if the HBase process dies completely, our data load keeps laboriously
sending the remaining commands to the shell.

Instead of trying to process the file all at once, the command file generated
by generate-schema-statements.py should be iterated line by line, with each
query passed individually to the HBase shell and the output checked for errors
each time. If we get an error message, fail fast and loudly.

Also, this commit fixes several flake8 linter complaints, and replaces print
statements with output at specific log levels.
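
For illustration only, the line-by-line approach described above could be
sketched roughly as follows. This is not the code in this patch set. All names
here (run_statement, run_statements_individually, HBaseStatementError,
ERROR_MARKERS) are hypothetical, the sketch assumes Python 3, assumes the shell
can be driven non-interactively with "hbase shell -n", and assumes that the
substring "ERROR" in the output is a good enough failure signal.

```python
import logging
import subprocess

LOG = logging.getLogger("hbase-load-sketch")

# Assumption: substrings in the shell output that indicate a failed statement.
ERROR_MARKERS = ("ERROR",)


class HBaseStatementError(RuntimeError):
    """Raised as soon as one statement appears to have failed."""


def run_statement(statement, shell_cmd=("hbase", "shell", "-n")):
    """Pipe a single statement to a fresh shell process.

    Returns (return code, combined stdout/stderr). The default command is an
    assumption; adjust it for the local environment.
    """
    proc = subprocess.run(
        list(shell_cmd),
        input=statement + "\n",
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout


def run_statements_individually(lines, runner=run_statement):
    """Send each non-empty line on its own and stop at the first failure.

    Returns the number of statements that ran cleanly.
    """
    executed = 0
    for line_no, raw_line in enumerate(lines, start=1):
        statement = raw_line.strip()
        if not statement:
            continue  # skip blank lines in the command file
        LOG.debug("Executing line %d: %s", line_no, statement)
        returncode, output = runner(statement)
        failed = returncode != 0 or any(m in output for m in ERROR_MARKERS)
        if failed:
            LOG.error("Line %d failed: %s\n%s", line_no, statement, output)
            raise HBaseStatementError(
                "statement on line %d failed: %s" % (line_no, statement))
        executed += 1
    return executed
```

A caller would open the generated command file and pass the file object as
"lines". The runner is injectable so the loop can be exercised without a
running HBase. Starting a new shell process per statement is slow; it is used
here only because it keeps each statement's output separate.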

Change-Id: I911d972ba8ad3a2a084c8195074556153722c7e2
---
M bin/load-data.py
1 file changed, 102 insertions(+), 56 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/28/3728/3
--
To view, visit http://gerrit.cloudera.org:8080/3728
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I911d972ba8ad3a2a084c8195074556153722c7e2
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: David Knupp <[email protected]>
Gerrit-Reviewer: Harrison Sheinblatt <[email protected]>
Gerrit-Reviewer: Ishaan Joshi <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
