Thomas Tauber-Marshall has posted comments on this change. Change subject: IMPALA-5799: Kudu DML can crash if schema has changed ......................................................................
Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/7688/1//COMMIT_MSG Commit Message: PS1, Line 14: its > it's Done PS1, Line 18: Wri > DML is really a SQL concept, technically this should be 'WriteOp' Done http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-table-sink.cc File be/src/exec/kudu-table-sink.cc: PS1, Line 144: g order of column positio > We can't assume this is true. E.g. see the comment in Send(). We actually Done http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-util.cc File be/src/exec/kudu-util.cc: PS1, Line 37: DataType = > why can't you just have Weird quirk of C++. Doing that gives the compiler message: 'error: ‘kudu::client::KuduColumnSchema’ is not a namespace' http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-util.h File be/src/exec/kudu-util.h: Line 81: /// Takes a Kudu client DataType and returns the corresponding Impala ColumnType. > comment Done http://gerrit.cloudera.org:8080/#/c/7688/1/tests/query_test/test_kudu.py File tests/query_test/test_kudu.py: Line 399: print cursor.fetchall() == [(i, )] > is this test reproducing a crash before your change? Yes, I ran it 20+ times and it repro-ed the crash every time. I has also originally thought that it would need a loop, but then decided not to add one based on those results. That may just be particular to my machine, and the timing may not work out quite so nicely in other environments, in which case a loop would be good to be safe. Adding a loop though increases the number of legitimate ways the insert may fail, eg. we may get analysis errors. Line 410: threading.current_thread().errors = [] > I was thinking we could do these operations in a loop to get more opportuni Done PS1, Line 416: except Exception as e: : threading.current_thread().errors.append(e) : : insert_thread = threading.Thread(target=insert_values) : insert_thread.start() : > this you can do on the main thread while the insert thread is inserting Done PS1, Line 432: rror in insert_thread.er > this is the error in the case we sent a write op to Kudu that was w/ the ol Yes. I was trying to avoid depending too much on the specific text of the error, esp. in this case where we don't control it, but its probably better to look for something more specific. -- To view, visit http://gerrit.cloudera.org:8080/7688 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I9fd6bf164310df0041144f75f5ee722665e9f587 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-HasComments: Yes
