Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-5799: Kudu DML can crash if schema has changed
......................................................................


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7688/1//COMMIT_MSG
Commit Message:

PS1, Line 14: its
> it's
Done


PS1, Line 18: Wri
> DML is really a SQL concept, technically this should be 'WriteOp'
Done


http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS1, Line 144: g order of column positio
> We can't assume this is true. E.g. see the comment in Send().  We actually 
Done


http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-util.cc
File be/src/exec/kudu-util.cc:

PS1, Line 37: DataType = 
> why can't you just have
Weird quirk of C++. Doing that gives the compiler message:
'error: ‘kudu::client::KuduColumnSchema’ is not a namespace'


http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-util.h
File be/src/exec/kudu-util.h:

Line 81: /// Takes a Kudu client DataType and returns the corresponding Impala 
ColumnType.
> comment
Done


http://gerrit.cloudera.org:8080/#/c/7688/1/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 399:     print cursor.fetchall() == [(i, )]
> is this test reproducing a crash before your change?
Yes, I ran it 20+ times and it repro-ed the crash every time. I has also 
originally thought that it would need a loop, but then decided not to add one 
based on those results.

That may just be particular to my machine, and the timing may not work out 
quite so nicely in other environments, in which case a loop would be good to be 
safe.

Adding a loop though increases the number of legitimate ways the insert may 
fail, eg. we may get analysis errors.


Line 410:       threading.current_thread().errors = []
> I was thinking we could do these operations in a loop to get more opportuni
Done


PS1, Line 416:         except Exception as e:
             :           threading.current_thread().errors.append(e)
             : 
             :     insert_thread = threading.Thread(target=insert_values)
             :     insert_thread.start()
             : 
> this you can do on the main thread while the insert thread is inserting
Done


PS1, Line 432: rror in insert_thread.er
> this is the error in the case we sent a write op to Kudu that was w/ the ol
Yes. I was trying to avoid depending too much on the specific text of the 
error, esp. in this case where we don't control it, but its probably better to 
look for something more specific.


-- 
To view, visit http://gerrit.cloudera.org:8080/7688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9fd6bf164310df0041144f75f5ee722665e9f587
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-HasComments: Yes

Reply via email to