Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5799: Kudu DML can crash if schema has changed
......................................................................


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7688/1//COMMIT_MSG
Commit Message:

PS1, Line 14: its
it's


PS1, Line 18: DML
DML is really a SQL concept, technically this should be 'WriteOp'


http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS1, Line 144: output_expr_evals_.size()
We can't assume this is true. E.g. see the comment in Send().  We actually 
don't necessarily have the full Table descriptor in the BE (sorry I didn't 
think of that when we first started talking about this JIRA), so we could 
either (a) ship that from the FE to the BE or (b) just check the cols that are 
referenced. I think the latter makes sense, and that you can do something like 
this:

      // output_expr_evals_ only contains the columns that the op
      // applies to, i.e. columns explicitly mentioned in the query, and
      // referenced_columns is then used to map to actual column positions.
      int col = kudu_table_sink_.referenced_columns.empty() ?
          j : kudu_table_sink_.referenced_columns[j];


http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-util.cc
File be/src/exec/kudu-util.cc:

PS1, Line 37: DataType = 
why can't you just have
using kudu::client::...::DataType; as we do above?


http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-util.h
File be/src/exec/kudu-util.h:

Line 81: ColumnType 
KuduDataTypeToColumnType(kudu::client::KuduColumnSchema::DataType type);
comment


http://gerrit.cloudera.org:8080/#/c/7688/1/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 399:   def test_concurrent_schema_change(self, cursor, unique_database):
is this test reproducing a crash before your change?


Line 410:         client.execute("insert into %s values (0, 0), (1, 1)" % 
table_name)
I was thinking we could do these operations in a loop to get more opportunities 
to hit issues.


PS1, Line 416:       try:
             :         client = self.create_impala_client()
             :         client.execute("alter table %s drop column col1" % 
table_name)
             :         client.execute("alter table %s add columns (col1 
string)" % table_name)
             :       except Exception as e:
             :         threading.current_thread().error = e
this you can do on the main thread while the insert thread is inserting


PS1, Line 432: 'Kudu error(s) reported'
this is the error in the case we sent a write op to Kudu that was w/ the old 
schema? Is there any more of the error message we can match? This looks very 
generic.


-- 
To view, visit http://gerrit.cloudera.org:8080/7688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9fd6bf164310df0041144f75f5ee722665e9f587
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-HasComments: Yes

Reply via email to