Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13559 )

Change subject: IMPALA-8636: Implement INSERT for insert-only ACID tables
......................................................................


Patch Set 11:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/13559/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/13559/11//COMMIT_MSG@25
PS11, Line 25: The Backend commits/aborts the transaction by calling the 
Frontend via
I think this needs to be updated to explain what is done on the catalog.


http://gerrit.cloudera.org:8080/#/c/13559/11//COMMIT_MSG@34
PS11, Line 34:   tests only run with exhaustive exploration strategy)
I'm thinking about how we can test more of the error paths, that seems like the 
biggest gap at the moment. It might be worth adding cancellation tests like 
TestCancellationSerial.test_cancel_insert() that cancels a query against an 
ACID table, and I suppose also one that cancels a select query against an acid 
table.

Ideally we would also test some of the other error paths (e.g. HMS returns 
error) - I think we'd have to add failpoints for that. But I don't want to 
expand the scope of this too much - maybe take a look to see what it would take 
to do that. Don't think this is a blocker.


http://gerrit.cloudera.org:8080/#/c/13559/11//COMMIT_MSG@37
PS11, Line 37: * add locks and heartbeats
What are the implications of not having heartbeats? Will long-running queries 
against ACID tables have the transaction time out on the HMS?


http://gerrit.cloudera.org:8080/#/c/13559/11/be/src/exec/hdfs-table-sink.h
File be/src/exec/hdfs-table-sink.h:

http://gerrit.cloudera.org:8080/#/c/13559/11/be/src/exec/hdfs-table-sink.h@134
PS11, Line 134: /
"base or delta". It was confusing whether it was a path separate or or not


http://gerrit.cloudera.org:8080/#/c/13559/11/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/13559/11/be/src/service/client-request-state.h@493
PS11, Line 493:   /// Gather and publish all required updates to the metastore
Can you document the postcondition around transactions being committed/aborted? 
It seems like we're guaranteeing that DML transactions on partitioned tables 
should be committed or aborted when this function returns.


http://gerrit.cloudera.org:8080/#/c/13559/11/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/13559/11/be/src/service/client-request-state.cc@1106
PS11, Line 1106:       Status rpc_status 
=client.DoRpc(&CatalogServiceClientWrapper::UpdateCatalog,
missing space


http://gerrit.cloudera.org:8080/#/c/13559/11/be/src/service/client-request-state.cc@1107
PS11, Line 1107:           catalog_update, &resp);
It would be more explicit to
  if (!rpc_status.ok()) {
   if (InTransaction()) AbortTransaction();
   return rpc_status;
  }

Actually could maybe combine the checks to reduce the number of error paths 
like:

  Status status = client.DoRpc();
  if (status.ok()) status = Status(resp.result.status);
  if (!status.ok()...) {
    if (InTransaction()) AbortTransaction();
    LOG(ERROR) << "ERROR Finalizing DML: " << status.GetDetail();
    return status;
  }

This has a slight behaviour change of logging RPC errors, but that seems like a 
reasonable thing to do. This is a bit of a nit-pick, and your code is correct, 
but this kind of error handling is easy to introduce bugs into. I think the 
current code is ok, will leave it to you to determine if it's an improvement or 
not.


http://gerrit.cloudera.org:8080/#/c/13559/11/common/thrift/Frontend.thrift
File common/thrift/Frontend.thrift:

http://gerrit.cloudera.org:8080/#/c/13559/11/common/thrift/Frontend.thrift@361
PS11, Line 361: traget
target


http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@941
PS11, Line 941:         //TODO writeIDs may also be loaded in other code paths.
I guess you didn't actually add this comment, is there a JIRA for this?


http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/common/TransactionException.java
File fe/src/main/java/org/apache/impala/common/TransactionException.java:

http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/common/TransactionException.java@21
PS11, Line 21:  * Thrown for errors related to ACID transactions.
I feel like this could be a little more specific, e.g. "Thrown for errors that 
occur when interacting with ACID transactions, e.g. failures to open, commit, 
or abort a transaction". I.e. it wouldn't include something like failing a 
query because it uses an unsupported table type or something.


http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/common/TransactionException.java@22
PS11, Line 22:  *
unnecesssary blank line?


http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3801
PS11, Line 3801: failers
failures


http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3803
PS11, Line 3803:       // moved to ClientRequestState::UpdateCatalog().
I think handling the abort on the client is probably the right thing to do 
anyway, because we want to abort the transaction if there's an RPC error 
talking to the catalog anyway. So we could actually just say "The client, i.e. 
query coordinator, is always responsible for aborting transactions when queries 
hit errors."


http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/service/Frontend.java@1266
PS11, Line 1266:       //TODO: should load table write ids in transaction 
context.
is there a jira?


http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/service/Frontend.java@1379
PS11, Line 1379:         catch (TransactionException te) {
nit: extra newline before catch


http://gerrit.cloudera.org:8080/#/c/13559/11/fe/src/main/java/org/apache/impala/service/Frontend.java@1629
PS11, Line 1629:   private static boolean stmtNeedsTransaction(AnalysisResult 
analysisResult) {
The name/comment is a bit misleading, since it depends on whether the query is 
referencing an ACID table or not.



--
To view, visit http://gerrit.cloudera.org:8080/13559
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id6c36fa6902676f06b4e38730f737becfc7c06ad
Gerrit-Change-Number: 13559
Gerrit-PatchSet: 11
Gerrit-Owner: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Jul 2019 04:17:05 +0000
Gerrit-HasComments: Yes

Reply via email to