Dan Burkert has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9965 )

Change subject: KUDU-2191 (9/n): HMS Catalog should short-circuit no-op alter 
and create table calls

Patch Set 2:


Commit Message:

PS2, Line 9: 8
> Nit: it was actually part 7 :pedantic_parrot:

File src/kudu/hms/hms_catalog-test.cc:

PS2, Line 261: The
             :   // HMS catalog will automatically short-circuit creating the 
> Could we somehow assert that it was a short-circuit? Maybe we'd need some H
It will be tested once the notification listener lands.  If it weren't a 
short-circuit then the notification listener would get in a loop of altering 
the table, which would create a new alter table event, which would cause it to 
alter the table, ... ad infinitum.

I'll add a TODO here, because I think we should eventually keep such metrics, 
and at that point it's worth checking.  Not sure I want to roll it in to this 
patch, though.

PS2, Line 357:
             :   // Try the previous alter operation again. This should 
succeed, since the
             :   // destination table ID matches, so the HMS catalog knows its 
the same table.
             :   ASSERT_OK(hms_catalog_->AlterTable(kTableId, "default.ext", 
"default.b", schema));
> Likewise, if this is supposed to short-circuit, would be nice to assert tha

File src/kudu/hms/hms_catalog.cc:

PS2, Line 218: renamed
> What's the significance of the change from "altered" to "renamed" here and
We are now handling 'ALTER TABLE ...' and 'ALTER TABLE RENAME  ...' operations 
a bit differently.  Line 208 checks for the first case and handles it.  If we 
fall through then we know we're in a rename operation, so we can be a bit more 
specific with the error message.  Technically a rename operation could also 
alter the columns, and that would work fine here, but I don't think it's 
actually expressible in Hive syntax.

In general I think it's good to include in the error messages if it's is a 
rename, because it alerts that there are two distinct table names involved 
instead of one.

PS2, Line 222:         return CreateOrUpdateTable(client, id, new_name, schema, 
> I understand wanting to apply CreateOrUpdateTable consistently throughout,
No, this is actually the critical change of the patch; previously we weren't 
handling the case where a rename had already been applied to the HMS.  Before 
we simply created the table, whereas now we're checking if the table already 
exists and updating it appropriately.  This ends up being critical because the 
notification log can double-apply events in some cases, so we have to be able 
to recognize and short circuit in these cases.

PS2, Line 244:       // Overwrite fields in the table that have changed, 
including the new name.
> In the original code, we made a copy of the table before calling PopulateTa
Correct.  Since this is now checked to be a rename operation we know it can't 

PS2, Line 481:     // Otherwise we fail, since we should not create or update a 
non-Kudu table.
> Nit: it's not necessarily a non-Kudu table, right? If the storage handlers 

File src/kudu/hms/hms_client.cc:

PS2, Line 119: const char* const HmsClient::kStorageHandlerKey = 
> Why duplicate this into Kudu code instead of using the constant defined in
The motivation to re-define it is that the name of the constant is really wide, 
so it makes the line wrapping in HmsCatalog really ugly.  I don't think I can 
just reference it here, since it's defined as a std::string and we don't allow 
constant std::strings as far as I know.  Maybe you know of a way to do that, 
though?  I can also just remove this and go back to referencing the hive 
version directly if you think that's better.

To view, visit http://gerrit.cloudera.org:8080/9965
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c3594db0db253bc88e032b8c5aa2c8667c49853
Gerrit-Change-Number: 9965
Gerrit-PatchSet: 2
Gerrit-Owner: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Adar Dembo <a...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <danburk...@apache.org>
Gerrit-Reviewer: Hao Hao <hao....@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 10 Apr 2018 18:10:40 +0000
Gerrit-HasComments: Yes

Reply via email to