Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18866 )

Change subject: [master] KUDU-3322 / KUDU-3319 HMS event logging
......................................................................


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/18866/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/18866/2//COMMIT_MSG@15
PS2, Line 15: Was hesitant on crashing the
            : master and chose to log an ERROR as in the case an out of order 
event
            : id received from HMS might lead to unnecessary crash of the Kudu
            : master
What parts of the system are still functional after this happening?  I guess  
DDL operations do not work after this, but reading/writing data from existing 
tables is still possible, right?

Related question: what is the way to recover from this?  Should there be a way 
to reset the latest HMS event identifier stored in the catalog in such case?


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/hms/hms_catalog.h
File src/kudu/hms/hms_catalog.h:

http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/hms/hms_catalog.h@147
PS2, Line 147:   Status GetCurrentNotificationEventId(int64_t* event_id) 
WARN_UNUSED_RESULT;
nit: could this method be constant?


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/hms/mini_hms.h
File src/kudu/hms/mini_hms.h:

http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/hms/mini_hms.h@75
PS2, Line 75:   // Delete the HMS database directory
nit: follow the style of the comments in the rest of this file -- add a period 
in the end.


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/hms/mini_hms.cc
File src/kudu/hms/mini_hms.cc:

http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/hms/mini_hms.cc@188
PS2, Line 188:   
RETURN_NOT_OK(Env::Default()->DeleteRecursively(JoinPathSegments(data_root_, 
metadb_subdir_)));
             :   return Status::OK();
nit: could be more concise

  return Env::Default()->DeleteRecursively(...);


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/integration-tests/master_hms-itest.cc
File src/kudu/integration-tests/master_hms-itest.cc:

http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/integration-tests/master_hms-itest.cc@88
PS2, Line 88: opts.extra_master_flags.emplace_back("--logtostderr=false");
Maybe, create its own class to change this behavior just for the sake of a 
single test scenario?  My concern is that with not writing into stderr it might 
be harder to troubleshoot test failures when tests are run via dist-test.


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/integration-tests/master_hms-itest.cc@240
PS2, Line 240: rename it a couple of times to generate 3
Why is it needed?  Why not to go with just 'default.a' -- that would be one HMS 
event already, right?  Would be great to add comment to clarify.


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/integration-tests/master_hms-itest.cc@263
PS2, Line 263: line.find("latest event ID in HMS(0)") != std::string::npos
Why not to search for the whole error message as at line 264?


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/integration-tests/master_hms-itest.cc@271
PS2, Line 271:
nit: the indent seems to be messed up


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/master/catalog_manager.cc
File src/kudu/master/catalog_manager.cc:

http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/master/catalog_manager.cc@1454
PS2, Line 1454:
nit: why is this change?  I guess the 'if ...' line should have been shifted 
instead


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/master/catalog_manager.cc@1457
PS2, Line 1457:       LOG(INFO) << "Last processed Hive Metastore notification 
event ID: " << hms_notification_log_event_id_;
Maybe, move this into
CatalogManager::InitLatestNotificationLogEventId() ?

Also, isn't this line too long?


http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/master/hms_notification_log_listener.cc
File src/kudu/master/hms_notification_log_listener.cc:

http://gerrit.cloudera.org:8080/#/c/18866/2/src/kudu/master/hms_notification_log_listener.cc@295
PS2, Line 295:         LOG(ERROR) << "Found the last seen event ID in the local 
Kudu master("<< processed_event_id <<
             :                       ") to be greater than the latest event ID 
in HMS("<< event_id <<
             :                       "). Was there any backup or restore done 
on HMS recently?";
nit: maybe, use Substitute() to make this more readable?



--
To view, visit http://gerrit.cloudera.org:8080/18866
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I242c9cedf170ed621867b905f1cb1875347d5887
Gerrit-Change-Number: 18866
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 06 Sep 2022 23:13:31 +0000
Gerrit-HasComments: Yes

Reply via email to